Basic Statistical Tools

Presented by
Kh. Farjana Urmi

Sr. Quality Control Officer
ACI HealthCare Limited, Kh. Farjana Urmi
INTRODUCTION
This discussion session provides information regarding:
Acceptable practices for the analysis and consistent interpretation of data
obtained from chemical and other analyses.
Basic statistical approaches for evaluating the quality of the data.
The treatment of outliers and comparison of analytical methods.
Session 1: Prerequisite laboratory practices and principles
Sound Record Keeping
Sampling considerations
Use of reference standards
System performance verification
Method validation and verification
Session 2: Measurement Principles and Variation

Sources and types of error
Normal distribution of analytical data
Standard deviation, mean, and averaging
Introduction to confidence intervals
Basic Tools of Quality
Ishikawa diagram
Control chart
Pareto chart
Scatter plot
Session 3: Comparison of Results

Comparison of Analytical Methods
Accuracy and precision
The role of t and ANOVA
Session 1: Prerequisite laboratory practices and principles
Prerequisite laboratory practices and principles

The sound application of statistical principles to laboratory data
(information, facts, or figures) requires the assumption that such data have
been collected in a traceable (i.e., documented) and unbiased manner. To
ensure this, the following practices are beneficial:
Sound Record Keeping:
Laboratory records are maintained with sufficient detail, so that other
equally qualified analysts can reconstruct the experimental conditions and
review the results obtained. When collecting data, the data should generally
be obtained with more decimal places than the specification requires and
rounded only after final calculations are completed.
Significant figures, Addition & Subtraction
What is 16.874 + 2.6?
What is 16.874 - 2.6?
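The usual rule for addition and subtraction is to carry all digits through the calculation and then round the result to the fewest decimal places found among the inputs. A minimal Python sketch of that rule follows; the helper names `decimal_places` and `combine` are illustrative assumptions, not part of the original slides.

```python
from decimal import Decimal, ROUND_HALF_UP


def decimal_places(text: str) -> int:
    """Count the digits written after the decimal point."""
    return len(text.split(".")[1]) if "." in text else 0


def combine(a: str, b: str, subtract: bool = False) -> Decimal:
    """Add or subtract, then round to the fewest decimal places of the inputs."""
    places = min(decimal_places(a), decimal_places(b))
    raw = Decimal(a) - Decimal(b) if subtract else Decimal(a) + Decimal(b)
    # Round only once, after the calculation is complete.
    return raw.quantize(Decimal(1).scaleb(-places), rounding=ROUND_HALF_UP)


total = combine("16.874", "2.6")                      # raw value 19.474
difference = combine("16.874", "2.6", subtract=True)  # raw value 14.274
print(total, difference)
```

Because 2.6 carries only one decimal place, both answers are reported to one decimal place: 19.5 and 14.3.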

Sampling considerations:
A sample is a portion of a material collected according to a defined sampling
procedure. The size of any sample should be sufficient to allow all
anticipated test procedures to be carried out, including all repetitions
and retention samples.
Types of Sampling:
1. Nonrandom or convenience sampling
2. Simple random sampling
3. Systematic random sampling
4. Stratified random sample

Nonrandom/convenience sampling: Risks the possibility that the estimates will be biased.
Simple random sampling: A process in which every unit of the population has an
equal chance of appearing in the sample.
Systematic random sampling: A unit is randomly selected from the production process at
systematically selected times or locations (e.g., sampling every 30 minutes from the
units produced within the 12-hour process) to ensure that units taken throughout the
entire manufacturing process are included in the sample.
Stratified random sampling: An equal number of units is randomly sampled from each
stratum, for example from each of several machines performing the same operation.
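The three random sampling schemes can be sketched in a few lines of Python. This is an illustration under stated assumptions: the 24 numbered units, the three machine strata, and the function names are hypothetical, and the systematic scheme is implemented in one common form (random starting point, then a fixed interval).

```python
import random


def simple_random(units, k, rng):
    """Every unit has an equal chance of being chosen."""
    return rng.sample(units, k)


def systematic_random(units, step, rng):
    """Pick a random starting point, then take every `step`-th unit."""
    start = rng.randrange(step)
    return units[start::step]


def stratified_random(strata, k_per_stratum, rng):
    """Draw the same number of units at random from each stratum."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}


rng = random.Random(2024)   # fixed seed so the illustration is repeatable
units = list(range(1, 25))  # hypothetical: one unit per 30 minutes over 12 hours
machines = {                # hypothetical strata: three machines, same operation
    "machine_1": units[:8],
    "machine_2": units[8:16],
    "machine_3": units[16:],
}

srs = simple_random(units, 6, rng)
systematic = systematic_random(units, 4, rng)
stratified = stratified_random(machines, 2, rng)
print(srs, systematic, stratified, sep="\n")
```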
System Performance Verification

Verifying an acceptable level of performance for an analytical system in routine
or continuous use can be a valuable practice.
The performance of a system is verified by:
1. Analyzing a control sample at appropriate intervals.
2. Evaluating the variation among the standards (calibration curve).
3. Performing system suitability tests, etc.
Trend analysis of performance verification:
Trend analysis on performance should be performed at regular intervals to
evaluate the need to optimize the analytical procedure or to revalidate all or a
part of the analytical procedure. If an analytical procedure can only meet the
established system suitability requirements with repeated adjustments to the
operating conditions stated in the analytical procedure, the analytical procedure
should be reevaluated, revalidated, or amended, as appropriate.
Validation and Verification
Difference between verification and validation
Method Validation
All methods are appropriately validated as specified under Validation
of Compendial Methods <1225>.
Methods published in the USP-NF have been validated and meet
the Current Good Manufacturing Practices regulatory requirement for
validation as established in the Code of Federal Regulations.
A validated method may be used to test a new formulation (such as
a new product, dosage form, or process intermediate) only after
confirming that the new formulation does not interfere with the accuracy,
linearity, or precision of the method.
It may not be assumed that a validated method could correctly
measure the active ingredient in a formulation that is different from
that used in establishing the original validity of the method.
Session 2: Measurement Principles and Variation
Measurement Principles and Variation

True Value:
This is an ideal concept which cannot be achieved.
Accepted True Value:
A value approximating the true value; the difference between the
two values is negligible or within an acceptable limit.
Error and types of error:
Error is the collective noun for any departure of the result from the "true"
value. Analytical errors can be:
1. Random or unpredictable errors are caused by the natural uncertainty
that occurs with any measurement. Random errors cannot be corrected.

2. Systematic or predictable errors are reproducible and cause a bias in
the same direction for each measurement.
For example, a poorly trained operator who consistently makes the same
mistake will cause systematic error. Systematic error can be corrected.
Mean:
It describes the central tendency of the data set and identifies the target value. The
average of a set of n data $x_i$ is:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
Standard Deviation:
The standard deviation measures a test's precision, or how close individual
measurements are to each other.
The standard deviation, denoted as s.d. or S, is calculated as:
$$S = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}}$$
Coefficient of variation (CV) & Relative standard deviation (RSD):
A further measure of precision, known as the Relative Standard Deviation
(R.S.D.), is given by:
$$\mathrm{R.S.D.} = \frac{S}{\bar{x}}$$
This measure is often expressed as a percentage, known as the coefficient of
variation:
$$\mathrm{CV} = \frac{S}{\bar{x}} \times 100\%$$
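As a quick illustration, the mean, sample standard deviation (n - 1 in the denominator), relative standard deviation, and coefficient of variation can be computed with Python's standard `statistics` module. The five replicate results below are hypothetical values chosen for the example, not data from the slides.

```python
import statistics

# Hypothetical replicate results (e.g., % of label claim); not from the slides.
results = [99.8, 100.2, 100.1, 99.9, 100.0]

mean = statistics.mean(results)
sd = statistics.stdev(results)  # sample standard deviation, divides by n - 1
rsd = sd / mean                 # relative standard deviation (a fraction)
cv_percent = 100 * rsd          # coefficient of variation, in percent

print(f"mean = {mean:.2f}, s.d. = {sd:.3f}, CV = {cv_percent:.3f} %")
```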
Accuracy refers to how closely a measurement matches the true or actual value.
To be accurate requires only the true value (the bull's-eye) and one measurement
(for the arrow to hit the target).
Highly accurate data can be costly and difficult to acquire.
Precision refers to the reproducibility of the measurement and the exactness of
description in a number.
To decide on precision, you need several measurements (notice the multiple arrow
holes), and you do not need to know the true value (none of the values are close to
the target, but all the holes are close together).
To be both accurate and precise, one must pay close attention to detail in order to
obtain the same results every time as well as hit the target.
Comparing Accuracy & Precision
[Figure: four targets labelled "Accurate & Precise", "Precise but not Accurate",
"Accurate but not Precise", and "Not Accurate & not Precise"]

Consider the data (in cm) for the length of an object as measured
by three students. The length is known to be 14.5 cm. Which
student had the most precise work, and which student had the
most accurate work?
Student    Trial 1   Trial 2   Trial 3   Trial 4   Trial 5
A          14.8      14.7      14.8      14.7      14.8
B          14.7      14.2      14.6      14.6      14.8
C          14.4      14.4      14.5      14.4      14.5

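Most precise: Students A and C (each set of results spans only 0.1 cm); Student A is
precise but not accurate, since every result lies 0.2 to 0.3 cm above the true value
Most accurate: Student C (2 were the true value, the rest within 0.1 cm)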
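The comparison can be checked numerically. The sketch below uses the range and sample standard deviation as measures of precision and the bias of the mean (mean minus the known length of 14.5 cm) as a measure of accuracy; the choice of these particular summaries and the `summarize` helper are assumptions made for illustration.

```python
import statistics

TRUE_LENGTH_CM = 14.5
trials = {
    "A": [14.8, 14.7, 14.8, 14.7, 14.8],
    "B": [14.7, 14.2, 14.6, 14.6, 14.8],
    "C": [14.4, 14.4, 14.5, 14.4, 14.5],
}


def summarize(values, true_value):
    """Return spread (precision) and bias (accuracy) for one student's trials."""
    mean = statistics.mean(values)
    return {
        "mean": mean,
        "range": max(values) - min(values),
        "sd": statistics.stdev(values),
        "bias": mean - true_value,
    }


summary = {student: summarize(values, TRUE_LENGTH_CM)
           for student, values in trials.items()}
most_accurate = min(summary, key=lambda s: abs(summary[s]["bias"]))

for student, stats in summary.items():
    print(student, {name: round(value, 3) for name, value in stats.items()})
print("smallest bias:", most_accurate)
```

With these numbers, Students A and C show the same range and standard deviation, while Student C has the smallest bias.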
OUTLYING RESULTS
Outliers: occasionally, observed analytical results are very different from those
expected. Aberrant, anomalous, contaminated, discordant, spurious, suspicious or
wild observations; and flyers, rogues, and mavericks are properly called outlying
results. Like all laboratory results, these outliers must be documented, interpreted,
and managed. Such results may be accurate measurements of the entity being
measured, but are very different from what is expected.
Outliers, in statistics, refer to relatively small or large values which are considered
to be different from, and not belong to, the main body of data. The problem of what
to do with outliers is a constant dilemma facing research scientists. If the cause of
an outlier is known, resulting from an obvious error, for example, the value can be
omitted from the analysis and tabulation of the data.
Factors to be considered when investigating an outlying result include, but are not
limited to, human error, instrumentation error, calculation error, and product or
component deficiency. If an assignable cause that is not related to a product or component
deficiency can be identified, then retesting may be performed on the same
sample, if possible, or on a new sample.
The precision and accuracy of the method, the Reference Standard, process
trends, and the specification limits should all be examined. Data may be
invalidated, based on this documented investigation, and eliminated from
subsequent calculations.
Outlier identification is the use of statistical significance tests to confirm that
the values are inconsistent with the known or assumed statistical model.
When used appropriately, outlier tests are valuable tools for pharmaceutical
laboratories. Several tests exist for detecting outliers; three of these procedures
are the Extreme Studentized Deviate (ESD) Test, Dixon's Test, and Hampel's Rule.
Outlier rejection is the actual removal of the identified outlier from the data set.
However, an outlier test cannot be the sole means for removing an outlying result
from the laboratory data.
All data, especially outliers, should be kept for future review. Unusual data, when
seen in the context of other historical data, are often not unusual after all but reflect
the influences of additional sources of variation.
An outlier test may be useful as part of the evaluation of the significance of that
result, along with other data. Outlier tests have no applicability in cases where the
variability in the product is what is being assessed, such as content uniformity,
dissolution, or release-rate determination. In these applications, a value
determined to be an outlier may in fact be an accurate result of a non uniform
product.
In summary, the rejection or retention of an apparent outlier can be a serious
source of bias. An outlier test can never take the place of a thorough laboratory
investigation. Rather, it is performed only when the investigation is inconclusive
and no deviations in the manufacture or testing of the product were noted.
Given the following set of 10 measurements: 100.0, 100.1, 100.3, 100.0, 99.7,
99.9, 100.2, 99.5, 100.0, and 95.7 (mean = 99.5, standard deviation = 1.369) are
there any outliers?
Dixon-Type Tests
Stage 1 (n = 10): The results are ordered on the basis of their magnitude (i.e., $X_n$
is the largest observation, $X_{n-1}$ is the second largest, etc., and $X_1$ is the smallest
observation). Dixon's Test uses different ratios depending on the sample size. In this
example, with n = 10, to declare $X_1$ an outlier, the following ratio, $r_{11}$, is calculated
by the formula:
$$r_{11} = \frac{X_2 - X_1}{X_{n-1} - X_1}$$
If $r_{11} > Q_{table}$, where $Q_{table}$ is a reference value corresponding to the sample
size and confidence level, then reject the questionable point. Note that only
one point may be rejected from a data set using this test.
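For the ten measurements in this example, the ratio can be computed as in the sketch below. The function names are illustrative assumptions, and the critical value is deliberately left as an input to be looked up in a Dixon table for the chosen sample size and confidence level; no table value is assumed here.

```python
def dixon_r11_low(values):
    """Dixon's r11 ratio for testing whether the smallest value is an outlier."""
    x = sorted(values)
    if len(x) < 4:
        raise ValueError("need at least 4 values")
    denominator = x[-2] - x[0]  # second largest minus smallest
    if denominator == 0:
        raise ValueError("ratio is undefined when the denominator is zero")
    return (x[1] - x[0]) / denominator  # second smallest minus smallest


def exceeds_critical(ratio, q_table):
    """Apply the decision rule: flag the point when the ratio exceeds Q_table."""
    return ratio > q_table


measurements = [100.0, 100.1, 100.3, 100.0, 99.7, 99.9, 100.2, 99.5, 100.0, 95.7]
r11 = dixon_r11_low(measurements)
print(f"r11 = {r11:.3f}")  # (99.5 - 95.7) / (100.2 - 95.7)
```

Here the ratio is 3.8 / 4.5, or about 0.84, which is then compared with the tabulated value before any decision about 95.7 is made.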
Ishikawa diagram
Ishikawa diagrams (also called fishbone diagrams, cause-and-effect diagrams) are causal
diagrams created by Kaoru Ishikawa (1968) that show the causes of a specific event. Common
uses of the Ishikawa diagram are product design and quality defect prevention to identify potential
factors causing an overall effect. Each cause or reason for imperfection is a source of variation.
Causes are usually grouped into major categories to identify these sources of variation.

Control Sample: A control sample is defined as a homogeneous and
stable sample that is tested at specific intervals sufficient to monitor the
performance of the method for which it was established. Test data from a
control sample can be used to monitor the method variability or be used as
part of system suitability requirements. The control sample should be
essentially the same as the test sample and should be treated similarly
whenever possible. A control chart can be constructed and used to monitor
the method performance.
A control chart (also called process chart or quality control chart) is
a graph that shows whether a sample of data falls within the common
or normal range of variation. A control chart has upper and lower
control limits that separate common from assignable causes of
variation. The common range of variation is defined by the use of
control chart limits. We say that a process is out of control when a
plot of data reveals that one or more samples fall outside the control
limits.
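To construct the upper and lower control limits of the chart, we use the following
formulas:
$$\mathrm{UCL} = \bar{\bar{x}} + z\,\sigma_{\bar{x}}$$
$$\mathrm{LCL} = \bar{\bar{x}} - z\,\sigma_{\bar{x}}$$
Where, $\bar{\bar{x}}$ = mean of the sample means or a target value set for the process
z = number of normal standard deviations
$\sigma_{\bar{x}}$ = standard deviation of the sample means
= $\sigma / \sqrt{n}$
$\sigma$ = population standard deviation
n = sample size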
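A short sketch of the control-limit calculation, assuming a center line of 100.0, a population standard deviation of 0.4, samples of size 4, and z = 3 (all hypothetical values chosen for illustration):

```python
import math


def xbar_control_limits(center, sigma, n, z=3.0):
    """Return (LCL, UCL) for a chart of sample means."""
    sigma_xbar = sigma / math.sqrt(n)  # standard deviation of the sample means
    return center - z * sigma_xbar, center + z * sigma_xbar


def out_of_control(sample_means, lcl, ucl):
    """List the sample means that fall outside the control limits."""
    return [m for m in sample_means if m < lcl or m > ucl]


lcl, ucl = xbar_control_limits(center=100.0, sigma=0.4, n=4, z=3.0)
sample_means = [100.1, 99.8, 100.7, 99.9]  # hypothetical control-sample means
flagged = out_of_control(sample_means, lcl, ucl)
print(f"LCL = {lcl:.1f}, UCL = {ucl:.1f}, flagged = {flagged}")
```

With these assumed inputs the limits are 99.4 and 100.6, so only the sample mean of 100.7 is flagged.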
A Pareto chart, named after Vilfredo Pareto, is a type of chart that contains
both bars and a line graph, where individual values are represented in descending
order by bars, and the cumulative total is represented by the line.
The left vertical axis is the frequency of occurrence, but it can alternatively
represent cost or another important unit of measure. The right vertical axis is the
cumulative percentage of the total number of occurrences, total cost, or total of the
particular unit of measure. Because the reasons are in decreasing order, the
cumulative function is a concave function. To take the example below, in order to
lower the amount of late arrivals by 78%, it is sufficient to solve the first three
issues.
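The two axes of a Pareto chart come from a simple calculation: sort the causes by frequency, then accumulate the percentages. The sketch below uses made-up causes and counts purely for illustration; they are not the data behind the slide's late-arrival example.

```python
def pareto_table(counts):
    """Return (cause, count, cumulative %) rows in descending order of count."""
    total = sum(counts.values())
    rows, running = [], 0
    for cause, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += count
        rows.append((cause, count, 100.0 * running / total))
    return rows


# Hypothetical counts of occurrences per cause.
counts = {"cause_c": 15, "cause_a": 50, "cause_d": 5, "cause_b": 30}
rows = pareto_table(counts)
for cause, count, cumulative in rows:
    print(f"{cause:8s} {count:3d} {cumulative:6.1f} %")
```

The bars would be drawn from the counts and the line from the cumulative percentages.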
A scatter graph is a type of mathematical diagram using Cartesian coordinates to
display values for typically two variables for a set of data. If the points are
color-coded, you can increase the number of displayed variables to three.
The data is displayed as a collection of points, each having the value of one
variable determining the position on the horizontal axis and the value of the other
variable determining the position on the vertical axis. This kind of plot is also called
a scatter chart, scatter gram, scatter diagram, or scatter graph.
Session 3: Comparison of Results
COMPARISON OF ANALYTICAL METHODS

Accuracy:
How close you are to the actual value
Depends on the person measuring
Calculated by the formula: $\%\ \text{Error} = \frac{YV - AV}{AV} \times 100$
Where: YV is YOUR measured Value & AV is the Accepted Value
Precision:
How finely tuned your measurements are or how close they can be to each other
Depends on the measuring tool
Determined by the number of significant digits

Accuracy & Precision may be demonstrated by shooting at a target.
Accuracy is represented by hitting the bull's-eye (the accepted value)
Precision is represented by a tight grouping of shots (they are finely tuned)
Accuracy - Calculating % Error: How Close Are You to the Accepted Value
(Bull's-Eye)?
If a student measured the room width at 8.46 m and the accepted value was 9.45 m, what was
their accuracy?
Using the formula: $\%\ \text{Error} = \frac{YV - AV}{AV} \times 100$
Where YV is the student's measured value & AV is the accepted value
Since YV = 8.46 m, AV = 9.45 m
% Error = (8.46 m - 9.45 m) x 100 / 9.45 m
= -0.99 m x 100 / 9.45 m
= -99 m / 9.45 m
= -10.5 %
Note that the meter unit cancels during the division & the unit is %. The (-) shows that YV
was low
The student was off by almost 11% & must remeasure
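The same calculation as a small Python function (the name `percent_error` is an illustrative choice):

```python
def percent_error(measured, accepted):
    """Percent error of a measured value relative to the accepted value."""
    return (measured - accepted) * 100 / accepted


error = percent_error(8.46, 9.45)
print(f"{error:.1f} %")  # a negative sign means the measured value was low
```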
t-Test
Looks at differences between two groups on some variable of interest
Ex: Do males and females differ in the number of hours they spend
shopping in a given month?
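In a laboratory setting the two groups might be results from two analytical methods. The sketch below computes a two-sample t statistic using the pooled-variance (equal variances) form, which is one of several variants and is an assumption here; the data are hypothetical. The statistic would then be compared with a tabulated t value for the stated degrees of freedom.

```python
import math
import statistics


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)  # sample variances
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(a) - statistics.mean(b)) / std_error, df


# Hypothetical results from two methods applied to the same material.
method_a = [98.0, 99.0, 100.0, 101.0, 102.0]
method_b = [99.0, 100.0, 101.0, 102.0, 103.0]
t_stat, df = pooled_t(method_a, method_b)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```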
ANOVA
When results of laboratories or methods are compared where more than one
factor can be of influence and must be distinguished from random effects,
ANOVA is a powerful statistical tool. Examples of such factors are
different analysts, samples with different pre-treatments, different analyte levels,
and different methods within one of the laboratories. Most statistical packages for the
PC can perform this analysis.
What is ANOVA?
A statistical method for testing whether the means of a dependent
variable are equal across two or more groups (i.e., it assesses the probability
that any differences in means across the groups are due solely to sampling error).
Variables in ANOVA (Analysis of Variance):
Dependent variable is metric.
Independent variable(s) is nominal with two or more levels
also called treatment, manipulation, or factor.
One-way ANOVA: only one independent variable with two or more
levels.
Two-way ANOVA: two independent variables each with two or
more levels.
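A one-way ANOVA reduces to comparing the variation between group means with the variation within groups. The sketch below computes the F statistic from first principles for three hypothetical analysts (the data and the function name are assumptions for illustration); the resulting F would be compared with a tabulated F value for the two degrees of freedom.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual results around their own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical results: one factor (analyst) with three levels.
analysts = [
    [99.0, 100.0, 101.0],
    [100.0, 101.0, 102.0],
    [101.0, 102.0, 103.0],
]
f_stat, df_between, df_within = one_way_anova_f(analysts)
print(f"F = {f_stat:.2f} on ({df_between}, {df_within}) degrees of freedom")
```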
Measurement is the first step that leads to control and eventually to
improvement. If you can't measure something, you can't
understand it. If you can't understand it, you can't control it. If you
can't control it, you can't improve it.
Thank you
