eCerto is an online tool written in R/Shiny to facilitate the statistical evaluation of data collected in connection with the production of certified reference materials (CRMs) at Bundesanstalt für Materialforschung und -prüfung (BAM).
The standard workflow consists of a certification trial in combination with homogeneity and stability assays. Data from all of these assays need to be collected, combined and analyzed to report certified values of measured properties or entities of a reference material.
The tool provides some example data to test the functionality on the Start
page. Here, previous analyses can be read from a backup file and will contain all input data and previously set parameters.
A new analysis can be started by importing data from Excel and should be stored after statistical analysis, together with your user name and a trial ID, as an RData file (Backup).
The statistics for reference material certification with eCerto are implemented according to ISO GUIDE 35:2017.
To make the calculations transparent for the user and facilitate using this tool, the relevant section of the documentation (from page Help
) can be obtained in a modal/pop-up window by clicking on the respective links.
Links should be rendered in blue color on your screen. Often, table or figure captions are used for this purpose, so watch out for help-links close to the area you are working on.
You'll find a working example on page Start
(--> Load Test Data). To upload your own data, tables have to be prepared in Excel using specific cell formats, column names and column orders. To learn about these formats, click on the word example right above the select box. Magic: the help section opened by this link depends on the value selected in the box.
Note! To close a help window you need to click anywhere outside of it with your mouse.
To provide a consistent layout, input sections (buttons, select boxes and others) are usually indicated by a grey background. Output sections (tables and figures) should render on a white background.
To process a new analysis, please collect all data returned by your certification trial partners in pre-defined Excel files and store them within one folder. Pre-defined means that the analytical results are at the same position within each file (table name, row and column range) and that exactly one file per laboratory exists.
Note! Please select the range without a header as shown in the figure below. eCerto assumes the first and second column of the specified range to contain analyte names and units respectively and replicate measurements of this analyte in all remaining columns.
File names will be stored for reference but imported labs will be consecutively numbered. In the example above you would import data from sheet 1 of each Excel file located in Range A7:H9, containing 6 replicate measurements of 3 different analytes (column A) measured in units as specified in column B.
Note! Currently, eCerto needs to load all Excel result files in one step, i.e. you cannot add additional files later on but would have to start all over again. However, this is reasonable as the statistics will change upon addition of new values, which in turn might require adjustments in rounding precision or filtered labs.
For analytes under inspection (i.e. which are selected in Tab.C3), several analyte specific parameters can be set and will be stored also in the backup file.
pooling means that calculations in the material table (mean, sd and uncertainty columns) are not based on the lab means but rather on all measured values. This is justified when the between-lab variance is smaller than the within-lab variance (check whether the ANOVA P-value in Tab.C2 is insignificant; if so, pooling might be allowed).
Selecting this option will also affect n in Tab.C3, which is then either the number of included labs or the number of finite data points from these labs.
Individual samples can be filtered by ID (select show IDs in Fig.C1 to identify outlying IDs). These samples will not be used in any calculations. They are removed from downstream processing right after import. However, information on this filtering step is preserved and included in reports.
Labs can be excluded by Lab-ID. Please note, that this is done after statistical testing for mean and variance outliers (in Tab.C1 and Tab.C2). Filtered labs are depicted in Fig.C1 (using grey color), but excluded from calculating the certified value of the analyte in Tab.C3.
Note! Internally, all calculations are performed on non-rounded values (as imported). Only the visual output to the user is rounded to improve readability. Because analytes within an RM might be measured at very different scales, it is possible to define analyte-specific precision values for rounding. The Precision (Tables) value is used for most values displayed in tables.
The only exception is Tab.C3. Here, Precision (Cert. Val.) is used to round \(\mu_c\) and \(U_{abs}\) to comply with DIN/ISO regulations on rounding, and columns containing relative uncertainties are rounded to a fixed 4-digit precision.
The user can set the rounding precision for \(\mu_c\) and \(U_{abs}\) and even specify negative numbers for Precision (Cert. Val.) to indicate a desired rounding digit left of the decimal separator (i.e. \(-1\) to round an uncertainty of 13.1 up to 20). If the uncertainty is even larger, the user should consider uploading the data using a different (more appropriate) unit, for instance switching from mg/L to g/L.
eCerto suggests the rounding precision for \(\mu_c\) and \(U_{abs}\) according to DIN 1333. The suggested value is displayed right above the input box of Precision (Cert. Val.). Its background is colored green if it is in accordance with the user-selected value and red otherwise.
The significant digit is determined based on the first non-zero digit in \(U_{abs}\). The position will be shown above the analyte parameter Precision (Cert. Val.) upon selection of an analyte in Tab.C3. Once found, the respective digit of \(\mu_c\) is rounded up if the consecutive digit is \(\ge 5\) and down otherwise. For \(U_{abs}\) the rounding procedure is a bit more complicated: if the respective digit is 1 or 2, then the consecutive digit is rounded up; if the respective digit is \(\ge 3\), then it is rounded up itself.
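The following minimal R sketch illustrates how such a DIN 1333 precision suggestion could be derived; function names are illustrative and this is not the eCerto implementation itself.
# Suggest a rounding precision from U_abs according to DIN 1333 and round
# mu_c and U_abs accordingly (illustrative sketch).
din1333_precision <- function(U_abs) {
  pos <- -floor(log10(abs(U_abs)))   # position of the first non-zero digit
  # if that digit is 1 or 2, the following digit is significant as well
  if (floor(abs(U_abs) * 10^pos) %in% c(1, 2)) pos <- pos + 1
  pos
}
round_din1333 <- function(mu_c, U_abs) {
  k <- din1333_precision(U_abs)
  # mu_c is rounded to k digits, the uncertainty is always rounded up
  c(cert_val = round(mu_c, k), U_abs = ceiling(U_abs * 10^k) / 10^k)
}
round_din1333(mu_c = 4.823, U_abs = 0.131)  # suggests precision 2: 4.82 and 0.14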
For an analyte selected in Tab.C3 various statistical tests regarding lab means and variance will be performed according to ISO 17025, and outlying values will be indicated if observed at the .05 and .01 level, respectively. The displayed values n.s. and . indicate either that the test probability was >0.05 or that this lab was not tested because it did not contain an extreme value.
Note! eCerto performs outlier tests at the lower and upper end of the value distribution independently (i.e. two one-sided tests). This might lead to confusion as P-values for both independent tests are displayed within the same column of Tab.C1; however, it keeps the output compact. Please always use the results of Tab.C1 together with Fig.C1 to decide on the removal of labs.
Except for Cochran, which tests for outliers with respect to variance, all other columns indicate potential outliers regarding lab means.
To compute the statistical tests eCerto uses functions from different packages available for R, e.g. the agricolae package. Details regarding the conducted statistical tests (implementation, parameters) can be found using the following links.
excl indicates that the sd of a lab was too low and the lab was removed from the testing procedure.
Note! Most tests require at least 3 data points (labs) and finite differences in lab means and lab variances. If these conditions are not fulfilled, Error might be reported instead of a P-value.
Instead of the qualitative significance levels reached in the test, the options panel to the right of Tab.C1 offers the possibility to display the calculated P-values directly for most of the tests. Further, the calculated test statistic values can be displayed. The user can also select to show the critical values at \(\alpha=0.05\) and \(\alpha=0.01\). Depending on the test these are calculated based on the underlying distributions or obtained from tabulated values.
If any lab is excluded after using the filter section of the analyte panel on top of the page, all tests in Tab.C1 can be conducted without the filtered labs by selecting the option Exclude filtered Labs. This feature allows confirming whether or not further outliers are present in the data set. Please use the possibility to remove outliers sequentially with caution, i.e. refrain from over-exclusion of initially masked outliers.
The distribution of lab means is evaluated using a variety of recommended tests. Normality of the distribution is tested using the KS-Test. Besides mean and sd, the robust alternatives median and MAD are provided. Columns which end on _p provide P-values of the respective tests. Skewness and Kurtosis are computed additionally and grouped with their respective tests (Agostino and Anscombe).
To compute the statistical tests eCerto uses functions from different packages available for R. Details regarding the conducted statistical test can be found using the following links:
Calculations respect the pooling option for this analyte. Skewness and Kurtosis are computed using a two-sided alternative.
Note! Some tests need a minimum number of replicates and will not yield a result if this criterion is not met (e.g. Agostino).
The KS test is used to compare the distribution of lab means against a normal distribution. Besides the KS test P-value, deviation of the data from a normal distribution can also be visually investigated by opening a QQ-plot using the respective link.
Lab means and their measurement distributions are depicted in a standard graphic layout which can be exported as a vector graphic (PDF) for further editing. Labs which have been identified as outliers in Tab.C1 can be excluded from the overall mean calculation in the analyte parameter section at the top of the page. This decision has to be made by the user; eCerto will not remove any outlier automatically. However, eCerto keeps track of the removal and indicates the omitted lab values in the plot as grey data points.
Besides width and height adjustment, the plot can be modified to include individual measurements using their ID (Show sample IDs). This can be useful to identify potential individual outlier samples. Further, the x-axis annotation can be altered to show anonymous Lx values or the imported file names as labels.
A report of the above analysis (single analyte) or for the material (all analytes) can be generated and exported in HTML format.
Note! The current report layouts are for demonstration purposes only. Very likely, we will implement a set of recommended report layouts in the future upon user suggestions.
Tests used for outlier detection often rely on tabulated critical values. Decisions to reject or not reject a potential outlier can be dependent on the source of the critical values.
Wherever possible, we tried to avoid using fixed tables and rather implemented the critical value calculation ab initio. However, to be transparent, we detail the mathematical approach for each test in the following.
Note! Several test functions were implemented based on the functions in the R package outliers by Lukasz Komsta. Modifications were made to allow testing up to n=100 laboratories (was restricted to n=30 in package outliers).
The critical value is calculated internally using the function eCerto:::qcochran:
## function(p, n, k) {
## f <- stats::qf(p / k, (n - 1) * (k - 1), n - 1)
## c <- 1 / (1 + (k - 1) * f)
## return(c)
## }
where the parameters p, n and k define the desired alpha level, the number of replicates per group and the number of groups, respectively. stats::qf() is the quantile function of the F-distribution as implemented in the R stats package. Likewise, stats::pf() is used to compute the Cochran P-value via the function eCerto:::pcochran:
## function(q, n, k) {
## f <- (1 / q - 1) / (k - 1)
## p <- 1 - stats::pf(f, (n - 1) * (k - 1), n - 1) * k
## p[p < 0] <- 0
## p[p > 1] <- 1
## return(p)
## }
For comparison with critical values tabulated elsewhere you can generate a respective table in R. Show code
ps <- c(0.01, 0.05)
ns <- c(3:9, seq(10,20,5))
ks <- c(2:14, seq(15,30,5))
out <- lapply(ps, function(p) {
sapply(ns, function(n) {
sapply(ks, function(k) {
eCerto:::qcochran(p=p, n=n, k=k)
})
})
})
names(out) <- paste0("alpha=", ps)
lapply(out, function(x) {
colnames(x) <- paste0("n=", ns)
rownames(x) <- paste0("k=", ks)
round(x, 4)
})
Show Cochran critical values table
## $`alpha=0.01`
## n=3 n=4 n=5 n=6 n=7 n=8 n=9 n=10 n=15 n=20
## k=2 0.9950 0.9794 0.9586 0.9373 0.9172 0.8988 0.8823 0.8674 0.8113 0.7744
## k=3 0.9423 0.8832 0.8335 0.7933 0.7606 0.7335 0.7107 0.6912 0.6241 0.5838
## k=4 0.8643 0.7814 0.7212 0.6761 0.6410 0.6129 0.5897 0.5702 0.5054 0.4678
## k=5 0.7885 0.6957 0.6329 0.5875 0.5531 0.5259 0.5038 0.4853 0.4251 0.3907
## k=6 0.7218 0.6258 0.5635 0.5195 0.4866 0.4609 0.4401 0.4229 0.3672 0.3360
## k=7 0.6644 0.5685 0.5080 0.4659 0.4347 0.4105 0.3911 0.3751 0.3237 0.2950
## k=8 0.6152 0.5210 0.4627 0.4227 0.3932 0.3705 0.3523 0.3373 0.2896 0.2632
## k=9 0.5727 0.4810 0.4251 0.3870 0.3592 0.3378 0.3207 0.3067 0.2622 0.2377
## k=10 0.5358 0.4469 0.3934 0.3572 0.3308 0.3106 0.2945 0.2814 0.2397 0.2169
## k=11 0.5036 0.4175 0.3663 0.3318 0.3068 0.2876 0.2725 0.2601 0.2209 0.1995
## k=12 0.4751 0.3919 0.3428 0.3099 0.2861 0.2680 0.2536 0.2419 0.2049 0.1848
## k=13 0.4498 0.3695 0.3223 0.2909 0.2682 0.2509 0.2373 0.2261 0.1911 0.1721
## k=14 0.4272 0.3495 0.3042 0.2741 0.2525 0.2360 0.2230 0.2124 0.1792 0.1612
## k=15 0.4069 0.3318 0.2882 0.2593 0.2386 0.2228 0.2104 0.2003 0.1687 0.1516
## k=20 0.3297 0.2654 0.2288 0.2048 0.1877 0.1748 0.1646 0.1564 0.1308 0.1170
## k=25 0.2782 0.2220 0.1904 0.1699 0.1553 0.1443 0.1357 0.1287 0.1071 0.0956
## k=30 0.2412 0.1914 0.1635 0.1455 0.1327 0.1231 0.1156 0.1096 0.0909 0.0809
##
## $`alpha=0.05`
## n=3 n=4 n=5 n=6 n=7 n=8 n=9 n=10 n=15 n=20
## k=2 0.9750 0.9392 0.9057 0.8772 0.8534 0.8332 0.8159 0.8010 0.7487 0.7164
## k=3 0.8709 0.7977 0.7457 0.7070 0.6770 0.6531 0.6333 0.6167 0.5613 0.5289
## k=4 0.7679 0.6839 0.6287 0.5894 0.5598 0.5365 0.5175 0.5018 0.4500 0.4205
## k=5 0.6838 0.5981 0.5440 0.5063 0.4783 0.4564 0.4387 0.4241 0.3767 0.3500
## k=6 0.6161 0.5321 0.4803 0.4447 0.4184 0.3980 0.3817 0.3682 0.3247 0.3004
## k=7 0.5612 0.4800 0.4307 0.3972 0.3726 0.3536 0.3384 0.3259 0.2858 0.2635
## k=8 0.5157 0.4377 0.3910 0.3594 0.3362 0.3185 0.3043 0.2927 0.2556 0.2350
## k=9 0.4775 0.4027 0.3584 0.3285 0.3067 0.2901 0.2768 0.2659 0.2313 0.2122
## k=10 0.4450 0.3733 0.3311 0.3028 0.2823 0.2666 0.2541 0.2439 0.2115 0.1936
## k=11 0.4169 0.3482 0.3080 0.2811 0.2616 0.2468 0.2350 0.2254 0.1949 0.1782
## k=12 0.3924 0.3264 0.2880 0.2624 0.2440 0.2299 0.2187 0.2096 0.1808 0.1650
## k=13 0.3709 0.3074 0.2707 0.2463 0.2286 0.2152 0.2046 0.1960 0.1687 0.1538
## k=14 0.3517 0.2907 0.2554 0.2321 0.2153 0.2025 0.1924 0.1841 0.1582 0.1440
## k=15 0.3346 0.2758 0.2419 0.2195 0.2034 0.1912 0.1815 0.1737 0.1489 0.1355
## k=20 0.2705 0.2205 0.1921 0.1735 0.1602 0.1502 0.1422 0.1358 0.1157 0.1048
## k=25 0.2281 0.1846 0.1601 0.1441 0.1328 0.1242 0.1174 0.1120 0.0949 0.0857
## k=30 0.1979 0.1593 0.1377 0.1236 0.1137 0.1061 0.1003 0.0955 0.0806 0.0727
The test is originally based on:
"Snedecor, G.W., Cochran, W.G. (1980). Statistical Methods (seventh edition). Iowa State University Press, Ames, Iowa".
The Dixon test statistic is calculated depending on the number of labs (n) using the function eCerto:::dixon.test
## function(x, opposite = FALSE, two.sided = FALSE) {
## x <- sort(x[stats::complete.cases(x)])
## n <- length(x)
## if (n <= 7) {
## type <- "10"
## } else if (n > 7 & n <= 10) {
## type <- "11"
## } else if (n > 10 & n <= 13) {
## type <- "21"
## } else {
## type <- "22"
## }
## if (xor(((x[n] - mean(x)) < (mean(x) - x[1])), opposite)) {
## alt <- paste("lowest value", x[1], "is an outlier")
## q <- switch(type,
## "10" = (x[2] - x[1]) / (x[n] - x[1]),
## "11" = (x[2] - x[1]) / (x[n - 1] - x[1]),
## "21" = (x[3] - x[1]) / (x[n - 1] - x[1]),
## (x[3] - x[1]) / (x[n - 2] - x[1])
## )
## } else {
## alt <- paste("highest value", x[n], "is an outlier")
## q <- switch(type,
## "10" = (x[n] - x[n - 1]) / (x[n] - x[1]),
## "11" = (x[n] - x[n - 1]) / (x[n] - x[2]),
## "21" = (x[n] - x[n - 2]) / (x[n] - x[2]),
## (x[n] - x[n - 2]) / (x[n] - x[3])
## )
## }
## pval <- pdixon(q, n)
## if (two.sided) {
## pval <- 2 * pval
## if (pval > 1) pval <- 2 - pval
## }
## out <- list(
## "statistic" = c(Q = q),
## "alternative" = alt,
## "p.value" = pval,
## "method" = "Dixon test for outliers",
## "data.name" = "eCerto internal"
## )
## class(out) <- "htest"
## return(out)
## }
In eCerto the one-sided version of the Dixon test is applied consecutively at the lower and upper end. Critical values are stored in tabulated form internally (eCerto::cvals_Dixon) and have been combined in a single table from this source. Show Dixon critical values table
## 0.001 0.002 0.005 0.01 0.02 0.05 0.1 0.2
## 3 0.999 0.998 0.994 0.988 0.976 0.941 0.886 0.782
## 4 0.964 0.949 0.921 0.889 0.847 0.766 0.679 0.561
## 5 0.895 0.869 0.824 0.782 0.729 0.643 0.559 0.452
## 6 0.822 0.792 0.744 0.698 0.646 0.563 0.484 0.387
## 7 0.763 0.731 0.681 0.636 0.587 0.507 0.433 0.344
## 8 0.799 0.769 0.724 0.682 0.633 0.554 0.480 0.386
## 9 0.750 0.720 0.675 0.634 0.586 0.512 0.441 0.352
## 10 0.713 0.683 0.637 0.597 0.551 0.477 0.409 0.325
## 11 0.770 0.746 0.708 0.674 0.636 0.575 0.518 0.445
## 12 0.739 0.714 0.676 0.643 0.605 0.546 0.489 0.420
## 13 0.713 0.687 0.649 0.617 0.580 0.522 0.467 0.399
## 14 0.732 0.708 0.672 0.640 0.603 0.546 0.491 0.422
## 15 0.708 0.685 0.648 0.617 0.582 0.524 0.470 0.403
## 16 0.691 0.667 0.630 0.598 0.562 0.505 0.453 0.386
## 17 0.671 0.647 0.611 0.580 0.545 0.489 0.437 0.373
## 18 0.652 0.628 0.594 0.564 0.529 0.475 0.424 0.361
## 19 0.640 0.617 0.581 0.551 0.517 0.462 0.412 0.349
## 20 0.627 0.604 0.568 0.538 0.503 0.450 0.401 0.339
## 25 0.574 0.550 0.517 0.489 0.457 0.406 0.359 0.302
## 30 0.539 0.517 0.484 0.456 0.425 0.376 0.332 0.278
## 35 0.511 0.490 0.459 0.431 0.400 0.354 0.311 0.260
## 40 0.490 0.469 0.438 0.412 0.382 0.337 0.295 0.246
## 45 0.475 0.454 0.423 0.397 0.368 0.323 0.283 0.234
## 50 0.460 0.439 0.410 0.384 0.355 0.312 0.272 0.226
## 60 0.437 0.417 0.388 0.363 0.336 0.294 0.256 0.211
## 70 0.422 0.403 0.374 0.349 0.321 0.280 0.244 0.201
## 80 0.408 0.389 0.360 0.337 0.310 0.270 0.234 0.192
## 90 0.397 0.377 0.350 0.326 0.300 0.261 0.226 0.185
## 100 0.387 0.368 0.341 0.317 0.292 0.253 0.219 0.179
Values for non-tabulated combinations of p and n are derived by interpolation using function eCerto:::qtable
## function(p, probs, quants) {
## quants <- quants[order(probs)]
## probs <- sort(probs)
## res <- vector()
## for (n in 1:length(p)) {
## pp <- p[n]
## if (pp <= probs[1]) {
## q0 <- quants[c(1, 2)]
## p0 <- probs[c(1, 2)]
## fit <- stats::lm(q0 ~ p0)
## } else if (pp >= probs[length(probs)]) {
## q0 <- quants[c(length(quants) - 1, length(quants))]
## p0 <- probs[c(length(probs) - 1, length(probs))]
## fit <- stats::lm(q0 ~ p0)
## } else {
## x0 <- which(abs(pp - probs) == min(abs(pp - probs)))
## x1 <- which(abs(pp - probs) == sort(abs(pp - probs))[2])
## x <- min(c(x0, x1))
## if (x == 1) {
## x <- 2
## }
## if (x > length(probs) - 2) {
## x <- length(probs) - 2
## }
## i <- c(x - 1, x, x + 1, x + 2)
## q0 <- quants[i]
## p0 <- probs[i]
## fit <- stats::lm(q0 ~ poly(p0, 3))
## }
## res <- c(res, stats::predict(fit, newdata = list(p0 = pp)))
## }
## return(res)
## }
The test is originally based on:
"Dixon, W.J. (1950). Analysis of extreme values. Ann. Math. Stat. 21, 4, 488-506"
"Dean, R.B.; Dixon, W.J. (1951). Simplified statistics for small numbers of observations. Anal.Chem. 23, 636-638"
"Dixon, W.J. (1953). Processing data for outliers. J. Biometrics. 9, 74-89"
The Grubbs test statistic is calculated either for a single or a double outlier at both ends of the distribution consecutively using eCerto:::grubbs.test
## function(x, type = 10, tail = c("lower", "upper")) {
## tail <- match.arg(tail)
## if (!any(type == c(10, 20))) {
## message("'grubbs.test' is only implemented for type = 10 or 20. Using type = 20 (double Grubbs).")
## }
## x <- sort(x[stats::complete.cases(x)])
## n <- length(x)
## if (type == 10) {
## # single Grubbs
## if (tail == "lower") {
## alt <- paste("lowest value", x[1], "is an outlier")
## o <- x[1]
## d <- x[2:n]
## } else {
## alt <- paste("highest value", x[n], "is an outlier")
## o <- x[n]
## d <- x[1:(n - 1)]
## }
## g <- abs(o - mean(x)) / stats::sd(x)
## u <- stats::var(d) / stats::var(x) * (n - 2) / (n - 1)
## pval <- 1 - pgrubbs(g, n, type = 10)
## method <- "Grubbs test for one outlier"
## } else {
## # double Grubbs
## if (tail == "lower") {
## alt <- paste("lowest values", x[1], "and", x[2], "are outliers")
## u <- stats::var(x[3:n]) / stats::var(x) * (n - 3) / (n - 1)
## } else {
## alt <- paste("highest values", x[n - 1], "and", x[n], "are outliers")
## u <- stats::var(x[1:(n - 2)]) / stats::var(x) * (n - 3) / (n - 1)
## }
## g <- NULL
## pval <- pgrubbs(u, n, type = 20)
## method <- "Grubbs test for two outliers"
## }
## out <- list(
## "statistic" = c(G = g, U = u),
## "alternative" = alt,
## "p.value" = pval,
## "method" = method,
## "data.name" = "eCerto internal"
## )
## class(out) <- "htest"
## return(out)
## }
Critical values for the single Grubbs test are calculated internally using eCerto:::qgrubbs
## function(p, n, type = 10, rev = FALSE) {
## if (type == 10) {
## if (!rev) {
## t2 <- stats::qt((1 - p) / n, n - 2)^2
## return(((n - 1) / sqrt(n)) * sqrt(t2 / (n - 2 + t2)))
## } else {
## s <- (p^2 * n * (2 - n)) / (p^2 * n - (n - 1)^2)
## t <- sqrt(s)
## if (is.nan(t)) {
## res <- 0
## } else {
## res <- n * (1 - stats::pt(t, n - 2))
## res[res > 1] <- 1
## }
## return(1 - res)
## }
## } else {
## if (n > 30) warning("[qgrubbs] critical value is estimated for n>30")
## gtwo <- eCerto::cvals_Grubbs2
## pp <- as.numeric(colnames(gtwo))
## if (!rev) res <- qtable(p, pp, gtwo[n - 3, ]) else res <- qtable(p, gtwo[n - 3, ], pp)
## res[res < 0] <- 0
## res[res > 1] <- 1
## return(unname(res))
## }
## }
For comparison with critical values tabulated elsewhere you can generate a respective table in R. Show code
ps <- c(0.01, 0.025, 0.05, 0.1)
ns <- c(3:10, seq(20,100,40))
out <- sapply(ps, function(p) {
sapply(ns, function(n) {
eCerto:::qgrubbs(p=p, n=n)
})
})
colnames(out) <- paste0("a=", ps)
rownames(out) <- paste0("n=", ns)
round(out, 4)
Show Grubbs (single) critical values table
## a=0.01 a=0.025 a=0.05 a=0.1
## n=3 0.5878 0.6033 0.6289 0.6787
## n=4 0.7575 0.7688 0.7875 0.8250
## n=5 0.8863 0.8961 0.9123 0.9452
## n=6 0.9892 0.9981 1.0131 1.0435
## n=7 1.0744 1.0828 1.0969 1.1258
## n=8 1.1469 1.1549 1.1685 1.1962
## n=9 1.2098 1.2176 1.2307 1.2576
## n=10 1.2652 1.2728 1.2856 1.3118
## n=20 1.6118 1.6184 1.6297 1.6529
## n=60 2.0999 2.1057 2.1155 2.1358
## n=100 2.3040 2.3095 2.3188 2.3381
Critical values for the double Grubbs test are stored in tabulated form internally (eCerto::cvals_Grubbs2
) and P-values are derived by comparison. Show Grubbs (double) critical values table
## 0.01 0.025 0.05 0.1
## 4 0.0000 0.0002 0.0008 0.0031
## 5 0.0035 0.0090 0.0183 0.0376
## 6 0.0186 0.0349 0.0565 0.0921
## 7 0.0440 0.0708 0.1020 0.1479
## 8 0.0750 0.1101 0.1478 0.1994
## 9 0.1082 0.1492 0.1909 0.2454
## 10 0.1415 0.1865 0.2305 0.2863
## 20 0.3909 0.4391 0.4804 0.5269
## 60 0.7089 0.7346 0.7553 0.7775
## 100 0.8018 0.8190 0.8326 0.8470
Note! Critical values for double Grubbs are obtained from the outliers package for 4≤n≤30 and are estimated for 31≤n≤100 based on the approximation described in this paper.
The test is originally based on:
"Grubbs, F.E. (1950). Sample Criteria for testing outlying observations. Ann. Math. Stat. 21, 1, 27-58".
The Scheffé multiple comparison test is a re-implementation of the respective function published in the agricolae package. It is calculated on the result of stats::lm() using eCerto:::scheffe.test:
## function(y, trt, alpha = 0.05) {
## name.y <- paste(deparse(substitute(y)))
## name.t <- paste(deparse(substitute(trt)))
## A <- y$model
## DFerror <- stats::df.residual(y)
## MSerror <- stats::deviance(y) / DFerror
## y <- A[, 1]
## ipch <- pmatch(trt, names(A))
## nipch <- length(ipch)
## for (i in 1:nipch) {
## if (is.na(ipch[i])) {
## return(trt)
## }
## }
## name.t <- names(A)[ipch][1]
## trt <- A[, ipch]
## if (nipch > 1) {
## trt <- A[, ipch[1]]
## for (i in 2:nipch) {
## name.t <- paste(name.t, names(A)[ipch][i], sep = ":")
## trt <- paste(trt, A[, ipch[i]], sep = ":")
## }
## }
## name.y <- names(A)[1]
## df <- subset(data.frame("value" = y, "Lab" = trt), is.na(y) == FALSE)
## # JL
## means <- plyr::ldply(split(df$value, df$Lab), function(x) {
## data.frame(
## "mean" = mean(x, na.rm = T),
## "sd" = stats::sd(x, na.rm = T),
## "n" = sum(is.finite(x)),
## stringsAsFactors = FALSE
## )
## }, .id = "Lab")
## ntr <- nrow(means)
## Fprob <- stats::qf(1 - alpha, ntr - 1, DFerror)
## Tprob <- sqrt(Fprob * (ntr - 1))
## nr <- unique(means[, 4])
## scheffe <- Tprob * sqrt(2 * MSerror / nr)
## statistics <- data.frame(
## "MSerror" = MSerror,
## "Df" = DFerror,
## "F" = Fprob,
## "Mean" = mean(df[, 1]),
## "CV" = sqrt(MSerror) * 100 / mean(df[, 1])
## )
## if (length(nr) == 1) {
## statistics <- data.frame(statistics, "Scheffe" = Tprob, "CriticalDifference" = scheffe)
## }
## comb <- utils::combn(ntr, 2)
## nn <- ncol(comb)
## dif <- rep(0, nn)
## LCL <- dif
## UCL <- dif
## pval <- rep(0, nn)
## for (k in 1:nn) {
## i <- comb[1, k]
## j <- comb[2, k]
## dif[k] <- means[i, 2] - means[j, 2]
## sdtdif <- sqrt(MSerror * (1 / means[i, 4] + 1 / means[j, 4]))
## pval[k] <- round(1 - stats::pf(abs(dif[k])^2 / ((ntr - 1) * sdtdif^2), ntr - 1, DFerror), 4)
## LCL[k] <- dif[k] - Tprob * sdtdif
## UCL[k] <- dif[k] + Tprob * sdtdif
## }
## pmat <- matrix(1, ncol = ntr, nrow = ntr)
## k <- 0
## for (i in 1:(ntr - 1)) {
## for (j in (i + 1):ntr) {
## k <- k + 1
## pmat[i, j] <- pval[k]
## pmat[j, i] <- pval[k]
## }
## }
## groups <- orderPvalue(means, alpha, pmat)
## names(groups)[1] <- name.y
## parameters <- data.frame(test = "Scheffe", name.t = name.t, ntr = ntr, alpha = alpha)
## rownames(parameters) <- " "
## rownames(statistics) <- " "
## rownames(means) <- means[, 1]
## means <- means[, -1]
## output <- list(
## statistics = statistics,
## parameters = parameters,
## means = means,
## comparison = NULL,
## groups = groups
## )
## class(output) <- "group"
## return(output)
## }
The test is originally based on:
"Steel, R.; Torri, J.; Dickey, D. (1997) Principles and Procedures of Statistics: A Biometrical Approach. pp189"
Please prepare all data in a single Excel file on the first table (other tables will be ignored). Please stick to the column order as given in the example below.
Note! You may omit column H_type in your Excel input file if not needed.
The analyte column has to contain values that match the corresponding data from the certification module. H_type can be used to define different homogeneity traits for each analyte, which can yield independent uncertainty contributions. Flasche encodes the different samples on which repeated measurements have been performed. Each item in Flasche should occur several times per analyte, as these are your replicates to be considered in the statistics. value has to contain numeric values only to allow calculations. unit is neither checked nor converted but used for plot annotation only.
Note! value data obviously have to be determined in the same unit as the data from the certification trial to yield a meaningful uncertainty value.
The uncertainty contribution of the homogeneity measurements u_bb (uncertainty between bottles) is calculated for each analyte based on the variances obtained by an ANOVA and is transferred to a user-specified column in the material table of the certification module (Tab.C3).
Note! Values for u_bb can only be transferred to Tab.C3 if matching analyte names are present (analyte names are depicted in red if not found).
Statistical differences between groups are nicely visualized using boxplots. While the ANOVA provides a probability for the hypothesis that the mean values of the different groups are equal, the boxplot visualizes the distribution of values within each group.
The box comprises 50% of the measured values of a group (from the 25% to the 75% percentile) and highlights the median (50% percentile). Overlapping boxes indicate that a t-test would not yield a significant P-value.
Within Fig.H1 dashed black and grey lines indicate \(\mu\) and \(s\) of the distribution of specimen means from Tab.H2.
Especially when several analytes are tested in a material using a small number of replicates per specimen, the ANOVA might result in P-values below the alpha level. Observing such a P-value does not automatically render the material non-homogeneous. Due to multiple testing, such observations might be false-positive results. However, they require a more careful inspection of the data to identify a potential systematic bias. To this end, eCerto offers several options:
We first conduct an analysis of variance (ANOVA) on a one-factor linear model for \(N\) bottles (or containers) with \(n\) replicate measurements each. The ANOVA is performed for each analyte and H_type (if specified during upload) independently. The P-value of each ANOVA is provided in Tab.H1 together with the variance within bottles \(s_w\) (M_within) and the variance between bottles \(s_a\) (M_between).
A significant ANOVA P-value indicates non-homogeneous specimens and is highlighted in red color in Tab.H1. P-values are adjusted for multiple testing (Bonferroni correction) in case several analytes are under investigation per specimen. The adjustment can be switched off in the options panel next to Tab.H1.
Note! To account for the case of different number of replicates over all \(N\) bottles, \(n\) is calculated according to ISO GUIDE 35 as:
\[n=\frac{1}{N-1} \times \left[\sum_{i=1}^N{n_i}-\frac{\sum_{i=1}^N{n_i^2}}{\sum_{i=1}^N{n_i}}\right]\]
where \(n_{i}\) is the vector of replicates per group. Together with the overall mean value \(\mu\) (mean of all \(N\) bottle means), we can now compute two relative uncertainties between bottles:
\[s_{bb}=\frac{\sqrt{\frac{s_a-s_w}{n}}}{\mu}\]
and
\[s_{bb, min}=\frac{ \sqrt{ \frac{s_w}{n} } \times \sqrt[4]{ \frac{2}{N \times (n-1)} }}{\mu}\]
Note! When \(s_a < s_w\) we set \(s_{bb}=0\).
The larger of both values, \(s_{bb}\) and \(s_{bb,min}\) (rendered in bold font), is selected as uncertainty contribution when the user decides to transfer an uncertainty value to the material table.
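A minimal R sketch of this calculation, assuming a data frame with the columns value and Flasche as described for the Excel import (this is an illustration, not the exact eCerto code):
# One-way ANOVA per analyte and the derived relative uncertainties s_bb and s_bb_min.
h_uncertainty <- function(df) {
  df$Flasche <- factor(df$Flasche)
  aov_tab <- stats::anova(stats::lm(value ~ Flasche, data = df))
  s_a <- aov_tab["Flasche", "Mean Sq"]      # M_between
  s_w <- aov_tab["Residuals", "Mean Sq"]    # M_within
  n_i <- as.numeric(table(df$Flasche))      # replicates per bottle
  N <- length(n_i)
  n <- (sum(n_i) - sum(n_i^2) / sum(n_i)) / (N - 1)  # effective n (ISO GUIDE 35)
  mu <- mean(tapply(df$value, df$Flasche, mean))     # mean of all bottle means
  s_bb <- if (s_a > s_w) sqrt((s_a - s_w) / n) / mu else 0
  s_bb_min <- sqrt(s_w / n) * (2 / (N * (n - 1)))^0.25 / mu
  c(P = aov_tab["Flasche", "Pr(>F)"], s_bb = s_bb, s_bb_min = s_bb_min)
}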
A transfer is only possible for analytes which have been found in the table of certified values (Tab.C3 in the certification module). Analytes not present there will be rendered in red.
Figures and tables can currently be exported in HTML format.
Note! Report layout and format are currently under debate and might change in the future depending on user demand.
Stability data can be uploaded from an Excel file containing a single table including the columns analyte, Date, Value and unit. An additional column Temp can be included to allow shelf-life calculations based on the Arrhenius model.
The Temp column should contain the temperature levels in °C at which samples were stored before measurement. The lowest temperature level will be used as a reference point \(T_\mathit{ctrl}\). This is also reflected in the specified Date values, where \(T_\mathit{ctrl}\) will be treated as a reference time point. All other temperature levels should have later dates specified. For every measurement, the difference between its date and the reference time point will be computed to represent the storage time at this temperature.
Using dates instead of specifying storage time directly allows computing stability and potential shelf life in both relative (months) and absolute (date) terms.
Note! This option (simple Excel layout) might be discarded in the future. If you are new to eCerto please use Option 1.
Prepare data in a single Excel file, using separate tables for each analyte as shown in the below example.
Table names will be used as analyte names and should match the analyte names from the Certification module. Column names need to be exactly Value and Date, with Excel column formats set to numeric and date, respectively.
Upload can also be achieved from a previously created backup file. If a backup file does not contain any stability data, the Excel upload option will remain active (and will become deactivated if stability data are contained).
The uncertainty contribution of the stability measurements u_stab is calculated for each analyte based on a linear model of Value on Date, or, more precisely, on the monthly difference mon_diff calculated from the imported Date values.
The data and the linear model are visualized in Fig.S1. Here, in addition to the monthly difference plotted on the bottom axis, start and end date of the data points are depicted at the top of the plot. The red line indicates the certified value \(\mu_c\) and the two green dashed lines provide either twice the standard deviation \(2s\) of the data or \(U_{abs}\) from the table of certified values, based on availability and user choice.
Note! If no table of certified values is present or the analyte is yet unconfirmed, then the red and green lines represent the mean and standard deviation of the currently depicted stability data.
The dashed blue line indicates the linear model. Its parameters are transferred to Tab.S1 and are used to calculate u_stab.
The parameters of all linear models are collected in Tab.S1 and the potential uncertainty contribution of the material stability is obtained from the formula \(u_{stab}={|t_{cert} \times s(b_1)| \over \mu_s}\) where \(t_{cert}\) is the expected shelf life of the CRM (in months) and \(s(b_1)\) is the standard error SE of the slope of the linear model. \(\mu_s\) is calculated as the mean of all values of an analyte depicted in Fig.S1, i.e. all values included in the linear model calculation.
The expected shelf life can be set by the user and should incorporate the time until the first certification monitoring and the certified shelf life of the material. This estimation of stability uncertainty is based on section 8.7.3 of ISO GUIDE 35:2017 and valid in the absence of a significant trend.
To determine whether the slope \(b_1\) is significantly different from \(b_1=0\), we perform a t-test by calculating the t-statistic \(t_m = |b_1| / s(b_1)\) and compare the result with the two-tailed critical value of Student's \(t\) for \(n - 2\) degrees of freedom to obtain the P-values in column P.
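A minimal R sketch of this calculation, assuming vectors Value and Date as in the import format and an expected shelf life t_cert in months (the month conversion shown is an assumption, not necessarily eCerto's exact implementation):
# u_stab from a linear model of Value over storage time in months.
u_stab_sketch <- function(Value, Date, t_cert = 60) {
  mon_diff <- as.numeric(difftime(Date, min(Date), units = "days")) / 30.4375
  fit <- summary(stats::lm(Value ~ mon_diff))
  b1 <- fit$coefficients["mon_diff", "Estimate"]
  s_b1 <- fit$coefficients["mon_diff", "Std. Error"]
  # two-sided t-test of the slope against zero (n - 2 degrees of freedom)
  P <- 2 * stats::pt(abs(b1) / s_b1, df = length(Value) - 2, lower.tail = FALSE)
  mu_s <- mean(Value)                       # mean of all depicted values
  c(u_stab = abs(t_cert * s_b1) / mu_s, P = P)
}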
Note! Clicking on a table row will display the analysis for the analyte specified in this row.
Values from column u_stab can be transferred to a user-defined column of the material table Tab.C3 in the certification module for matching analyte names. Analytes of Tab.S1 which cannot be matched to Tab.C3 are depicted in red.
The Arrhenius approach is one way to estimate the stability of a reference material from the perspective of storage time or shelf life. As implemented in eCerto, the estimation is based on time series data of analytes at different temperature levels \(T_i\).
\(T_{min}\) is used as a reference point. For each \(T_i\) a linear model is calculated, potentially yielding a slope \(k_{\mathit{eff}}(T)\) indicative of analyte degradation. Combining all \(k_{\mathit{eff}}\) allows estimating the dependency of analyte stability on temperature and, consequently, the expected storage time at a given \(T\).
The calculations are performed according to the recommendations for accelerated stability studies in ISO GUIDE 35:2017.
The analyte values measured at \(T_{min}\) are considered stable and their mean is used as the reference point \(t=0\) for each temperature level \(T_i\). When measured values are depicted relative to this mean, eCerto calculates and annotates the recovery and RSD for each \(T_i\).
When the above ratios are log-transformed and the x-axis (time) is set to month, eCerto calculates the temperature-dependent reaction rates \(k_{\mathit{eff}}(T)\) as the slope for each linear model of \(T_i\).
Besides Recovery and RSD for each temperature level \(T_i\), the slope k_eff and the log-transformation of its negative value, log(-k_eff), are provided in Tab.S2.
In case at least 3 finite values for \(\log(-k_{\mathit{eff}})\) exist, a linear model over 1/K is calculated (depicted in Fig.S3). Based on the slope \(m\) and intercept \(n\) of this model, values \(K_i\) (provided in log(k)_calc) are established using \(K_i = m \times x_i + n\).
To estimate the confidence interval of the model (CI_upper and CI_lower) we need to estimate some intermediate values provided in the bottom table. Here, eCerto calculates \(a=\sum{x_i}\), \(b=\sum{x_i^2}\) and \(n=length(x_i)\) where \(x_i=1/K\), together with the standard error \(err\) of this model:
\[err=\sqrt{\frac{1}{n-2} \times \left[\sum{(y_i-\overline{y})^2} - \frac{\left(\sum{(x_i-\overline{x}) \times (y_i-\overline{y})}\right)^2}{\sum{(x_i-\overline{x})^2}}\right]}\]
Next, these four values \(a\), \(b\), \(n\) and \(err\) allow computing the dependent variables:
\[u(i) = \sqrt{\frac{err^2 \times b}{(b \times n-a^2)}}\]
\[u(s) = \sqrt{\frac{err^2 \times n}{(b \times n-a^2)}}\]
\[cov = -1 \times \frac{err^2 \times a}{(b \times n-a^2)}\]
which, finally, can be used to estimate the confidence interval \(\mathit{CI}\) as:
\[\mathit{CI}_{(upper,~lower)} = K_i \pm \sqrt{ u(i)^2 + u(s)^2 \times x_i^2 + 2x_i \times cov }\]
When a certified value \(\mu_c\) and corresponding uncertainty \(U_{abs}\) (cert_val and U_abs) are available for an analyte in Tab.C3 of the certification module of eCerto, these can be used together with \(\mathit{CI}_{upper}\) to calculate the storage time \(S_{Month}\) for each evaluated temperature level using:
\[S_{\mathit{Month}}(T) = \frac{ \log( \frac{\mu_c - U_{abs}}{\mu_c})}{e^{\mathit{CI}_{upper}(T)} }\]
Note! Extrapolation beyond the range of storage conditions tested (for example, predicting degradation rates at −20°C from an experiment involving only temperatures above 0°C) can be unreliable and is not recommended.
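The following R sketch summarizes these steps; variable names are illustrative and all \(k_{\mathit{eff}}\) are assumed to be negative (degrading analyte):
# Arrhenius model, confidence interval and storage time estimate (sketch).
arrhenius_sketch <- function(k_eff, K, mu_c, U_abs) {
  y <- log(-k_eff)                          # log(-k_eff), requires k_eff < 0
  x <- 1 / K                                # inverse temperature in 1/K
  fit <- stats::lm(y ~ x)
  y_calc <- stats::predict(fit)             # log(k)_calc
  a <- sum(x); b <- sum(x^2); n <- length(x)
  err <- summary(fit)$sigma                 # standard error of the model
  u_i <- sqrt(err^2 * b / (b * n - a^2))    # u(i)
  u_s <- sqrt(err^2 * n / (b * n - a^2))    # u(s)
  cv <- -1 * err^2 * a / (b * n - a^2)      # cov
  CI_upper <- y_calc + sqrt(u_i^2 + u_s^2 * x^2 + 2 * x * cv)
  # expected storage time in months, using the formula given above
  S_Month <- log((mu_c - U_abs) / mu_c) / exp(CI_upper)
  data.frame(K = K, log_k_calc = y_calc, CI_upper = CI_upper, S_Month = S_Month)
}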
The model parameters calculated in Tab.S2 are visualized in Fig.S3 plotting the log transformed negative \(k_{\mathit{eff}}(T)\) over the inverse temperature [1/K].
The axis at the top of Fig.S3 indicates the corresponding temperature level for each data point in °C.
The table of certified values (Tab.C3) combines all data collected within a certification trial to provide, for each measured entity within the certified material, the certified mean \(\mu_c\) (cert_val) and the certified uncertainty U_abs.
Note! Certified values will be transferred to this table only for analytes which have been visually inspected by the user, to ensure that inspection took place. The reported mean is similar to the value obtained in Fig.C1 but will differ from Tab.C2 when labs are filtered for an analyte.
Several columns of the material table can be edited by the user by clicking in an individual cell and entering a new value.
Table column descriptions:
mean and sd give the arithmetic mean \(\mu\) and standard deviation \(s\) for n analytical values of this analyte, similar to Fig.C1.
n depends on pooling and gives the number of values used for calculating \(\mu\) and \(s\). To this end, n is either equal to the number of finite measurement values (pooling allowed) or to the number of labs included in the analysis (pooling not allowed). For example, for an analyte measured by 3 labs with 5 replicates each, n=3 when pooling is not allowed. If a laboratory is filtered, it will be grayed-out in the plot and not considered for \(\mu\) (n=2). If the user decides to allow pooling, then n will reflect the number of finite measurement values from all included labs for this analyte (n = 2 x 5 = 10).
F columns can be added/edited and may contain correction factors \(f_i\) to adjust \(\mu\) and obtain the certified mean cert_val by \(\mu_c = \mu \times \prod f_i\).
u columns can be added and may contain relative uncertainty contributions (e.g. from homogeneity or stability modules).
Note! u columns containing different uncertainty contributions should always be provided relative to the mean to allow an easy combination of the individual terms. Only U_abs is expressed as an absolute value.
u_char describes the characteristic uncertainty, calculated as \(u_{char}=\frac{s}{\sqrt{n} \times mean}\).
u_com is the combined uncertainty from all u columns, where all relative uncertainty components \(u_i\) are combined using the formula \(u_{com}=\sqrt{\sum{u_i^2}}\).
k is an additional expansion factor for the uncertainty, \(U=u_{com} \times k\).
U_abs specifies the absolute uncertainty and is calculated as \(U_{abs}=U \times \mu_c\).
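As a numerical illustration of how these columns interact (all values are hypothetical):
# Hypothetical example of combining the uncertainty columns of Tab.C3.
m <- 0.985; s <- 0.021; n <- 12                 # mean, sd and n of an analyte
u_char <- s / (sqrt(n) * m)                     # characteristic (relative) uncertainty
u_hom <- 0.008; u_stab <- 0.005                 # transferred relative contributions
u_com <- sqrt(sum(c(u_char, u_hom, u_stab)^2))  # combined relative uncertainty
k <- 2                                          # expansion factor
F1 <- 1                                         # correction factor (neutral default)
cert_val <- m * prod(F1)                        # certified value mu_c
U_abs <- u_com * k * cert_val                   # absolute expanded uncertainty
round(c(u_char = u_char, u_com = u_com, cert_val = cert_val, U_abs = U_abs), 4)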
Note! Numeric values of \(\mu_c\) and \(U_{abs}\) are rounded to the number of digits specified by the user at the analyte selection panel (top, Precision (Cert. Val.)). A precision according to DIN 1333 based on \(U_{abs}\) is suggested. However, all relative u columns are depicted with a fixed 4-digit precision.
Tab.C3 can be exported within a report using the download section on top of the page.
F and U columns in Tab.C3 can be added, removed or renamed using the drop-down menu next to the table. To add a new U column, select the option <new U>; to rename an existing column, modify the column name in the text box. To delete a column, simply delete its column name. All modifications take place only after the user confirms by pushing the respective button within the drop-down menu.
Note! eCerto will respect HTML formatting, i.e. if you want to use subscript in your column name you can use u<sub>hom</sub> as input to create a column with the displayed name \(u_{hom}\).
Default values are 1 and 0 for F and U columns respectively, as these are neutral with respect to the result. In F and U columns all values can be edited manually by double-clicking the respective cell, entering a new value and clicking outside the cell to confirm the modification.
More conveniently, U columns can also be filled automatically with data from the homogeneity and stability modules of eCerto.
Note! Manual modifications will be overwritten by any automatic transfer into the same column.
The only other column which can be manually edited is k, the expansion factor for u_com. All other columns cannot be modified by the user and will only show a grey background upon double click. This is on purpose, to ensure a robust workflow and comprehensible statistical results. It requires a careful preparation of input data files, e.g. providing reasonable and consistent analyte names and units.
To check whether the certified value of a CRM \(\mu_c\) remains stable, a post-certification measurement can be performed and evaluated according to ISO GUIDE 35:2017 (8.10.3.2) by calculating a stability criterion \(\mathit{SK}\) using the formula:
\[\mathit{SK} = {|\mu_c - \mu_m| \over \sqrt{u_c^2 + u_m^2}}\]
where \(\mu_m\) is the mean and \(u_m = \sigma / \sqrt{N}\) is the standard uncertainty of post certification measurements performed on a CRM specimen, and where \(u_c = u_{com} \times \mu_c\) is the absolute certified uncertainty without expansion.
If \(\mathit{SK} \le k\) is fulfilled, this indicates that the material is sufficiently stable.
Note! \(k\) is set based on the expansion factor associated with the analyte given in Tab.C3. It is recommended to keep \(k=2\) as this will ensure a 95% confidence for \(\mathit{SK}\).
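A minimal R sketch of this check (variable names are illustrative):
# Post-certification stability check according to ISO GUIDE 35:2017 (8.10.3.2).
SK_check <- function(x_pcm, mu_c, u_com, k = 2) {
  mu_m <- mean(x_pcm)                             # mean of the control measurements
  u_m <- stats::sd(x_pcm) / sqrt(length(x_pcm))   # their standard uncertainty
  u_c <- u_com * mu_c                             # absolute certified uncertainty (unexpanded)
  SK <- abs(mu_c - mu_m) / sqrt(u_c^2 + u_m^2)
  c(SK = SK, stable = SK <= k)
}
SK_check(x_pcm = c(0.97, 1.01, 0.99), mu_c = 1.00, u_com = 0.02)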
eCerto can be used to monitor the long-term stability of a material. To this end, two approaches were implemented. A quick post-certification check for stability can be performed in the Certification module next to Tab.C3. There, calculation of \(\mathit{SK}\) allows assessing whether analyte values within a material can still be considered stable at the measurement time of the control samples.
However, to estimate long term stability prospectively, measurement data can be continuously obtained and stored using the LTS module of eCerto as described below.
Please prepare the data in a single Excel file on separate tables for each measured value (KW, Kennwert) as shown in the example below. Table and file names are not evaluated. Instead, each table is expected to contain the metadata in rows 1-2 and the measurement data from row 4 onwards. Column names need to be exactly as in the example and should match the desired format (e.g. Value and Date columns should be of numeric and date format, respectively).
The File column can be used to reference the data source for traceability of measurement values. The Comment column can be used to provide additional information.
The metadata information cannot be edited further in eCerto. Hence, the user should carefully consider the provided data. U_Def, for example, should specify which uncertainty value was originally certified and is provided in U. This can be one out of five potential options:
1s or 2s indicating the simple or double variance
CI indicating a confidence interval
1sx or 2sx indicating the relative simple or double variance
Other values will not be recognized correctly. Most metadata values will be used for plot annotations and in the LTS report only and can be omitted. However, to comply with the rules for good scientific practice it is recommended to provide them.
The tool calculates the expected life time of a reference material in several steps.
The calculation results are depicted in Fig.L1 and can be exported as a report in PDF format.
Note! The \(U\) defining the interval around \(\mu_c\), which we expect the property values to remain in within the RM life time, is taken from the data read upon initial Excel import. The user should be careful regarding the value specified here to avoid overestimating the life time. LTS monitoring, which is usually performed within the same lab, will cover mostly the uncertainty due to stability of a material property. The uncertainty defined in the original certificate will cover additional uncertainty contributions (i.e. from the collaborative trial). Hence, it might be adequate to use only a fraction of the certified \(U\) value to define the interval.
Note! The parameters of a linear model, i.e. \(b_1\), can only be determined with some uncertainty. While the current report layout calculates the life time based on \(b_1\) as described above, a more conservative estimate would be to use the confidence interval, \(CI_{95}(b_1)\), instead. Calculation based on \(CI_{95}(b_1)\) is shown in Fig.L1 by default.
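A minimal R sketch of the underlying idea, assuming the trend starts at \(\mu_c\) and \(U\) defines the allowed interval; it reports the estimate based on \(b_1\) as well as the more conservative one based on \(CI_{95}(b_1)\) (names are illustrative, not the report code):
# Estimate the month at which the linear trend leaves the interval mu_c +/- U.
lts_lifetime <- function(Value, mon, U) {
  fit <- stats::lm(Value ~ mon)
  b1 <- stats::coef(fit)[["mon"]]
  ci <- stats::confint(fit)["mon", ]       # 95% confidence interval of the slope
  c(LT_b1 = abs(U / b1), LT_CI95 = abs(U / max(abs(ci))))
}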
After data upload, the user can add new data points for a selected property by manually editing the 4 fields in the input area and clicking the button Add data.
Comments to a data point are optional. To prevent non-deliberate modification of measurement data, only comments can be changed by the user after data upload. Comment modification can be conveniently performed by selecting a data point, either in Tab.L1 or in Fig.L1, entering a new text in the comment input field and confirming the modification by clicking the button Add comment.
The user can export a PDF report containing all imported, edited and calculated data and figures.
Note! The Validation module is an add-on to eCerto. It is not required to compute CRM statistics. It is meant to quickly validate the working range and linearity of a newly developed analytical method. As it is currently work in progress, available options and output format might change in the future.
eCerto implements statistical procedures for analytical method validation according to DIN 32645:2008-11.
The method performance characteristics reported by eCerto comprise working range, linearity, limit of detection (LOD), limit of quantification (LOQ) ...
The statistical tests implemented in eCerto according to DIN 32645:2008-11 comprise outlier tests (Grubbs), trend tests (Neumann), tests for homogeneity of variance (F-Test), ...
The implemented formulas are provided or referenced in the respective help sections.
To compute the relevant performance characteristics of an analytical method, eCerto will evaluate replicate measurements of a dilution series of samples containing any number of target analytes and their internal standards.
For advice on concentration range, replicate number and calibration levels tested, we refer to DIN 32645:2008-11
Please prepare data in a single Excel file using the Agilent MassHunter Software.
The information in the columns Name and Type is kept for later reference only. The information on analyte names is extracted from the red area. The information on calibration levels (similar for all analytes) and the corresponding analyte concentrations within each level (analyte specific) is used as the independent variable \(x\) in all figures and tables. The information on the peak area of each analyte and its respective internal standard (IS) is used as the dependent variable \(y=f(x)\) in all figures and tables.
Note! Empty cells in the columns Level and Exp. Conc. will be filled with the closest finite value above, i.e. it is assumed that all samples in rows 4 to 12 in the example are replicates of calibration level 1 (as defined in row 3).
Alternatively you can set up your data similar to the following layout.
eCerto will try to determine the used format upon upload automatically.
The working range of an analytical method is tested using the relative analyte values \(x_r\) of the smallest and the largest calibration level, \(j_1\) and \(j_N\).
Relative analyte values are computed as \(x_r = \frac {x_{i,j}} {\overline{x}_j}\) with \(x_{i,j}\) being the peak area ratios of analyte and internal standard \(x_{i,j}=\frac{A_\text{Analyte}}{A_\text{IS}}\) for each replicate \(i\) at calibration level \(j\).
For each \(x_j\) in total \(n\) replicate values exist and are tested:
The two calibration levels \(j_1\) and \(j_N\) are tested for homogeneity of variance using an F-Test.
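A minimal R sketch of this homogeneity-of-variance test (data are illustrative):
# F-test comparing the variances at the lowest and highest calibration level.
x_r1 <- c(0.98, 1.03, 1.01, 0.99, 1.00)   # relative analyte values at level j_1
x_rN <- c(0.96, 1.05, 1.02, 0.97, 1.01)   # relative analyte values at level j_N
stats::var.test(x_r1, x_rN)               # P > 0.05 suggests homogeneous variances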
The linearity of an analytical method is tested using the analyte values \(\overline{x}_j\).
Analyte values are computed as the peak area ratios of analyte and internal standard \(x_{i,j}=\frac{A_{Analyte}}{A_{IS}}\) for each replicate \(i\) at calibration level \(j\). \(\overline{x}_j\) is the mean of the \(n\) replicates at level \(j\). The total number of calibration levels is denoted as \(N\).
For each analyte, a linear model \(y=b_0+b_1 \times x\) over all \(\overline{x}_j\) is computed and the following parameters are reported:
Additionally, the residuals \(e\) of the linear model are tested:
For comparison the data is fitted using a quadratic model \(y=b_0+b_1 \times x+b_2 \times x^2\). The residuals from both models, the linear and the quadratic one, are compared using a Mandel-Test calculating \(P_{Mandel}\).
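A minimal R sketch of such a Mandel test (data are illustrative; the two fits are compared via the reduction in residual variance):
# Mandel test comparing a linear and a quadratic calibration model.
x <- 1:8                                          # calibration levels (concentration)
y <- c(0.21, 0.39, 0.58, 0.81, 1.02, 1.18, 1.41, 1.62)  # mean area ratios per level
fit1 <- stats::lm(y ~ x)                          # linear model
fit2 <- stats::lm(y ~ x + I(x^2))                 # quadratic model
N <- length(x)
DS2 <- (N - 2) * summary(fit1)$sigma^2 - (N - 3) * summary(fit2)$sigma^2
TS <- DS2 / summary(fit2)$sigma^2                 # Mandel test statistic
P_Mandel <- stats::pf(TS, 1, N - 3, lower.tail = FALSE)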
This is the collection of abbreviations and formulas used in the method validation module of eCerto. While the calculations generally follow the recommendations given in DIN 32645:2008-11, we hope that this redundancy may serve as a quick reference when using eCerto.
\(N_j\) total number of calibration levels with \(j\) denoting the j-th level
\(n_i\) (minimal) number of replicates within a calibration level with \(i\) denoting the i-th replicate
\(x_{i,j}\) denoting the peak area ratio of analyte and internal standard (IS) in replicate \(i\) of level \(j\) calculated as \(x_{i,j}=\frac{A_\text {Analyte}}{A_\text {IS}}\)
\(x_r\) denoting the relative analyte level \(x_r = \frac {x_{i,j}} {\overline{x}_j}\)
\(b_0, b_1, (b_2)\) denoting the coefficients (intercept, slope, ...) of a linear (quadratic) model fitting the data
\(e\) denoting the residuals (error) of a model
\(k\) denoting the result uncertainty specified by the user
\(t_{f,\alpha}\) denoting the \(t\) distribution with \(f\) degrees of freedom and probability \(\alpha\)
\(s_{x,y}\) denoting the standard error of estimate of a linear model of \(N\) levels with residuals \(e\) calculated as \(s_{x,y}=\sqrt{\frac{\sum e^2}{N-2}}\)
\(\text {LOD}\) limit of detection of a linear model of \(N\) levels with \(n\) replicates having slope \(b_1\) and residuals \(e\) calculated as \[\text {LOD} = k \times \frac {s_{x,y}} {b_1} \times t_{f,\alpha} \times \sqrt {{1 \over n} + {1 \over N} + \frac {\overline{x}^2} {\sum (x-\overline{x})^2}}\] where \(f = N-2\) and \(\alpha\) is specified by the user
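A minimal R sketch of this LOD calculation (data are illustrative; k and \(\alpha\) as chosen by the user; a one-sided t quantile is assumed here):
# LOD of a linear calibration according to the formula above.
lod_sketch <- function(x, y, n, k = 3, alpha = 0.01) {
  N <- length(x)                                        # number of calibration levels
  fit <- stats::lm(y ~ x)
  b1 <- stats::coef(fit)[["x"]]
  s_xy <- sqrt(sum(stats::residuals(fit)^2) / (N - 2))  # standard error of estimate
  t_fa <- stats::qt(1 - alpha, df = N - 2)
  k * s_xy / b1 * t_fa * sqrt(1 / n + 1 / N + mean(x)^2 / sum((x - mean(x))^2))
}
lod_sketch(x = 1:8, y = c(0.21, 0.39, 0.58, 0.81, 1.02, 1.18, 1.41, 1.62), n = 3)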
... (more to come)
The user can export a PDF report containing all imported, edited and calculated data and figures.