From 9411139beee167c5339e96db448e5dbed19e06bc Mon Sep 17 00:00:00 2001 From: Johannes Ranke Date: Thu, 5 Jul 2018 18:44:22 +0200 Subject: Maintenance in preparation of improvements - Switch vignette to html - Switch tests to testthat - NEWS.md instead of ChangeLog - Remove names of y in lists returned by lod and loq --- ChangeLog | 11 - DESCRIPTION | 14 +- GNUmakefile | 16 +- NEWS.md | 18 ++ R/lod.R | 1 + R/loq.R | 1 + tests/din32645.R | 7 - tests/din32645.Rout.save | 48 ----- tests/massart97.R | 25 --- tests/massart97.Rout.save | 128 ------------ tests/testthat.R | 4 + tests/testthat/test_din32645.R | 33 +++ tests/testthat/test_inverse.predict.R | 43 ++++ tests/testthat/test_lod_loq.R | 28 +++ tests/testthat/test_massart.R | 40 ++++ vignettes/chemCal.Rmd | 134 +++++++++++++ vignettes/chemCal.Rnw | 130 ------------ vignettes/chemCal.html | 355 +++++++++++++++++++++++++++++++++ vignettes/chemCal.pdf | Bin 165600 -> 0 bytes vignettes/figure/unnamed-chunk-1-1.pdf | Bin 0 -> 12114 bytes vignettes/figure/unnamed-chunk-2-1.pdf | Bin 0 -> 5027 bytes vignettes/refs.bib | 7 + 22 files changed, 678 insertions(+), 365 deletions(-) delete mode 100644 ChangeLog create mode 100644 NEWS.md delete mode 100644 tests/din32645.R delete mode 100644 tests/din32645.Rout.save delete mode 100644 tests/massart97.R delete mode 100644 tests/massart97.Rout.save create mode 100644 tests/testthat.R create mode 100644 tests/testthat/test_din32645.R create mode 100644 tests/testthat/test_inverse.predict.R create mode 100644 tests/testthat/test_lod_loq.R create mode 100644 tests/testthat/test_massart.R create mode 100644 vignettes/chemCal.Rmd delete mode 100644 vignettes/chemCal.Rnw create mode 100644 vignettes/chemCal.html delete mode 100644 vignettes/chemCal.pdf create mode 100644 vignettes/figure/unnamed-chunk-1-1.pdf create mode 100644 vignettes/figure/unnamed-chunk-2-1.pdf create mode 100644 vignettes/refs.bib diff --git a/ChangeLog b/ChangeLog deleted file mode 100644 index c258c26..0000000 --- a/ChangeLog +++ /dev/null @@ -1,11 +0,0 @@ -2014-04-24 Johannes Ranke for chemCal (0.1-33) - - * Bugfix in lod() and loq(): In the case of small absolute x values (e.g. on - the order of 1e-4 and smaller), the lod or loq calculated using the default - method could produce inaccurate results as the default tolerance that was - used in the internal call to optimize is inappropriate in such cases. Now a - reasonable default is used which can be overriden by the user. Thanks to - Jérôme Ambroise for reporting the bug. - -Changes performed in earlier versions are documented in the subversion log -files found at http://kriemhild.uft.uni-bremen.de/viewcvs/?root=chemCal diff --git a/DESCRIPTION b/DESCRIPTION index 0664ad3..7610ff0 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,11 +1,11 @@ Package: chemCal -Version: 0.1-37.9001 -Date: 2017-03-15 +Version: 0.2.1 +Date: 2018-07-05 Title: Calibration Functions for Analytical Chemistry Authors@R: c(person("Johannes", "Ranke", role = c("aut", "cre", "cph"), email = "jranke@uni-bremen.de", comment = c(ORCID = "0000-0003-4371-6538"))) -Suggests: MASS +Suggests: MASS, knitr, testthat Description: Simple functions for plotting linear calibration functions and estimating standard errors for measurements according to the Handbook of Chemometrics and Qualimetrics: Part A @@ -14,6 +14,10 @@ Description: Simple functions for plotting linear The functions work on model objects from - optionally weighted - linear regression (lm) or robust linear regression ('rlm' from the 'MASS' package). License: GPL (>= 2) -URL: http://www.r-project.org, - http://chem.uft.uni-bremen.de/ranke/?page=chemCal, http://cgit.jrwb.de/chemCal +LazyLoad: yes +LazyData: yes +VignetteBuilder: knitr +Encoding: UTF-8 +BugReports: http://github.com/jranke/chemCal/issues +URL: https://pkgdown.jrwb.de/chemCal, https://cgit.jrwb.de/chemCal/about RoxygenNote: 6.0.1 diff --git a/GNUmakefile b/GNUmakefile index 730c460..3f16fbc 100644 --- a/GNUmakefile +++ b/GNUmakefile @@ -15,25 +15,16 @@ README.html: README.md build: "$(RBIN)/R" CMD build . -build-no-vignettes: - "$(RBIN)/R" CMD build --no-build-vignettes . - install: build "$(RBIN)/R" CMD INSTALL $(TGZ) -install-no-vignettes: build-no-vignettes - "$(RBIN)/R" CMD INSTALL $(TGZ) - check: build "$(RBIN)/R" CMD check --as-cran $(TGZ) -check-no-vignettes: build-no-vignettes - "$(RBIN)/R" CMD check --as-cran $(TGZ) - -vignettes/%.pdf: vignettes/%.Rnw +vignettes/%.html: vignettes/%.Rmd "$(RBIN)/Rscript" -e "tools::buildVignette(file = 'vignettes/$*.Rnw', dir = 'vignettes')" -vignettes: vignettes/chemCal.pdf +vignettes: vignettes/chemCal.html pd: "$(RBIN)/Rscript" -e "pkgdown::build_site()" @@ -46,3 +37,6 @@ winbuilder: build curl -T $(TGZ) ftp://anonymous@win-builder.r-project.org/R-release/ @echo "Uploading to R-devel on win-builder" curl -T $(TGZ) ftp://anonymous@win-builder.r-project.org/R-devel/ + +test: install + NOT_CRAN=true "$(RBIN)/Rscript" -e 'devtools::test()' diff --git a/NEWS.md b/NEWS.md new file mode 100644 index 0000000..2656a25 --- /dev/null +++ b/NEWS.md @@ -0,0 +1,18 @@ +# chemCal 0.2.1 (2018-07-06) + +- 'lod' and 'loq': In the lists that are returned, return the list component 'y' without names, because we always have a single element in 'y' and the name '1' is meaningless + +- Use testthat for tests to simplify further development + +- Convert vignette to html + +# chemCal 0.1-33 (2014-04-24) + +- Bugfix in lod() and loq(): In the case of small absolute x values (e.g. on + the order of 1e-4 and smaller), the lod or loq calculated using the default + method could produce inaccurate results as the default tolerance that was + used in the internal call to optimize is inappropriate in such cases. Now a + reasonable default is used which can be overriden by the user. Thanks to + Jérôme Ambroise for reporting the bug. + +For a detailed list of changes to the chemCal source please consult the commit history on https://cgit.jrwb.de/chemCal diff --git a/R/lod.R b/R/lod.R index 5b74418..bccdf3e 100644 --- a/R/lod.R +++ b/R/lod.R @@ -48,6 +48,7 @@ lod.lm <- function(object, ..., alpha = 0.05, beta = 0.05, method = "default", t newdata <- data.frame(x = lod.x) names(newdata) <- xname lod.y <- predict(object, newdata) + names(lod.y) <- NULL } lod <- list(lod.x, lod.y) names(lod) <- c(xname, yname) diff --git a/R/loq.R b/R/loq.R index f832265..26d6a53 100644 --- a/R/loq.R +++ b/R/loq.R @@ -35,6 +35,7 @@ loq.lm <- function(object, ..., alpha = 0.05, k = 3, n = 1, w.loq = "auto", newdata <- data.frame(x = loq.x) names(newdata) <- xname loq.y <- predict(object, newdata) + names(loq.y) <- NULL loq <- list(loq.x, loq.y) names(loq) <- c(xname, yname) return(loq) diff --git a/tests/din32645.R b/tests/din32645.R deleted file mode 100644 index e5ffed7..0000000 --- a/tests/din32645.R +++ /dev/null @@ -1,7 +0,0 @@ -require(chemCal) -data(din32645) -m <- lm(y ~ x, data = din32645) -inverse.predict(m, 3500, alpha = 0.01) -lod <- lod(m, alpha = 0.01, beta = 0.5) -lod(m, alpha = 0.01, beta = 0.01) -loq <- loq(m, alpha = 0.01) diff --git a/tests/din32645.Rout.save b/tests/din32645.Rout.save deleted file mode 100644 index 7c9e55d..0000000 --- a/tests/din32645.Rout.save +++ /dev/null @@ -1,48 +0,0 @@ - -R version 3.1.0 (2014-04-10) -- "Spring Dance" -Copyright (C) 2014 The R Foundation for Statistical Computing -Platform: x86_64-pc-linux-gnu (64-bit) - -R is free software and comes with ABSOLUTELY NO WARRANTY. -You are welcome to redistribute it under certain conditions. -Type 'license()' or 'licence()' for distribution details. - -R is a collaborative project with many contributors. -Type 'contributors()' for more information and -'citation()' on how to cite R or R packages in publications. - -Type 'demo()' for some demos, 'help()' for on-line help, or -'help.start()' for an HTML browser interface to help. -Type 'q()' to quit R. - -> require(chemCal) -Loading required package: chemCal -> data(din32645) -> m <- lm(y ~ x, data = din32645) -> inverse.predict(m, 3500, alpha = 0.01) -$Prediction -[1] 0.1054792 - -$`Standard Error` -[1] 0.02215619 - -$Confidence -[1] 0.07434261 - -$`Confidence Limits` -[1] 0.03113656 0.17982178 - -> lod <- lod(m, alpha = 0.01, beta = 0.5) -> lod(m, alpha = 0.01, beta = 0.01) -$x -[1] 0.132909 - -$y - 1 -3765.025 - -> loq <- loq(m, alpha = 0.01) -> -> proc.time() - user system elapsed - 0.472 0.302 0.354 diff --git a/tests/massart97.R b/tests/massart97.R deleted file mode 100644 index 00f837f..0000000 --- a/tests/massart97.R +++ /dev/null @@ -1,25 +0,0 @@ -require(chemCal) -data(massart97ex1) -m <- lm(y ~ x, data = massart97ex1) -inverse.predict(m, 15) # 6.1 +- 4.9 -inverse.predict(m, 90) # 43.9 +- 4.9 -inverse.predict(m, rep(90,5)) # 43.9 +- 3.2 - -data(massart97ex3) -attach(massart97ex3) -yx <- split(y, x) -ybar <- sapply(yx, mean) -s <- round(sapply(yx, sd), digits = 2) -w <- round(1 / (s^2), digits = 3) -weights <- w[factor(x)] -m <- lm(y ~ x, w = weights) -#calplot(m) - -inverse.predict(m, 15, ws = 1.67) # 5.9 +- 2.5 -inverse.predict(m, 90, ws = 0.145) # 44.1 +- 7.9 - -m0 <- lm(y ~ x) -lod(m0) - -loq(m0) -loq(m, w.loq = 1.67) diff --git a/tests/massart97.Rout.save b/tests/massart97.Rout.save deleted file mode 100644 index ce99c30..0000000 --- a/tests/massart97.Rout.save +++ /dev/null @@ -1,128 +0,0 @@ - -R version 3.1.0 (2014-04-10) -- "Spring Dance" -Copyright (C) 2014 The R Foundation for Statistical Computing -Platform: x86_64-pc-linux-gnu (64-bit) - -R is free software and comes with ABSOLUTELY NO WARRANTY. -You are welcome to redistribute it under certain conditions. -Type 'license()' or 'licence()' for distribution details. - -R is a collaborative project with many contributors. -Type 'contributors()' for more information and -'citation()' on how to cite R or R packages in publications. - -Type 'demo()' for some demos, 'help()' for on-line help, or -'help.start()' for an HTML browser interface to help. -Type 'q()' to quit R. - -> require(chemCal) -Loading required package: chemCal -> data(massart97ex1) -> m <- lm(y ~ x, data = massart97ex1) -> inverse.predict(m, 15) # 6.1 +- 4.9 -$Prediction -[1] 6.09381 - -$`Standard Error` -[1] 1.767278 - -$Confidence -[1] 4.906751 - -$`Confidence Limits` -[1] 1.187059 11.000561 - -> inverse.predict(m, 90) # 43.9 +- 4.9 -$Prediction -[1] 43.93983 - -$`Standard Error` -[1] 1.767747 - -$Confidence -[1] 4.908053 - -$`Confidence Limits` -[1] 39.03178 48.84788 - -> inverse.predict(m, rep(90,5)) # 43.9 +- 3.2 -$Prediction -[1] 43.93983 - -$`Standard Error` -[1] 1.141204 - -$Confidence -[1] 3.168489 - -$`Confidence Limits` -[1] 40.77134 47.10832 - -> -> data(massart97ex3) -> attach(massart97ex3) -> yx <- split(y, x) -> ybar <- sapply(yx, mean) -> s <- round(sapply(yx, sd), digits = 2) -> w <- round(1 / (s^2), digits = 3) -> weights <- w[factor(x)] -> m <- lm(y ~ x, w = weights) -> #calplot(m) -> -> inverse.predict(m, 15, ws = 1.67) # 5.9 +- 2.5 -$Prediction -[1] 5.865367 - -$`Standard Error` -[1] 0.8926109 - -$Confidence -[1] 2.478285 - -$`Confidence Limits` -[1] 3.387082 8.343652 - -> inverse.predict(m, 90, ws = 0.145) # 44.1 +- 7.9 -$Prediction -[1] 44.06025 - -$`Standard Error` -[1] 2.829162 - -$Confidence -[1] 7.855012 - -$`Confidence Limits` -[1] 36.20523 51.91526 - -> -> m0 <- lm(y ~ x) -> lod(m0) -$x -[1] 5.407085 - -$y - 1 -13.63911 - -> -> loq(m0) -$x -[1] 13.97764 - -$y - 1 -30.6235 - -> loq(m, w.loq = 1.67) -$x -[1] 7.346195 - -$y - 1 -17.90777 - -> -> proc.time() - user system elapsed - 0.529 0.327 0.443 diff --git a/tests/testthat.R b/tests/testthat.R new file mode 100644 index 0000000..6176830 --- /dev/null +++ b/tests/testthat.R @@ -0,0 +1,4 @@ +library(testthat) +library(chemCal) + +test_check("chemCal") diff --git a/tests/testthat/test_din32645.R b/tests/testthat/test_din32645.R new file mode 100644 index 0000000..e6d2840 --- /dev/null +++ b/tests/testthat/test_din32645.R @@ -0,0 +1,33 @@ +context("Known results for the dataset provided in DIN 32645") + +require(chemCal) + +m <- lm(y ~ x, data = din32645) +prediction <- inverse.predict(m, 3500, alpha = 0.01) + +test_that("We get correct confidence intervals", { + # Result collected from Procontrol 3.1 (isomehr GmbH) + expect_equal(round(prediction$Confidence, 5), 0.07434) +}) + +test_that("We get a correct critical value", { + crit <- lod(m, alpha = 0.01, beta = 0.5) + # DIN 32645 gives 0.07 for the critical value + # (decision limit, "Nachweisgrenze") + expect_equal(round(crit$x, 2), 0.07) + # According to Dintest test data, we should get 0.0698 + expect_equal(round(crit$x, 4), 0.0698) +}) + +test_that("We get a correct smalles detectable value using the DIN method", { + lod.din <- lod(m, alpha = 0.01, beta = 0.01, method = "din") + # DIN 32645 gives 0.14 for the smallest detectable value ("Erfassungsgrenze") + expect_equal(round(lod.din$x, 2), 0.14) +}) + +test_that("We get a correct limit of quantification", { + loq.din <- loq(m, alpha = 0.01) + # The value cited for Procontrol 3.1 (0.2121) deviates + # at the last digit, so we only test for three digits + expect_equal(round(loq.din$x, 3), 0.212) +}) diff --git a/tests/testthat/test_inverse.predict.R b/tests/testthat/test_inverse.predict.R new file mode 100644 index 0000000..61484dc --- /dev/null +++ b/tests/testthat/test_inverse.predict.R @@ -0,0 +1,43 @@ +context("Inverse predictions") + +library(chemCal) + +test_that("Inverse predictions for unweighted regressions are stable", { + m1 <- lm(y ~ x, data = massart97ex1) + + # Known values from chemcal Version 0.1-37 + p1.1 <- inverse.predict(m1, 15) + expect_equal(signif(p1.1$Prediction, 7), 6.09381) + expect_equal(signif(p1.1$`Standard Error`, 7), 1.767278) + expect_equal(signif(p1.1$Confidence, 7), 4.906751) + + p1.2 <- inverse.predict(m1, 90) + expect_equal(signif(p1.2$Prediction, 7), 43.93983) + expect_equal(signif(p1.2$`Standard Error`, 7), 1.767747) + expect_equal(signif(p1.2$Confidence, 7), 4.908053) + + p1.3 <- inverse.predict(m1, rep(90, 5)) + expect_equal(signif(p1.3$Prediction, 7), 43.93983) + expect_equal(signif(p1.3$`Standard Error`, 7), 1.141204) + expect_equal(signif(p1.3$Confidence, 7), 3.168489) +}) + +test_that("Inverse predictions for weighted regressions are stable", { + attach(massart97ex3) + yx <- split(y, x) + ybar <- sapply(yx, mean) + s <- round(sapply(yx, sd), digits = 2) + w <- round(1 / (s^2), digits = 3) + weights <- w[factor(x)] + m3 <- lm(y ~ x, w = weights) + + p3.1 <- inverse.predict(m3, 15, ws = 1.67) + expect_equal(signif(p3.1$Prediction, 7), 5.865367) + expect_equal(signif(p3.1$`Standard Error`, 7), 0.8926109) + expect_equal(signif(p3.1$Confidence, 7), 2.478285) + + p3.2 <- inverse.predict(m3, 90, ws = 0.145) + expect_equal(signif(p3.2$Prediction, 7), 44.06025) + expect_equal(signif(p3.2$`Standard Error`, 7), 2.829162) + expect_equal(signif(p3.2$Confidence, 7), 7.855012) +}) diff --git a/tests/testthat/test_lod_loq.R b/tests/testthat/test_lod_loq.R new file mode 100644 index 0000000..6ba0ad0 --- /dev/null +++ b/tests/testthat/test_lod_loq.R @@ -0,0 +1,28 @@ +context("LOD and LOQ") + +library(chemCal) + +test_that("lod is stable across chemCal versions", { + m <- lm(y ~ x, data = din32645) + lod_1 <- lod(m) + expect_equal(signif(lod_1$x, 7), 0.08655484) + expect_equal(signif(lod_1$y, 7), 3317.154) + + # Critical value (decision limit, Nachweisgrenze) + lod_2 <- lod(m, alpha = 0.01, beta = 0.5) + expect_equal(signif(lod_2$x, 7), 0.0698127) + expect_equal(signif(lod_2$y, 7), 3155.393) +}) + +test_that("loq is stable across chemCal versions", { + m2 <- lm(y ~ x, data = massart97ex3) + loq_1 <- loq(m2) + expect_equal(signif(loq_1$x, 7), 13.97764) + expect_equal(signif(loq_1$y, 7), 30.6235) + + loq_2 <- loq(m2, n = 3) + expect_equal(signif(loq_2$x, 7), 9.971963) + expect_equal(signif(loq_2$y, 7), 22.68539) +}) + + diff --git a/tests/testthat/test_massart.R b/tests/testthat/test_massart.R new file mode 100644 index 0000000..791c2e7 --- /dev/null +++ b/tests/testthat/test_massart.R @@ -0,0 +1,40 @@ +context("Known results for the example datasets provided by Massart (1997)") + +require(chemCal) + +test_that("Inverse predictions for example 1 are correct",{ + m1 <- lm(y ~ x, data = massart97ex1) + + # Known values are from the book + p1.1 <- inverse.predict(m1, 15) + expect_equal(round(p1.1$Prediction, 1), 6.1) + expect_equal(round(p1.1$Confidence, 1), 4.9) + + p1.2 <- inverse.predict(m1, 90) + expect_equal(round(p1.2$Prediction, 1), 43.9) + expect_equal(round(p1.2$Confidence, 1), 4.9) + + p1.3 <- inverse.predict(m1, rep(90, 5)) + expect_equal(round(p1.3$Prediction, 1), 43.9) + expect_equal(round(p1.3$Confidence, 1), 3.2) +}) + + +test_that("Inverse predictions for example 3 are correct",{ + attach(massart97ex3) + yx <- split(y, x) + ybar <- sapply(yx, mean) + s <- round(sapply(yx, sd), digits = 2) + w <- round(1 / (s^2), digits = 3) + weights <- w[factor(x)] + m3 <- lm(y ~ x, w = weights) + + # Known values are from the book + p3.1 <- inverse.predict(m3, 15, ws = 1.67) + expect_equal(round(p3.1$Prediction, 1), 5.9) + expect_equal(round(p3.1$Confidence, 1), 2.5) + + p3.2 <- inverse.predict(m3, 90, ws = 0.145) + expect_equal(round(p3.2$Prediction, 1), 44.1) + expect_equal(round(p3.2$Confidence, 1), 7.9) +}) diff --git a/vignettes/chemCal.Rmd b/vignettes/chemCal.Rmd new file mode 100644 index 0000000..cea1678 --- /dev/null +++ b/vignettes/chemCal.Rmd @@ -0,0 +1,134 @@ +--- +title: Introduction to chemCal +author: Johannes Ranke +date: "`r Sys.Date()`" +output: + html_document: + toc: true + toc_float: true + code_folding: show + fig_retina: null +bibliography: refs.bib +vignette: > + %\VignetteEngine{knitr::rmarkdown} + %\VignetteIndexEntry{Introduction to chemCal} +--- + +[Wissenschaftlicher Berater, Kronacher Str. 12, 79639 Grenzach-Wyhlen, Germany](http://www.jrwb.de)
+ +```{r, include = FALSE} +require(knitr) +opts_chunk$set(engine='R', tidy=FALSE) +``` +# Basic calibration functions for analytical chemistry + +The `chemCal` package was first designed in the course of a lecture and lab +course on "Analytics of Organic Trace Contaminants" at the University of Bremen +from October to December 2004. In the fall 2005, an email exchange with +Ron Wehrens led to the belief that it would be desirable to implement the +inverse prediction method given in @massart97 since it also covers the +case of weighted regression. Studies of the IUPAC orange book and of DIN 32645 +as well as publications by Currie and the Analytical Method Committee of the +Royal Society of Chemistry and a nice paper by Castillo and Castells provided +further understanding of the matter. + +At the moment, the package consists of four functions, working on univariate +linear models of class `lm` or `rlm`, plus two datasets for +validation. + +A [bug report](http://bugs.r-project.org/bugzilla3/show_bug.cgi?id=8877) +and the following e-mail exchange on the r-devel mailing list about +prediction intervals from weighted regression entailed some further studies +on this subject. However, I did not encounter any proof or explanation of the +formula cited below yet, so I can't really confirm that Massart's method is correct. + +When calibrating an analytical method, the first task is to generate a suitable +model. If we want to use the `chemCal` functions, we will have to restrict +ourselves to univariate, possibly weighted, linear regression so far. + +Once such a model has been created, the calibration can be graphically +shown by using the `calplot` function: + +```{r} +library(chemCal) +m0 <- lm(y ~ x, data = massart97ex3) +calplot(m0) +``` + +As we can see, the scatter increases with increasing x. This is also +illustrated by one of the diagnostic plots for linear models +provided by R: + +```{r} +plot(m0, which=3) +``` + +Therefore, in Example 8 in @massart97, weighted regression +is proposed which can be reproduced by + +```{r, message = FALSE, echo = TRUE} +attach(massart97ex3) +yx <- split(y, x) +ybar <- sapply(yx, mean) +s <- round(sapply(yx, sd), digits = 2) +w <- round(1 / (s^2), digits = 3) +weights <- w[factor(x)] +m <- lm(y ~ x, w = weights) +``` + +If we now want to predict a new x value from measured y values, +we use the `inverse.predict` function: + +```{r} +inverse.predict(m, 15, ws=1.67) +inverse.predict(m, 90, ws = 0.145) +``` + +The weight `ws` assigned to the measured y value has to be +given by the user in the case of weighted regression, or alternatively, +the approximate variance `var.s` at this location. + +# Some theory for `inverse.predict` + +Equation 8.28 in @massart97 gives a general equation for predicting the +standard error $s_{\hat{x_s}}$ for an $x$ value predicted from measurements of +$y$ according to the linear calibration function $y = b_0 + b_1 \cdot x$: + +\begin{equation} +s_{\hat{x_s}} = \frac{s_e}{b_1} \sqrt{\frac{1}{w_s m} + \frac{1}{\sum{w_i}} + + \frac{(\bar{y_s} - \bar{y_w})^2 \sum{w_i}} + {{b_1}^2 \left( \sum{w_i} \sum{w_i {x_i}^2} - + {\left( \sum{ w_i x_i } \right)}^2 \right) }} +\end{equation} + +with + +\begin{equation} +s_e = \sqrt{ \frac{\sum w_i (y_i - \hat{y_i})^2}{n - 2}} +\end{equation} + +where $w_i$ is the weight for calibration standard $i$, $y_i$ is the mean $y$ +value (!) observed for standard $i$, $\hat{y_i}$ is the estimated value for +standard $i$, $n$ is the number calibration standards, $w_s$ is the weight +attributed to the sample $s$, $m$ is the number of replicate measurements of +sample $s$, $\bar{y_s}$ is the mean response for the sample, +$\bar{y_w} = \frac{\sum{w_i y_i}}{\sum{w_i}}$ is the weighted mean of responses +$y_i$, and $x_i$ is the given $x$ value for standard $i$. + +The weight $w_s$ for the sample should be estimated or calculated in accordance +to the weights used in the linear regression. + +I adjusted the above equation in order to be able to take a different +precisions in standards and samples into account. In analogy to Equation 8.26 +from \cite{massart97} we get + +\begin{equation} +s_{\hat{x_s}} = \frac{1}{b_1} \sqrt{\frac{{s_s}^2}{w_s m} + + {s_e}^2 \left( \frac{1}{\sum{w_i}} + + \frac{(\bar{y_s} - \bar{y_w})^2 \sum{w_i}} + {{b_1}^2 \left( \sum{w_i} \sum{w_i {x_i}^2} - {\left( \sum{ w_i x_i } \right)}^2 \right) } \right) } +\end{equation} + +where I interpret $\frac{{s_s}^2}{w_s}$ as an estimator of the variance at location +$\hat{x_s}$, which can be replaced by a user-specified value using the argument +`var.s` of the function `inverse.predict`. diff --git a/vignettes/chemCal.Rnw b/vignettes/chemCal.Rnw deleted file mode 100644 index 30e0331..0000000 --- a/vignettes/chemCal.Rnw +++ /dev/null @@ -1,130 +0,0 @@ -\documentclass[a4paper]{article} -%\VignetteIndexEntry{Short manual for the chemCal package} -\usepackage{hyperref} - -\title{Basic calibration functions for analytical chemistry} -\author{Johannes Ranke} - -\begin{document} -\maketitle - -The \texttt{chemCal} package was first designed in the course of a lecture and lab -course on "analytics of organic trace contaminants" at the University of Bremen -from October to December 2004. In the fall 2005, an email exchange with -Ron Wehrens led to the belief that it would be desirable to implement the -inverse prediction method given in \cite{massart97} since it also covers the -case of weighted regression. Studies of the IUPAC orange book and of DIN 32645 -as well as publications by Currie and the Analytical Method Committee of the -Royal Society of Chemistry and a nice paper by Castillo and Castells provided -further understanding of the matter. - -At the moment, the package consists of four functions, working on univariate -linear models of class \texttt{lm} or \texttt{rlm}, plus to datasets for -validation. - -A \href{http://bugs.r-project.org/bugzilla3/show_bug.cgi?id=8877}{bug -report (PR\#8877)} and the following e-mail exchange on the r-devel mailing list about -prediction intervals from weighted regression entailed some further studies -on this subject. However, I did not encounter any proof or explanation of the -formula cited below yet, so I can't really confirm that Massart's method is correct. - -When calibrating an analytical method, the first task is to generate a suitable -model. If we want to use the \texttt{chemCal} functions, we will have to restrict -ourselves to univariate, possibly weighted, linear regression so far. - -Once such a model has been created, the calibration can be graphically -shown by using the \texttt{calplot} function: - -<>= -library(chemCal) -data(massart97ex3) -m0 <- lm(y ~ x, data = massart97ex3) -calplot(m0) -@ - -As we can see, the scatter increases with increasing x. This is also -illustrated by one of the diagnostic plots for linear models -provided by R: - -<>= -plot(m0,which=3) -@ - -Therefore, in Example 8 in \cite{massart97} weighted regression -is proposed which can be reproduced by - -<<>>= -attach(massart97ex3) -yx <- split(y, x) -ybar <- sapply(yx, mean) -s <- round(sapply(yx, sd), digits = 2) -w <- round(1 / (s^2), digits = 3) -weights <- w[factor(x)] -m <- lm(y ~ x, w = weights) -@ - -If we now want to predict a new x value from measured y values, -we use the \texttt{inverse.predict} function: - -<<>>= -inverse.predict(m, 15, ws=1.67) -inverse.predict(m, 90, ws = 0.145) -@ - -The weight \texttt{ws} assigned to the measured y value has to be -given by the user in the case of weighted regression, or alternatively, -the approximate variance \texttt{var.s} at this location. - -\section*{Theory for \texttt{inverse.predict}} -Equation 8.28 in \cite{massart97} gives a general equation for predicting the -standard error $s_{\hat{x_s}}$ for an x value predicted from measurements of y -according to the linear calibration function $ y = b_0 + b_1 \cdot x$: - -\begin{equation} -s_{\hat{x_s}} = \frac{s_e}{b_1} \sqrt{\frac{1}{w_s m} + \frac{1}{\sum{w_i}} + - \frac{(\bar{y_s} - \bar{y_w})^2 \sum{w_i}} - {{b_1}^2 \left( \sum{w_i} \sum{w_i {x_i}^2} - - {\left( \sum{ w_i x_i } \right)}^2 \right) }} -\end{equation} - -with - -\begin{equation} -s_e = \sqrt{ \frac{\sum w_i (y_i - \hat{y_i})^2}{n - 2}} -\end{equation} - -where $w_i$ is the weight for calibration standard $i$, $y_i$ is the mean $y$ -value (!) observed for standard $i$, $\hat{y_i}$ is the estimated value for -standard $i$, $n$ is the number calibration standards, $w_s$ is the weight -attributed to the sample $s$, $m$ is the number of replicate measurements of -sample $s$, $\bar{y_s}$ is the mean response for the sample, -$\bar{y_w} = \frac{\sum{w_i y_i}}{\sum{w_i}}$ is the weighted mean of responses -$y_i$, and $x_i$ is the given $x$ value for standard $i$. - -The weight $w_s$ for the sample should be estimated or calculated in accordance -to the weights used in the linear regression. - -I adjusted the above equation in order to be able to take a different -precisions in standards and samples into account. In analogy to Equation 8.26 -from \cite{massart97} we get - -\begin{equation} -s_{\hat{x_s}} = \frac{1}{b_1} \sqrt{\frac{{s_s}^2}{w_s m} + - {s_e}^2 \left( \frac{1}{\sum{w_i}} + - \frac{(\bar{y_s} - \bar{y_w})^2 \sum{w_i}} - {{b_1}^2 \left( \sum{w_i} \sum{w_i {x_i}^2} - {\left( \sum{ w_i x_i } \right)}^2 \right) } \right) } -\end{equation} - -where I interpret $\frac{{s_s}^2}{w_s}$ as an estimator of the variance at location -$\hat{x_s}$, which can be replaced by a user-specified value using the argument -\texttt{var.s} of the function \texttt{inverse.predict}. - -\begin{thebibliography}{1} -\bibitem{massart97} -Massart, L.M, Vandenginste, B.G.M., Buydens, L.M.C., De Jong, S., Lewi, P.J., -Smeyers-Verbeke, J. -\newblock Handbook of Chemometrics and Qualimetrics: Part A, -\newblock Elsevier, Amsterdam, 1997 -\end{thebibliography} - -\end{document} diff --git a/vignettes/chemCal.html b/vignettes/chemCal.html new file mode 100644 index 0000000..222ca8f --- /dev/null +++ b/vignettes/chemCal.html @@ -0,0 +1,355 @@ + + + + + + + + + + + + + + + +Introduction to chemCal + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + +
+
+
+
+
+ +
+ + + + + + + +

Wissenschaftlicher Berater, Kronacher Str. 12, 79639 Grenzach-Wyhlen, Germany

+
+

Basic calibration functions for analytical chemistry

+

The chemCal package was first designed in the course of a lecture and lab course on “Analytics of Organic Trace Contaminants” at the University of Bremen from October to December 2004. In the fall 2005, an email exchange with Ron Wehrens led to the belief that it would be desirable to implement the inverse prediction method given in Massart et al. (1997) since it also covers the case of weighted regression. Studies of the IUPAC orange book and of DIN 32645 as well as publications by Currie and the Analytical Method Committee of the Royal Society of Chemistry and a nice paper by Castillo and Castells provided further understanding of the matter.

+

At the moment, the package consists of four functions, working on univariate linear models of class lm or rlm, plus two datasets for validation.

+

A bug report and the following e-mail exchange on the r-devel mailing list about prediction intervals from weighted regression entailed some further studies on this subject. However, I did not encounter any proof or explanation of the formula cited below yet, so I can’t really confirm that Massart’s method is correct.

+

When calibrating an analytical method, the first task is to generate a suitable model. If we want to use the chemCal functions, we will have to restrict ourselves to univariate, possibly weighted, linear regression so far.

+

Once such a model has been created, the calibration can be graphically shown by using the calplot function:

+
library(chemCal)
+m0 <- lm(y ~ x, data = massart97ex3)
+calplot(m0)
+

+

As we can see, the scatter increases with increasing x. This is also illustrated by one of the diagnostic plots for linear models provided by R:

+
plot(m0, which=3)
+

+

Therefore, in Example 8 in Massart et al. (1997), weighted regression is proposed which can be reproduced by

+
attach(massart97ex3)
+yx <- split(y, x)
+ybar <- sapply(yx, mean)
+s <- round(sapply(yx, sd), digits = 2)
+w <- round(1 / (s^2), digits = 3)
+weights <- w[factor(x)]
+m <- lm(y ~ x, w = weights)
+

If we now want to predict a new x value from measured y values, we use the inverse.predict function:

+
inverse.predict(m, 15, ws=1.67)
+
## $Prediction
+## [1] 5.865367
+## 
+## $`Standard Error`
+## [1] 0.8926109
+## 
+## $Confidence
+## [1] 2.478285
+## 
+## $`Confidence Limits`
+## [1] 3.387082 8.343652
+
inverse.predict(m, 90, ws = 0.145)
+
## $Prediction
+## [1] 44.06025
+## 
+## $`Standard Error`
+## [1] 2.829162
+## 
+## $Confidence
+## [1] 7.855012
+## 
+## $`Confidence Limits`
+## [1] 36.20523 51.91526
+

The weight ws assigned to the measured y value has to be given by the user in the case of weighted regression, or alternatively, the approximate variance var.s at this location.

+
+
+

Some theory for inverse.predict

+

Equation 8.28 in Massart et al. (1997) gives a general equation for predicting the standard error \(s_{\hat{x_s}}\) for an \(x\) value predicted from measurements of \(y\) according to the linear calibration function \(y = b_0 + b_1 \cdot x\):

+\[\begin{equation} +s_{\hat{x_s}} = \frac{s_e}{b_1} \sqrt{\frac{1}{w_s m} + \frac{1}{\sum{w_i}} + + \frac{(\bar{y_s} - \bar{y_w})^2 \sum{w_i}} + {{b_1}^2 \left( \sum{w_i} \sum{w_i {x_i}^2} - + {\left( \sum{ w_i x_i } \right)}^2 \right) }} +\end{equation}\] +

with

+\[\begin{equation} +s_e = \sqrt{ \frac{\sum w_i (y_i - \hat{y_i})^2}{n - 2}} +\end{equation}\] +

where \(w_i\) is the weight for calibration standard \(i\), \(y_i\) is the mean \(y\) value (!) observed for standard \(i\), \(\hat{y_i}\) is the estimated value for standard \(i\), \(n\) is the number calibration standards, \(w_s\) is the weight attributed to the sample \(s\), \(m\) is the number of replicate measurements of sample \(s\), \(\bar{y_s}\) is the mean response for the sample, \(\bar{y_w} = \frac{\sum{w_i y_i}}{\sum{w_i}}\) is the weighted mean of responses \(y_i\), and \(x_i\) is the given \(x\) value for standard \(i\).

+

The weight \(w_s\) for the sample should be estimated or calculated in accordance to the weights used in the linear regression.

+

I adjusted the above equation in order to be able to take a different precisions in standards and samples into account. In analogy to Equation 8.26 from we get

+\[\begin{equation} +s_{\hat{x_s}} = \frac{1}{b_1} \sqrt{\frac{{s_s}^2}{w_s m} + + {s_e}^2 \left( \frac{1}{\sum{w_i}} + + \frac{(\bar{y_s} - \bar{y_w})^2 \sum{w_i}} + {{b_1}^2 \left( \sum{w_i} \sum{w_i {x_i}^2} - {\left( \sum{ w_i x_i } \right)}^2 \right) } \right) } +\end{equation}\] +

where I interpret \(\frac{{s_s}^2}{w_s}\) as an estimator of the variance at location \(\hat{x_s}\), which can be replaced by a user-specified value using the argument var.s of the function inverse.predict.

+
+
+

Massart, D. L, B. G. M. Vandeginste, L. M. C. Buydens, S. De Jong, P. J. Lewi, and J Smeyers-Verbeke. 1997. Handbook of Chemometrics and Qualimetrics: Part A. Amsterdam: Elsevier.

+
+
+
+ + + +
+
+ +
+ + + + + + + + diff --git a/vignettes/chemCal.pdf b/vignettes/chemCal.pdf deleted file mode 100644 index cf1a413..0000000 Binary files a/vignettes/chemCal.pdf and /dev/null differ diff --git a/vignettes/figure/unnamed-chunk-1-1.pdf b/vignettes/figure/unnamed-chunk-1-1.pdf new file mode 100644 index 0000000..c70b645 Binary files /dev/null and b/vignettes/figure/unnamed-chunk-1-1.pdf differ diff --git a/vignettes/figure/unnamed-chunk-2-1.pdf b/vignettes/figure/unnamed-chunk-2-1.pdf new file mode 100644 index 0000000..c1b934d Binary files /dev/null and b/vignettes/figure/unnamed-chunk-2-1.pdf differ diff --git a/vignettes/refs.bib b/vignettes/refs.bib new file mode 100644 index 0000000..514d76b --- /dev/null +++ b/vignettes/refs.bib @@ -0,0 +1,7 @@ +@book{massart97, + author = "Massart, D. L and Vandeginste, B. G. M. and Buydens, L. M. C. and De Jong, S. and Lewi, P. J. and Smeyers-Verbeke, J", + title = "Handbook of {C}hemometrics and {Q}ualimetrics: Part {A}", + publisher = "Elsevier", + address = "Amsterdam", + year = "1997" +} -- cgit v1.2.1