1 files changed, 134 insertions, 0 deletions
diff --git a/vignettes/chemCal.Rmd b/vignettes/chemCal.Rmd
new file mode 100644
index 0000000..cea1678
--- /dev/null
+++ b/vignettes/chemCal.Rmd
@@ -0,0 +1,134 @@
+---
+title: Introduction to chemCal
+author: Johannes Ranke
+date: "`r Sys.Date()`"
+output:
+  html_document:
+    toc: true
+    toc_float: true
+    code_folding: show
+    fig_retina: null
+bibliography: refs.bib
+vignette: >
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteIndexEntry{Introduction to chemCal}
+---
+
+[Wissenschaftlicher Berater, Kronacher Str. 12, 79639 Grenzach-Wyhlen, Germany](http://www.jrwb.de)<br />
+
+```{r, include = FALSE}
+require(knitr)
+opts_chunk$set(engine='R', tidy=FALSE)
+```
+# Basic calibration functions for analytical chemistry
+
+The `chemCal` package was first designed in the course of a lecture and lab
+course on "Analytics of Organic Trace Contaminants" at the University of Bremen
+from October to December 2004. In the fall 2005, an email exchange with
+Ron Wehrens led to the belief that it would be desirable to implement the
+inverse prediction method given in @massart97 since it also covers the
+case of weighted regression. Studies of the IUPAC orange book and of DIN 32645
+as well as publications by Currie and the Analytical Method Committee of the
+Royal Society of Chemistry and a nice paper by Castillo and Castells provided
+further understanding of the matter.
+
+At the moment, the package consists of four functions, working on univariate
+linear models of class `lm` or `rlm`, plus two datasets for
+validation.
+
+A [bug report](http://bugs.r-project.org/bugzilla3/show_bug.cgi?id=8877)
+and the following e-mail exchange on the r-devel mailing list about
+prediction intervals from weighted regression entailed some further studies
+on this subject. However, I did not encounter any proof or explanation of the
+formula cited below yet, so I can't really confirm that Massart's method is correct.
+
+When calibrating an analytical method, the first task is to generate a suitable
+model. If we want to use the `chemCal` functions, we will have to restrict
+ourselves to univariate, possibly weighted, linear regression so far.
+
+Once such a model has been created, the calibration can be graphically
+shown by using the `calplot` function:
+
+```{r}
+library(chemCal)
+m0 <- lm(y ~ x, data = massart97ex3)
+calplot(m0)
+```
+
+As we can see, the scatter increases with increasing x. This is also
+illustrated by one of the diagnostic plots for linear models
+provided by R:
+
+```{r}
+plot(m0, which=3)
+```
+
+Therefore, in Example 8 in @massart97, weighted regression
+is proposed which can be reproduced by
+
+```{r, message = FALSE, echo = TRUE}
+attach(massart97ex3)
+yx <- split(y, x)
+ybar <- sapply(yx, mean)
+s <- round(sapply(yx, sd), digits = 2)
+w <- round(1 / (s^2), digits = 3)
+weights <- w[factor(x)]
+m <- lm(y ~ x, w = weights)
+```
+
+If we now want to predict a new x value from measured y values,
+we use the `inverse.predict` function:
+
+```{r}
+inverse.predict(m, 15, ws=1.67)
+inverse.predict(m, 90, ws = 0.145)
+```
+
+The weight `ws` assigned to the measured y value has to be
+given by the user in the case of weighted regression, or alternatively,
+the approximate variance `var.s` at this location.
+
+# Some theory for `inverse.predict`
+
+Equation 8.28 in @massart97 gives a general equation for predicting the
+standard error $s_{\hat{x_s}}$ for an $x$ value predicted from measurements of
+$y$ according to the linear calibration function $y = b_0 + b_1 \cdot x$:
+
+\begin{equation}
+s_{\hat{x_s}} = \frac{s_e}{b_1} \sqrt{\frac{1}{w_s m} + \frac{1}{\sum{w_i}} +
+    \frac{(\bar{y_s} - \bar{y_w})^2 \sum{w_i}}
+        {{b_1}^2 \left( \sum{w_i} \sum{w_i {x_i}^2} -
+            {\left( \sum{ w_i x_i } \right)}^2 \right) }}
+\end{equation}
+
+with
+
+\begin{equation}
+s_e = \sqrt{ \frac{\sum w_i (y_i - \hat{y_i})^2}{n - 2}}
+\end{equation}
+
+where $w_i$ is the weight for calibration standard $i$, $y_i$ is the mean $y$
+value (!) observed for standard $i$, $\hat{y_i}$ is the estimated value for
+standard $i$, $n$ is the number calibration standards, $w_s$ is the weight
+attributed to the sample $s$, $m$ is the number of replicate measurements of
+sample $s$, $\bar{y_s}$ is the mean response for the sample,
+$\bar{y_w} = \frac{\sum{w_i y_i}}{\sum{w_i}}$ is the weighted mean of responses
+$y_i$, and $x_i$ is the given $x$ value for standard $i$.
+
+The weight $w_s$ for the sample should be estimated or calculated in accordance
+to the weights used in the linear regression.
+
+I adjusted the above equation in order to be able to take a different
+precisions in standards and samples into account. In analogy to Equation 8.26
+from \cite{massart97} we get
+
+\begin{equation}
+s_{\hat{x_s}} = \frac{1}{b_1} \sqrt{\frac{{s_s}^2}{w_s m} +
+    {s_e}^2 \left( \frac{1}{\sum{w_i}} +
+        \frac{(\bar{y_s} - \bar{y_w})^2 \sum{w_i}}
+            {{b_1}^2 \left( \sum{w_i} \sum{w_i {x_i}^2} - {\left( \sum{ w_i x_i } \right)}^2 \right) } \right) }
+\end{equation}
+
+where I interpret $\frac{{s_s}^2}{w_s}$ as an estimator of the variance at location
+$\hat{x_s}$, which can be replaced by a user-specified value using the argument
+`var.s` of the function `inverse.predict`.