inst/doc/chemCal.tex


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154

\documentclass[a4paper]{article}
%\VignetteIndexEntry{Short manual for the chemCal package}
\usepackage{hyperref}

\title{Basic calibration functions for analytical chemistry}
\author{Johannes Ranke}

\usepackage{/usr/share/R/share/texmf/Sweave}
\begin{document}
\maketitle

When calibrating an analytical method, the first task is to generate
a suitable model. If we want to use the \texttt{chemCal} functions, we
will have to restrict ourselves to univariate, possibly weighted
\footnote{
For the weighted case, the function \texttt{predict.lm} would have to be 
adapted (Bug report PR\#8877), in order to allow for weights for the x values
used to predict the y values. This affects the functions \texttt{calplot}
and \texttt{lod}.
}, linear regression so far.

Once such a model has been created, the calibration can be graphically
shown by using the \texttt{calplot} function:

\begin{Schunk}
\begin{Sinput}
> library(chemCal)
> data(massart97ex3)
> attach(massart97ex3)
> m0 <- lm(y ~ x)
> calplot(m0)
\end{Sinput}
\end{Schunk}
\includegraphics{chemCal-001}

As we can see, the scatter increases with increasing x. This is also
illustrated by one of the diagnostic plots for linear models 
provided by R: 

\begin{Schunk}
\begin{Sinput}
> plot(m0, which = 3)
\end{Sinput}
\end{Schunk}
\includegraphics{chemCal-002}

Therefore, in Example 8 in \cite{massart97} weighted regression
is proposed which can be reproduced by

\begin{Schunk}
\begin{Sinput}
> yx <- split(y, x)
> ybar <- sapply(yx, mean)
> s <- round(sapply(yx, sd), digits = 2)
> w <- round(1/(s^2), digits = 3)
> weights <- w[factor(x)]
> m <- lm(y ~ x, w = weights)
\end{Sinput}
\end{Schunk}

Unfortunately, \texttt{calplot} does not work on weighted linear models,
as noted in the footnote above.

If we now want to predict a new x value from measured y values,
we use the \texttt{inverse.predict} function:

\begin{Schunk}
\begin{Sinput}
> inverse.predict(m, 15, ws = 1.67)
\end{Sinput}
\begin{Soutput}
$Prediction
[1] 5.865367

$`Standard Error`
[1] 0.892611

$Confidence
[1] 2.478285

$`Confidence Limits`
[1] 3.387082 8.343652
\end{Soutput}
\begin{Sinput}
> inverse.predict(m, 90, ws = 0.145)
\end{Sinput}
\begin{Soutput}
$Prediction
[1] 44.06025

$`Standard Error`
[1] 2.829162

$Confidence
[1] 7.855012

$`Confidence Limits`
[1] 36.20523 51.91526
\end{Soutput}
\end{Schunk}

The weight \texttt{ws} assigned to the measured y value has to be 
given by the user in the case of weighted regression. By default, 
the mean of the weights used in the linear regression is used.

\section*{Theory for \texttt{inverse.predict}}
Equation 8.28 in \cite{massart97} gives a general equation for predicting the
standard error $s_{\hat{x_s}}$ for an x value predicted from measurements of y
according to the linear calibration function $ y = b_0 + b_1 \cdot x$:

\begin{equation}
s_{\hat{x_s}} = \frac{s_e}{b_1} \sqrt{\frac{1}{w_s m} + \frac{1}{\sum{w_i}} +
    \frac{(\bar{y_s} - \bar{y_w})^2 \sum{w_i}}
        {{b_1}^2 \left( \sum{w_i} \sum{w_i {x_i}^2} - 
            {\left( \sum{ w_i x_i } \right)}^2 \right) }}
\end{equation}

with

\begin{equation}
s_e = \sqrt{ \frac{\sum w_i (y_i - \hat{y_i})^2}{n - 2}}
\end{equation}

where $w_i$ is the weight for calibration standard $i$, $y_i$ is the mean $y$
value (!) observed for standard $i$, $\hat{y_i}$ is the estimated value for
standard $i$, $n$ is the number calibration standards, $w_s$ is the weight
attributed to the sample $s$, $m$ is the number of replicate measurements of
sample $s$, $\bar{y_s}$ is the mean response for the sample, 
$\bar{y_w} = \frac{\sum{w_i y_i}}{\sum{w_i}}$ is the weighted mean of responses
$y_i$, and $x_i$ is the given $x$ value for standard $i$.

The weight $w_s$ for the sample should be estimated or calculated in accordance
to the weights used in the linear regression. 

I adjusted the above equation in order to be able to take a different
precisions in standards and samples into account. In analogy to Equation 8.26
from \cite{massart97} we get

\begin{equation}
s_{\hat{x_s}} = \frac{1}{b_1} \sqrt{\frac{{s_s}^2}{w_s m} + 
    {s_e}^2 \left( \frac{1}{\sum{w_i}} +
        \frac{(\bar{y_s} - \bar{y_w})^2 \sum{w_i}}
            {{b_1}^2 \left( \sum{w_i} \sum{w_i {x_i}^2} - {\left( \sum{ w_i x_i } \right)}^2 \right) } \right) }
\end{equation}

\begin{thebibliography}{1}
\bibitem{massart97}
Massart, L.M, Vandenginste, B.G.M., Buydens, L.M.C., De Jong, S., Lewi, P.J.,
Smeyers-Verbeke, J. 
\newblock Handbook of Chemometrics and Qualimetrics: Part A,
\newblock Elsevier, Amsterdam, 1997
\end{thebibliography}

\end{document}