Complete mkin vignette, update FOCUS L vignette

The mkin vignette now describes how confidence intervals and the t-test are calculated. The FOCUS L vignette now uses the floating table of contents (`toc_float`) provided by the current rmarkdown package, as well as current, improved mkin functionality such as `mmkin`.
author: Johannes Ranke <jranke@uni-bremen.de> 2016-06-27 19:06:42 +0200
committer: Johannes Ranke <jranke@uni-bremen.de> 2016-06-27 19:06:42 +0200
commit: 280f609d8e363bb28bee543d1fa3651762469198 (patch)
tree: d227ee7d1fffa7963d2c56ff837022fccc160f1e /vignettes/FOCUS_L.Rmd
parent: 4b0b5346a9f026c5a19d452e4649326fe56d464c (diff)
1 files changed, 109 insertions, 93 deletions
diff --git a/vignettes/FOCUS_L.Rmd b/vignettes/FOCUS_L.Rmd
index f2981c10..387c8fc7 100644
--- a/vignettes/FOCUS_L.Rmd
+++ b/vignettes/FOCUS_L.Rmd
@@ -4,10 +4,19 @@ author: "Johannes Ranke"
 date: "`r Sys.Date()`"
 output:
   html_document:
-    css: mkin_vignettes.css
     toc: true
+    toc_float: true
     mathjax: null
-    theme: united
+references:
+- id: ranke2014
+  title: <span class="nocase">Prüfung und Validierung von Modellierungssoftware als Alternative zu
+    ModelMaker 4.0</span>
+  author:
+  - family: Ranke
+    given: Johannes
+  type: report
+  year: 2014
+  number: Umweltbundesamt Projektnummer 27452
 vignette: >
   %\VignetteIndexEntry{Example evaluation of FOCUS Laboratory Data L1 to L3}
   %\VignetteEngine{knitr::rmarkdown}
@@ -19,13 +28,13 @@ library(knitr)
 opts_chunk$set(tidy = FALSE, cache = TRUE)
 ```
 
-## Laboratory Data L1
+# Laboratory Data L1
 
 The following code defines example dataset L1 from the FOCUS kinetics
 report, p. 284:
 
 ```{r}
-library("mkin")
+library("mkin", quietly = TRUE)
 FOCUS_2006_L1 = data.frame(
   t = rep(c(0, 1, 2, 3, 5, 7, 14, 21, 30), each = 2),
   parent = c(88.3, 91.4, 85.6, 84.5, 78.9, 77.6, 
@@ -39,52 +48,63 @@ rate constant over time (FOMC) and the case of two different phases of the
 kinetics (DFOP). For a more detailed discussion of the models, please see the
 FOCUS kinetics report.
 
-Since mkin version 0.9-32 (July 2014), we can use shorthand notation like `SFO`
+Since mkin version 0.9-32 (July 2014), we can use shorthand notation like `"SFO"`
 for parent only degradation models. The following two lines fit the model and
 produce the summary report of the model fit. This covers the numerical analysis
 given in the FOCUS report. 
 
 ```{r}
-m.L1.SFO <- mkinfit("SFO", FOCUS_2006_L1_mkin, quiet=TRUE)
+m.L1.SFO <- mkinfit("SFO", FOCUS_2006_L1_mkin, quiet = TRUE)
 summary(m.L1.SFO)
 ```
 
 A plot of the fit is obtained with the plot function for mkinfit objects.
 
-```{r fig.width=7, fig.height = 5}
-plot(m.L1.SFO)
+```{r fig.width = 6, fig.height = 5}
+plot(m.L1.SFO, show_errmin = TRUE, main = "FOCUS L1 - SFO")
 ```
+
 The residual plot can be easily obtained by
 
-```{r fig.width=7, fig.height = 5}
+```{r fig.width = 6, fig.height = 5}
 mkinresplot(m.L1.SFO, ylab = "Observed", xlab = "Time")
 ```
 
-For comparison, the FOMC model is fitted as well, and the chi^2 error level
+For comparison, the FOMC model is fitted as well, and the $\chi^2$ error level
 is checked.
 
-```{r}
+```{r fig.width = 6, fig.height = 5}
 m.L1.FOMC <- mkinfit("FOMC", FOCUS_2006_L1_mkin, quiet=TRUE)
+plot(m.L1.FOMC, show_errmin = TRUE, main = "FOCUS L1 - FOMC")
 summary(m.L1.FOMC, data = FALSE)
 ```
 
-Due to the higher number of parameters, and the lower number of degrees of
-freedom of the fit, the chi^2 error level is actually higher for the FOMC
-model (3.6%) than for the SFO model (3.4%). Additionally, the parameters
-`log_alpha` and `log_beta` internally fitted in the model have p-values for the two
-sided t-test of 0.18 and 0.125, and their correlation is 1.000, indicating that
-the model is overparameterised. 
-
-The chi^2 error levels reported in Appendix 3 and Appendix 7 to the FOCUS
+We get a warning that the default optimisation algorithm `Port` did not converge, which
+is an indication that the model is overparameterised, *i.e.* contains too many 
+parameters that are ill-defined as a consequence.
+
+And in fact, due to the higher number of parameters, and the lower number of
+degrees of freedom of the fit, the $\chi^2$ error level is actually higher for
+the FOMC model (3.6%) than for the SFO model (3.4%). Additionally, the
+parameters `log_alpha` and `log_beta` internally fitted in the model have
+excessive confidence intervals, that span more than 25 orders of magnitude (!)
+when backtransformed to the scale of `alpha` and `beta`. Also, the t-test
+for significant difference from zero does not indicate such a significant difference,
+with p-values greater than 0.1, and finally, the parameter correlation of `log_alpha`
+and `log_beta` is 1.000, clearly indicating that the model is overparameterised. 
+
+The $\chi^2$ error levels reported in Appendix 3 and Appendix 7 to the FOCUS
 kinetics report are rounded to integer percentages and partly deviate by one
 percentage point from the results calculated by mkin. The reason for
-this is not known. However, mkin gives the same chi^2 error levels
-as the kinfit package.  Furthermore, the calculation routines of the kinfit
-package have been extensively compared to the results obtained by the KinGUI
-software, as documented in the kinfit package vignette. KinGUI is a widely used
-standard package in this field. 
+this is not known. However, mkin gives the same $\chi^2$ error levels
+as the kinfit package and the calculation routines of the kinfit package have
+been extensively compared to the results obtained by the KinGUI
+software, as documented in the kinfit package vignette. KinGUI was the first
+widely used standard package in this field. Also, the calculation of
+$\chi^2$ error levels was compared with KinGUII, CAKE and DegKin manager in 
+a project sponsored by the German Umweltbundesamt [@ranke2014].
 
-## Laboratory Data L2
+# Laboratory Data L2
 
 The following code defines example dataset L2 from the FOCUS kinetics
 report, p. 287:
@@ -98,21 +118,21 @@ FOCUS_2006_L2 = data.frame(
 FOCUS_2006_L2_mkin <- mkin_wide_to_long(FOCUS_2006_L2)
 ```
 
-Again, the SFO model is fitted and a summary is obtained:
+## SFO fit for L2
 
-```{r}
+Again, the SFO model is fitted and the result is plotted. The residual plot
+can be obtained simply by adding the argument `show_residuals` to the plot
+command.
+
+```{r fig.width = 7, fig.height = 6}
 m.L2.SFO <- mkinfit("SFO", FOCUS_2006_L2_mkin, quiet=TRUE)
-summary(m.L2.SFO)
+plot(m.L2.SFO, show_residuals = TRUE, show_errmin = TRUE, 
+     main = "FOCUS L2 - SFO")
 ```
 
-The chi^2 error level of 14% suggests that the model does not fit very well.
-This is also obvious from the plots of the fit and the residuals.
-
-```{r fig.height = 8}
-par(mfrow = c(2, 1))
-plot(m.L2.SFO)
-mkinresplot(m.L2.SFO)
-```
+The $\chi^2$ error level of 14% suggests that the model does not fit very well.
+This is also obvious from the plots of the fit, in which we have included 
+the residual plot.
 
 In the FOCUS kinetics report, it is stated that there is no apparent systematic
 error observed from the residual plot up to the measured DT90 (approximately at
@@ -124,37 +144,30 @@ priori_ why a consistent underestimation after the approximate DT90 should be
 irrelevant. However, this can be rationalised by the fact that the FOCUS fate
 models generally only implement SFO kinetics.
 
-For comparison, the FOMC model is fitted as well, and the chi^2 error level
+## FOMC fit for L2
+
+For comparison, the FOMC model is fitted as well, and the $\chi^2$ error level
 is checked.
 
-```{r fig.height = 8}
+```{r fig.width = 7, fig.height = 6}
 m.L2.FOMC <- mkinfit("FOMC", FOCUS_2006_L2_mkin, quiet = TRUE)
-par(mfrow = c(2, 1))
-plot(m.L2.FOMC)
-mkinresplot(m.L2.FOMC)
+plot(m.L2.FOMC, show_residuals = TRUE,
+     main = "FOCUS L2 - FOMC")
 summary(m.L2.FOMC, data = FALSE)
 ```
 
-The error level at which the chi^2 test passes is much lower in this case.
+The error level at which the $\chi^2$ test passes is much lower in this case.
 Therefore, the FOMC model provides a better description of the data, as less
 experimental error has to be assumed in order to explain the data.
 
-Fitting the four parameter DFOP model further reduces the chi^2 error level. 
-
-```{r fig.height = 5}
-m.L2.DFOP <- mkinfit("DFOP", FOCUS_2006_L2_mkin, quiet = TRUE)
-plot(m.L2.DFOP)
-```
+## DFOP fit for L2
 
-Here, the default starting parameters for the DFOP model obviously do not lead
-to a reasonable solution. Therefore the fit is repeated with different starting
-parameters.
+Fitting the four parameter DFOP model further reduces the $\chi^2$ error level. 
 
-```{r fig.height = 5}
-m.L2.DFOP <- mkinfit("DFOP", FOCUS_2006_L2_mkin, 
-  parms.ini = c(k1 = 1, k2 = 0.01, g = 0.8),
-  quiet=TRUE)
-plot(m.L2.DFOP)
+```{r fig.width = 7, fig.height = 6}
+m.L2.DFOP <- mkinfit("DFOP", FOCUS_2006_L2_mkin, quiet = TRUE)
+plot(m.L2.DFOP, show_residuals = TRUE, show_errmin = TRUE,
+     main = "FOCUS L2 - DFOP")
 summary(m.L2.DFOP, data = FALSE)
 ```
 
@@ -163,7 +176,7 @@ chi^2 error level criterion. However, the failure to calculate the covariance
 matrix indicates that the parameter estimates correlate excessively. Therefore,
 the FOMC model may be preferred for this dataset.
 
-## Laboratory Data L3
+# Laboratory Data L3
 
 The following code defines example dataset L3 from the FOCUS kinetics report,
 p. 290.
@@ -175,39 +188,42 @@ FOCUS_2006_L3 = data.frame(
 FOCUS_2006_L3_mkin <- mkin_wide_to_long(FOCUS_2006_L3)
 ```
 
-SFO model, summary and plot:
+## Use mmkin to fit multiple models
 
-```{r fig.height = 5}
-m.L3.SFO <- mkinfit("SFO", FOCUS_2006_L3_mkin, quiet = TRUE)
-plot(m.L3.SFO)
-summary(m.L3.SFO)
+As of mkin version 0.9-39 (June 2015), we can fit several models to 
+one or more datasets in one call to the function `mmkin`. The datasets
+have to be passed in a list, in this case a named list holding only
+the L3 dataset prepared above.
+
+```{r fig.height = 8}
+mm.L3 <- mmkin(c("SFO", "FOMC", "DFOP"), 
+               list("FOCUS L3" = FOCUS_2006_L3_mkin), quiet = TRUE)
+plot(mm.L3)
 ```
 
-The chi^2 error level of 21% as well as the plot suggest that the model
-does not fit very well. 
+The $\chi^2$ error level of 21% as well as the plot suggest that the SFO model
+does not fit very well.  The FOMC model performs better, with an
+error level at which the $\chi^2$ test passes of 7%.  Fitting the four
+parameter DFOP model further reduces the $\chi^2$ error level
+considerably.
 
-The FOMC model performs better:
+## Accessing elements of mmkin objects
 
-```{r fig.height = 5}
-m.L3.FOMC <- mkinfit("FOMC", FOCUS_2006_L3_mkin, quiet = TRUE)
-plot(m.L3.FOMC)
-summary(m.L3.FOMC, data = FALSE)
-```
-
-The error level at which the chi^2 test passes is 7% in this case.
+The objects returned by mmkin are arranged like a matrix, with 
+models as a row index and datasets as a column index.
 
-Fitting the four parameter DFOP model further reduces the chi^2 error level
-considerably:
+We can extract the summary and plot for *e.g.* the DFOP fit,
+using square brackets for indexing which will result in the use of
+the summary and plot functions working on mkinfit objects.
 
 ```{r fig.height = 5}
-m.L3.DFOP <- mkinfit("DFOP", FOCUS_2006_L3_mkin, quiet = TRUE)
-plot(m.L3.DFOP)
-summary(m.L3.DFOP, data = FALSE)
+summary(mm.L3[["DFOP", 1]])
+plot(mm.L3[["DFOP", 1]], show_errmin = TRUE)
 ```
 
 Here, a look to the model plot, the confidence intervals of the parameters 
 and the correlation matrix suggest that the parameter estimates are reliable, and
-the DFOP model can be used as the best-fit model based on the chi^2 error
+the DFOP model can be used as the best-fit model based on the $\chi^2$ error
 level criterion for laboratory data L3.
 
 This is also an example where the standard t-test for the parameter `g_ilr` is
@@ -215,7 +231,7 @@ misleading, as it tests for a significant difference from zero. In this case,
 zero appears to be the correct value for this parameter, and the confidence 
 interval for the backtransformed parameter `g` is quite narrow.
 
-## Laboratory Data L4
+# Laboratory Data L4
 
 The following code defines example dataset L4 from the FOCUS kinetics
 report, p. 293:
@@ -227,24 +243,24 @@ FOCUS_2006_L4 = data.frame(
 FOCUS_2006_L4_mkin <- mkin_wide_to_long(FOCUS_2006_L4)
 ```
 
-SFO model, summary and plot:
+Fits of the SFO and FOMC models, plots and summaries are produced below:
 
-```{r fig.height = 5}
-m.L4.SFO <- mkinfit("SFO", FOCUS_2006_L4_mkin, quiet = TRUE)
-plot(m.L4.SFO)
-summary(m.L4.SFO, data = FALSE)
+```{r fig.height = 6}
+mm.L4 <- mmkin(c("SFO", "FOMC"), 
+               list("FOCUS L4" = FOCUS_2006_L4_mkin), 
+               quiet = TRUE)
+plot(mm.L4)
 ```
 
-The chi^2 error level of 3.3% as well as the plot suggest that the model
-fits very well. 
+The $\chi^2$ error level of 3.3% as well as the plot suggest that the SFO model
+fits very well.  The error level at which the $\chi^2$ test passes is slightly
+lower for the FOMC model. However, the difference appears negligible.
 
-The FOMC model for comparison:
 
-```{r fig.height = 5}
-m.L4.FOMC <- mkinfit("FOMC", FOCUS_2006_L4_mkin, quiet = TRUE)
-plot(m.L4.FOMC)
-summary(m.L4.FOMC, data = FALSE)
+```{r fig.height = 8}
+summary(mm.L4[["SFO", 1]], data = FALSE)
+summary(mm.L4[["FOMC", 1]], data = FALSE)
 ```
 
-The error level at which the chi^2 test passes is slightly lower for the FOMC 
-model. However, the difference appears negligible.
+
+# References