Further formatting improvement of benchmark vignette

Also, use .rmd extension instead of .Rmd for vignettes.
author: Johannes Ranke <jranke@uni-bremen.de> 2020-05-13 16:20:23 +0200
committer: Johannes Ranke <jranke@uni-bremen.de> 2020-05-13 16:20:23 +0200
commit: 218a9c55bd80fb708b15fa7196422f759bfe4b27 (patch)
tree: ad4b2aa4b561b3118d1ca8ee5e6b34fbd2dfcfe8 /vignettes/FOCUS_L.rmd
parent: 36bc31c52cbe4b686f5562e21ee110380481dff8 (diff)
1 files changed, 271 insertions, 0 deletions
diff --git a/vignettes/FOCUS_L.rmd b/vignettes/FOCUS_L.rmd
new file mode 100644
index 00000000..e017b674
--- /dev/null
+++ b/vignettes/FOCUS_L.rmd
@@ -0,0 +1,271 @@
+---
+title: "Example evaluation of FOCUS Laboratory Data L1 to L3"
+author: "Johannes Ranke"
+date: "`r Sys.Date()`"
+output:
+  html_document:
+    toc: true
+    toc_float:
+      collapsed: false
+    mathjax: null
+    fig_retina: null
+references:
+- id: ranke2014
+  title: <span class="nocase">Prüfung und Validierung von Modellierungssoftware als Alternative zu
+    ModelMaker 4.0</span>
+  author:
+  - family: Ranke
+    given: Johannes
+  type: report
+  issued:
+    year: 2014
+  number: "Umweltbundesamt Projektnummer 27452"
+vignette: >
+  %\VignetteIndexEntry{Example evaluation of FOCUS Laboratory Data L1 to L3}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+---
+
+```{r, include = FALSE}
+library(knitr)
+opts_chunk$set(tidy = FALSE, cache = FALSE)
+```
+
+# Laboratory Data L1
+
+The following code defines example dataset L1 from the FOCUS kinetics
+report, p. 284:
+
+```{r}
+library("mkin", quietly = TRUE)
+FOCUS_2006_L1 = data.frame(
+  t = rep(c(0, 1, 2, 3, 5, 7, 14, 21, 30), each = 2),
+  parent = c(88.3, 91.4, 85.6, 84.5, 78.9, 77.6,
+             72.0, 71.9, 50.3, 59.4, 47.0, 45.1,
+             27.7, 27.3, 10.0, 10.4, 2.9, 4.0))
+FOCUS_2006_L1_mkin <- mkin_wide_to_long(FOCUS_2006_L1)
+```
+
+Here we use the assumptions of simple first order (SFO), the case of declining
+rate constant over time (FOMC) and the case of two different phases of the
+kinetics (DFOP). For a more detailed discussion of the models, please see the
+FOCUS kinetics report.
+
+Since mkin version 0.9-32 (July 2014), we can use shorthand notation like `"SFO"`
+for parent only degradation models. The following two lines fit the model and
+produce the summary report of the model fit. This covers the numerical analysis
+given in the FOCUS report.
+
+```{r}
+m.L1.SFO <- mkinfit("SFO", FOCUS_2006_L1_mkin, quiet = TRUE)
+summary(m.L1.SFO)
+```
+
+A plot of the fit is obtained with the plot function for mkinfit objects.
+
+```{r fig.width = 6, fig.height = 5}
+plot(m.L1.SFO, show_errmin = TRUE, main = "FOCUS L1 - SFO")
+```
+
+The residual plot can be easily obtained by
+
+```{r fig.width = 6, fig.height = 5}
+mkinresplot(m.L1.SFO, ylab = "Observed", xlab = "Time")
+```
+
+For comparison, the FOMC model is fitted as well, and the $\chi^2$ error level
+is checked.
+
+```{r fig.width = 6, fig.height = 5}
+m.L1.FOMC <- mkinfit("FOMC", FOCUS_2006_L1_mkin, quiet=TRUE)
+plot(m.L1.FOMC, show_errmin = TRUE, main = "FOCUS L1 - FOMC")
+summary(m.L1.FOMC, data = FALSE)
+```
+
+We get a warning that the default optimisation algorithm `Port` did not converge, which
+is an indication that the model is overparameterised, *i.e.* contains too many
+parameters that are ill-defined as a consequence.
+
+And in fact, due to the higher number of parameters, and the lower number of
+degrees of freedom of the fit, the $\chi^2$ error level is actually higher for
+the FOMC model (3.6%) than for the SFO model (3.4%). Additionally, the
+parameters `log_alpha` and `log_beta` internally fitted in the model have
+excessive confidence intervals, that span more than 25 orders of magnitude (!)
+when backtransformed to the scale of `alpha` and `beta`. Also, the t-test
+for significant difference from zero does not indicate such a significant difference,
+with p-values greater than 0.1, and finally, the parameter correlation of `log_alpha`
+and `log_beta` is 1.000, clearly indicating that the model is overparameterised.
+
+The $\chi^2$ error levels reported in Appendix 3 and Appendix 7 to the FOCUS
+kinetics report are rounded to integer percentages and partly deviate by one
+percentage point from the results calculated by mkin. The reason for
+this is not known. However, mkin gives the same $\chi^2$ error levels
+as the kinfit package and the calculation routines of the kinfit package have
+been extensively compared to the results obtained by the KinGUI
+software, as documented in the kinfit package vignette. KinGUI was the first
+widely used standard package in this field. Also, the calculation of
+$\chi^2$ error levels was compared with KinGUII, CAKE and DegKin manager in
+a project sponsored by the German Umweltbundesamt [@ranke2014].
+
+# Laboratory Data L2
+
+The following code defines example dataset L2 from the FOCUS kinetics
+report, p. 287:
+
+```{r}
+FOCUS_2006_L2 = data.frame(
+  t = rep(c(0, 1, 3, 7, 14, 28), each = 2),
+  parent = c(96.1, 91.8, 41.4, 38.7,
+             19.3, 22.3, 4.6, 4.6,
+             2.6, 1.2, 0.3, 0.6))
+FOCUS_2006_L2_mkin <- mkin_wide_to_long(FOCUS_2006_L2)
+```
+
+## SFO fit for L2
+
+Again, the SFO model is fitted and the result is plotted. The residual plot
+can be obtained simply by adding the argument `show_residuals` to the plot
+command.
+
+```{r fig.width = 7, fig.height = 6}
+m.L2.SFO <- mkinfit("SFO", FOCUS_2006_L2_mkin, quiet=TRUE)
+plot(m.L2.SFO, show_residuals = TRUE, show_errmin = TRUE,
+     main = "FOCUS L2 - SFO")
+```
+
+The $\chi^2$ error level of 14% suggests that the model does not fit very well.
+This is also obvious from the plots of the fit, in which we have included
+the residual plot.
+
+In the FOCUS kinetics report, it is stated that there is no apparent systematic
+error observed from the residual plot up to the measured DT90 (approximately at
+day 5), and there is an underestimation beyond that point.
+
+We may add that it is difficult to judge the random nature of the residuals just
+from the three samplings at days 0, 1 and 3. Also, it is not clear _a
+priori_ why a consistent underestimation after the approximate DT90 should be
+irrelevant. However, this can be rationalised by the fact that the FOCUS fate
+models generally only implement SFO kinetics.
+
+## FOMC fit for L2
+
+For comparison, the FOMC model is fitted as well, and the $\chi^2$ error level
+is checked.
+
+```{r fig.width = 7, fig.height = 6}
+m.L2.FOMC <- mkinfit("FOMC", FOCUS_2006_L2_mkin, quiet = TRUE)
+plot(m.L2.FOMC, show_residuals = TRUE,
+     main = "FOCUS L2 - FOMC")
+summary(m.L2.FOMC, data = FALSE)
+```
+
+The error level at which the $\chi^2$ test passes is much lower in this case.
+Therefore, the FOMC model provides a better description of the data, as less
+experimental error has to be assumed in order to explain the data.
+
+## DFOP fit for L2
+
+Fitting the four parameter DFOP model further reduces the $\chi^2$ error level.
+
+```{r fig.width = 7, fig.height = 6}
+m.L2.DFOP <- mkinfit("DFOP", FOCUS_2006_L2_mkin, quiet = TRUE)
+plot(m.L2.DFOP, show_residuals = TRUE, show_errmin = TRUE,
+     main = "FOCUS L2 - DFOP")
+summary(m.L2.DFOP, data = FALSE)
+```
+
+Here, the DFOP model is clearly the best-fit model for dataset L2 based on the
+chi^2 error level criterion. However, the failure to calculate the covariance
+matrix indicates that the parameter estimates correlate excessively. Therefore,
+the FOMC model may be preferred for this dataset.
+
+# Laboratory Data L3
+
+The following code defines example dataset L3 from the FOCUS kinetics report,
+p. 290.
+
+```{r}
+FOCUS_2006_L3 = data.frame(
+  t = c(0, 3, 7, 14, 30, 60, 91, 120),
+  parent = c(97.8, 60, 51, 43, 35, 22, 15, 12))
+FOCUS_2006_L3_mkin <- mkin_wide_to_long(FOCUS_2006_L3)
+```
+
+## Fit multiple models
+
+As of mkin version 0.9-39 (June 2015), we can fit several models to
+one or more datasets in one call to the function `mmkin`. The datasets
+have to be passed in a list, in this case a named list holding only
+the L3 dataset prepared above.
+
+```{r fig.height = 8}
+# Only use one core here, not to offend the CRAN checks
+mm.L3 <- mmkin(c("SFO", "FOMC", "DFOP"), cores = 1,
+               list("FOCUS L3" = FOCUS_2006_L3_mkin), quiet = TRUE)
+plot(mm.L3)
+```
+
+The $\chi^2$ error level of 21% as well as the plot suggest that the SFO model
+does not fit very well.  The FOMC model performs better, with an
+error level at which the $\chi^2$ test passes of 7%.  Fitting the four
+parameter DFOP model further reduces the $\chi^2$ error level
+considerably.
+
+## Accessing mmkin objects
+
+The objects returned by mmkin are arranged like a matrix, with
+models as a row index and datasets as a column index.
+
+We can extract the summary and plot for *e.g.* the DFOP fit,
+using square brackets for indexing which will result in the use of
+the summary and plot functions working on mkinfit objects.
+
+```{r fig.height = 5}
+summary(mm.L3[["DFOP", 1]])
+plot(mm.L3[["DFOP", 1]], show_errmin = TRUE)
+```
+
+Here, a look to the model plot, the confidence intervals of the parameters
+and the correlation matrix suggest that the parameter estimates are reliable, and
+the DFOP model can be used as the best-fit model based on the $\chi^2$ error
+level criterion for laboratory data L3.
+
+This is also an example where the standard t-test for the parameter `g_ilr` is
+misleading, as it tests for a significant difference from zero. In this case,
+zero appears to be the correct value for this parameter, and the confidence
+interval for the backtransformed parameter `g` is quite narrow.
+
+# Laboratory Data L4
+
+The following code defines example dataset L4 from the FOCUS kinetics
+report, p. 293:
+
+```{r}
+FOCUS_2006_L4 = data.frame(
+  t = c(0, 3, 7, 14, 30, 60, 91, 120),
+  parent = c(96.6, 96.3, 94.3, 88.8, 74.9, 59.9, 53.5, 49.0))
+FOCUS_2006_L4_mkin <- mkin_wide_to_long(FOCUS_2006_L4)
+```
+
+Fits of the SFO and FOMC models, plots and summaries are produced below:
+
+```{r fig.height = 6}
+# Only use one core here, not to offend the CRAN checks
+mm.L4 <- mmkin(c("SFO", "FOMC"), cores = 1,
+               list("FOCUS L4" = FOCUS_2006_L4_mkin),
+               quiet = TRUE)
+plot(mm.L4)
+```
+
+The $\chi^2$ error level of 3.3% as well as the plot suggest that the SFO model
+fits very well.  The error level at which the $\chi^2$ test passes is slightly
+lower for the FOMC model. However, the difference appears negligible.
+
+
+```{r fig.height = 8}
+summary(mm.L4[["SFO", 1]], data = FALSE)
+summary(mm.L4[["FOMC", 1]], data = FALSE)
+```
+
+
+# References
author	Johannes Ranke <jranke@uni-bremen.de>	2020-05-13 16:20:23 +0200
committer	Johannes Ranke <jranke@uni-bremen.de>	2020-05-13 16:20:23 +0200
commit	218a9c55bd80fb708b15fa7196422f759bfe4b27 (patch)
tree	ad4b2aa4b561b3118d1ca8ee5e6b34fbd2dfcfe8 /vignettes/FOCUS_L.rmd
parent	36bc31c52cbe4b686f5562e21ee110380481dff8 (diff)