diff options
Diffstat (limited to 'vignettes/FOCUS_L.rmd')
-rw-r--r-- | vignettes/FOCUS_L.rmd | 271 |
1 files changed, 271 insertions, 0 deletions
diff --git a/vignettes/FOCUS_L.rmd b/vignettes/FOCUS_L.rmd new file mode 100644 index 00000000..e017b674 --- /dev/null +++ b/vignettes/FOCUS_L.rmd @@ -0,0 +1,271 @@ +--- +title: "Example evaluation of FOCUS Laboratory Data L1 to L3" +author: "Johannes Ranke" +date: "`r Sys.Date()`" +output: + html_document: + toc: true + toc_float: + collapsed: false + mathjax: null + fig_retina: null +references: +- id: ranke2014 + title: <span class="nocase">Prüfung und Validierung von Modellierungssoftware als Alternative zu + ModelMaker 4.0</span> + author: + - family: Ranke + given: Johannes + type: report + issued: + year: 2014 + number: "Umweltbundesamt Projektnummer 27452" +vignette: > + %\VignetteIndexEntry{Example evaluation of FOCUS Laboratory Data L1 to L3} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r, include = FALSE} +library(knitr) +opts_chunk$set(tidy = FALSE, cache = FALSE) +``` + +# Laboratory Data L1 + +The following code defines example dataset L1 from the FOCUS kinetics +report, p. 284: + +```{r} +library("mkin", quietly = TRUE) +FOCUS_2006_L1 = data.frame( + t = rep(c(0, 1, 2, 3, 5, 7, 14, 21, 30), each = 2), + parent = c(88.3, 91.4, 85.6, 84.5, 78.9, 77.6, + 72.0, 71.9, 50.3, 59.4, 47.0, 45.1, + 27.7, 27.3, 10.0, 10.4, 2.9, 4.0)) +FOCUS_2006_L1_mkin <- mkin_wide_to_long(FOCUS_2006_L1) +``` + +Here we use the assumptions of simple first order (SFO), the case of declining +rate constant over time (FOMC) and the case of two different phases of the +kinetics (DFOP). For a more detailed discussion of the models, please see the +FOCUS kinetics report. + +Since mkin version 0.9-32 (July 2014), we can use shorthand notation like `"SFO"` +for parent only degradation models. The following two lines fit the model and +produce the summary report of the model fit. This covers the numerical analysis +given in the FOCUS report. + +```{r} +m.L1.SFO <- mkinfit("SFO", FOCUS_2006_L1_mkin, quiet = TRUE) +summary(m.L1.SFO) +``` + +A plot of the fit is obtained with the plot function for mkinfit objects. + +```{r fig.width = 6, fig.height = 5} +plot(m.L1.SFO, show_errmin = TRUE, main = "FOCUS L1 - SFO") +``` + +The residual plot can be easily obtained by + +```{r fig.width = 6, fig.height = 5} +mkinresplot(m.L1.SFO, ylab = "Observed", xlab = "Time") +``` + +For comparison, the FOMC model is fitted as well, and the $\chi^2$ error level +is checked. + +```{r fig.width = 6, fig.height = 5} +m.L1.FOMC <- mkinfit("FOMC", FOCUS_2006_L1_mkin, quiet=TRUE) +plot(m.L1.FOMC, show_errmin = TRUE, main = "FOCUS L1 - FOMC") +summary(m.L1.FOMC, data = FALSE) +``` + +We get a warning that the default optimisation algorithm `Port` did not converge, which +is an indication that the model is overparameterised, *i.e.* contains too many +parameters that are ill-defined as a consequence. + +And in fact, due to the higher number of parameters, and the lower number of +degrees of freedom of the fit, the $\chi^2$ error level is actually higher for +the FOMC model (3.6%) than for the SFO model (3.4%). Additionally, the +parameters `log_alpha` and `log_beta` internally fitted in the model have +excessive confidence intervals, that span more than 25 orders of magnitude (!) +when backtransformed to the scale of `alpha` and `beta`. Also, the t-test +for significant difference from zero does not indicate such a significant difference, +with p-values greater than 0.1, and finally, the parameter correlation of `log_alpha` +and `log_beta` is 1.000, clearly indicating that the model is overparameterised. + +The $\chi^2$ error levels reported in Appendix 3 and Appendix 7 to the FOCUS +kinetics report are rounded to integer percentages and partly deviate by one +percentage point from the results calculated by mkin. The reason for +this is not known. However, mkin gives the same $\chi^2$ error levels +as the kinfit package and the calculation routines of the kinfit package have +been extensively compared to the results obtained by the KinGUI +software, as documented in the kinfit package vignette. KinGUI was the first +widely used standard package in this field. Also, the calculation of +$\chi^2$ error levels was compared with KinGUII, CAKE and DegKin manager in +a project sponsored by the German Umweltbundesamt [@ranke2014]. + +# Laboratory Data L2 + +The following code defines example dataset L2 from the FOCUS kinetics +report, p. 287: + +```{r} +FOCUS_2006_L2 = data.frame( + t = rep(c(0, 1, 3, 7, 14, 28), each = 2), + parent = c(96.1, 91.8, 41.4, 38.7, + 19.3, 22.3, 4.6, 4.6, + 2.6, 1.2, 0.3, 0.6)) +FOCUS_2006_L2_mkin <- mkin_wide_to_long(FOCUS_2006_L2) +``` + +## SFO fit for L2 + +Again, the SFO model is fitted and the result is plotted. The residual plot +can be obtained simply by adding the argument `show_residuals` to the plot +command. + +```{r fig.width = 7, fig.height = 6} +m.L2.SFO <- mkinfit("SFO", FOCUS_2006_L2_mkin, quiet=TRUE) +plot(m.L2.SFO, show_residuals = TRUE, show_errmin = TRUE, + main = "FOCUS L2 - SFO") +``` + +The $\chi^2$ error level of 14% suggests that the model does not fit very well. +This is also obvious from the plots of the fit, in which we have included +the residual plot. + +In the FOCUS kinetics report, it is stated that there is no apparent systematic +error observed from the residual plot up to the measured DT90 (approximately at +day 5), and there is an underestimation beyond that point. + +We may add that it is difficult to judge the random nature of the residuals just +from the three samplings at days 0, 1 and 3. Also, it is not clear _a +priori_ why a consistent underestimation after the approximate DT90 should be +irrelevant. However, this can be rationalised by the fact that the FOCUS fate +models generally only implement SFO kinetics. + +## FOMC fit for L2 + +For comparison, the FOMC model is fitted as well, and the $\chi^2$ error level +is checked. + +```{r fig.width = 7, fig.height = 6} +m.L2.FOMC <- mkinfit("FOMC", FOCUS_2006_L2_mkin, quiet = TRUE) +plot(m.L2.FOMC, show_residuals = TRUE, + main = "FOCUS L2 - FOMC") +summary(m.L2.FOMC, data = FALSE) +``` + +The error level at which the $\chi^2$ test passes is much lower in this case. +Therefore, the FOMC model provides a better description of the data, as less +experimental error has to be assumed in order to explain the data. + +## DFOP fit for L2 + +Fitting the four parameter DFOP model further reduces the $\chi^2$ error level. + +```{r fig.width = 7, fig.height = 6} +m.L2.DFOP <- mkinfit("DFOP", FOCUS_2006_L2_mkin, quiet = TRUE) +plot(m.L2.DFOP, show_residuals = TRUE, show_errmin = TRUE, + main = "FOCUS L2 - DFOP") +summary(m.L2.DFOP, data = FALSE) +``` + +Here, the DFOP model is clearly the best-fit model for dataset L2 based on the +chi^2 error level criterion. However, the failure to calculate the covariance +matrix indicates that the parameter estimates correlate excessively. Therefore, +the FOMC model may be preferred for this dataset. + +# Laboratory Data L3 + +The following code defines example dataset L3 from the FOCUS kinetics report, +p. 290. + +```{r} +FOCUS_2006_L3 = data.frame( + t = c(0, 3, 7, 14, 30, 60, 91, 120), + parent = c(97.8, 60, 51, 43, 35, 22, 15, 12)) +FOCUS_2006_L3_mkin <- mkin_wide_to_long(FOCUS_2006_L3) +``` + +## Fit multiple models + +As of mkin version 0.9-39 (June 2015), we can fit several models to +one or more datasets in one call to the function `mmkin`. The datasets +have to be passed in a list, in this case a named list holding only +the L3 dataset prepared above. + +```{r fig.height = 8} +# Only use one core here, not to offend the CRAN checks +mm.L3 <- mmkin(c("SFO", "FOMC", "DFOP"), cores = 1, + list("FOCUS L3" = FOCUS_2006_L3_mkin), quiet = TRUE) +plot(mm.L3) +``` + +The $\chi^2$ error level of 21% as well as the plot suggest that the SFO model +does not fit very well. The FOMC model performs better, with an +error level at which the $\chi^2$ test passes of 7%. Fitting the four +parameter DFOP model further reduces the $\chi^2$ error level +considerably. + +## Accessing mmkin objects + +The objects returned by mmkin are arranged like a matrix, with +models as a row index and datasets as a column index. + +We can extract the summary and plot for *e.g.* the DFOP fit, +using square brackets for indexing which will result in the use of +the summary and plot functions working on mkinfit objects. + +```{r fig.height = 5} +summary(mm.L3[["DFOP", 1]]) +plot(mm.L3[["DFOP", 1]], show_errmin = TRUE) +``` + +Here, a look to the model plot, the confidence intervals of the parameters +and the correlation matrix suggest that the parameter estimates are reliable, and +the DFOP model can be used as the best-fit model based on the $\chi^2$ error +level criterion for laboratory data L3. + +This is also an example where the standard t-test for the parameter `g_ilr` is +misleading, as it tests for a significant difference from zero. In this case, +zero appears to be the correct value for this parameter, and the confidence +interval for the backtransformed parameter `g` is quite narrow. + +# Laboratory Data L4 + +The following code defines example dataset L4 from the FOCUS kinetics +report, p. 293: + +```{r} +FOCUS_2006_L4 = data.frame( + t = c(0, 3, 7, 14, 30, 60, 91, 120), + parent = c(96.6, 96.3, 94.3, 88.8, 74.9, 59.9, 53.5, 49.0)) +FOCUS_2006_L4_mkin <- mkin_wide_to_long(FOCUS_2006_L4) +``` + +Fits of the SFO and FOMC models, plots and summaries are produced below: + +```{r fig.height = 6} +# Only use one core here, not to offend the CRAN checks +mm.L4 <- mmkin(c("SFO", "FOMC"), cores = 1, + list("FOCUS L4" = FOCUS_2006_L4_mkin), + quiet = TRUE) +plot(mm.L4) +``` + +The $\chi^2$ error level of 3.3% as well as the plot suggest that the SFO model +fits very well. The error level at which the $\chi^2$ test passes is slightly +lower for the FOMC model. However, the difference appears negligible. + + +```{r fig.height = 8} +summary(mm.L4[["SFO", 1]], data = FALSE) +summary(mm.L4[["FOMC", 1]], data = FALSE) +``` + + +# References |