From 4a918da6d5f971335b74b0fc83cb08f5c3163f95 Mon Sep 17 00:00:00 2001 From: Johannes Ranke Date: Fri, 21 Jul 2017 14:42:14 +0200 Subject: Rename twa to max_twa_parent, update docs --- docs/articles/FOCUS_D.Rmd | 74 +++++ docs/articles/FOCUS_D.html | 254 ++++++++++------ docs/articles/FOCUS_L.Rmd | 271 +++++++++++++++++ docs/articles/FOCUS_L.html | 592 +++++++++++++++++++++++-------------- docs/articles/FOCUS_Z.R | 17 ++ docs/articles/FOCUS_Z.pdf | Bin 239559 -> 240270 bytes docs/articles/compiled_models.R | 64 ++-- docs/articles/compiled_models.Rmd | 107 +++++++ docs/articles/compiled_models.html | 452 ++++++++++++++++++---------- docs/articles/mkin.Rmd | 223 ++++++++++++++ docs/articles/mkin.html | 438 +++++++++++++++++++-------- docs/articles/twa.R | 1 + docs/articles/twa.Rmd | 57 ++++ docs/reference/index.html | 38 ++- docs/reference/max_twa_parent.html | 179 +++++++++++ 15 files changed, 2125 insertions(+), 642 deletions(-) create mode 100644 docs/articles/FOCUS_D.Rmd create mode 100644 docs/articles/FOCUS_L.Rmd create mode 100644 docs/articles/compiled_models.Rmd create mode 100644 docs/articles/mkin.Rmd create mode 100644 docs/articles/twa.R create mode 100644 docs/articles/twa.Rmd create mode 100644 docs/reference/max_twa_parent.html (limited to 'docs') diff --git a/docs/articles/FOCUS_D.Rmd b/docs/articles/FOCUS_D.Rmd new file mode 100644 index 00000000..b4e61a7c --- /dev/null +++ b/docs/articles/FOCUS_D.Rmd @@ -0,0 +1,74 @@ +--- +title: Example evaluation of FOCUS Example Dataset D +author: Johannes Ranke +date: "`r Sys.Date()`" +output: + html_document: + mathjax: null + theme: united + fig_retina: null +vignette: > + %\VignetteIndexEntry{Example evaluation of FOCUS Example Dataset D} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r, include = FALSE} +library(knitr) +opts_chunk$set(tidy = FALSE, cache = TRUE) +``` + +This is just a very simple vignette showing how to fit a degradation model for a parent +compound with one transformation product using `mkin`. After loading the +library we look a the data. We have observed concentrations in the column named +`value` at the times specified in column `time` for the two observed variables +named `parent` and `m1`. + + +```{r data} +library("mkin") +print(FOCUS_2006_D) +``` + +Next we specify the degradation model: The parent compound degrades with simple first-order +kinetics (SFO) to one metabolite named m1, which also degrades with SFO kinetics. + +The call to mkinmod returns a degradation model. The differential equations represented in +R code can be found in the character vector `$diffs` of the `mkinmod` object. If +a C compiler (gcc) is installed and functional, the differential equation model will +be compiled from auto-generated C code. + +```{r model} +SFO_SFO <- mkinmod(parent = mkinsub("SFO", "m1"), m1 = mkinsub("SFO")) +print(SFO_SFO$diffs) +``` + +We do the fitting without progress report (`quiet = TRUE`). + + +```{r fit} +fit <- mkinfit(SFO_SFO, FOCUS_2006_D, quiet = TRUE) +``` + +A plot of the fit including a residual plot for both observed variables is obtained +using the `plot_sep` method for `mkinfit` objects, which shows separate graphs for +all compounds and their residuals. + +```{r plot, fig.height = 6, fig.width = 8} +plot_sep(fit, lpos = c("topright", "bottomright")) +``` + +Confidence intervals for the parameter estimates are obtained using the `mkinparplot` function. + + +```{r plot_2, fig.height = 4, fig.width = 8} +mkinparplot(fit) +``` + +A comprehensive report of the results is obtained using the `summary` method for `mkinfit` +objects. + + +```{r} +summary(fit) +``` diff --git a/docs/articles/FOCUS_D.html b/docs/articles/FOCUS_D.html index 50b3d00a..5da77efa 100644 --- a/docs/articles/FOCUS_D.html +++ b/docs/articles/FOCUS_D.html @@ -1,71 +1,137 @@ -Example evaluation of FOCUS Example Dataset D • mkin -
-
- - -
-
- - - - -
+ + + + + + + + + + + + + + +Example evaluation of FOCUS Example Dataset D + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + +

This is just a very simple vignette showing how to fit a degradation model for a parent compound with one transformation product using mkin. After loading the library we look a the data. We have observed concentrations in the column named value at the times specified in column time for the two observed variables named parent and m1.

-
library("mkin")
-print(FOCUS_2006_D)
+
library("mkin")
+
## Loading required package: minpack.lm
+
## Loading required package: rootSolve
+
## Loading required package: inline
+
## Loading required package: methods
+
## Loading required package: parallel
+
print(FOCUS_2006_D)
##      name time  value
 ## 1  parent    0  99.46
 ## 2  parent    0 102.04
@@ -113,27 +179,27 @@
 ## 44     m1  120  33.31

Next we specify the degradation model: The parent compound degrades with simple first-order kinetics (SFO) to one metabolite named m1, which also degrades with SFO kinetics.

The call to mkinmod returns a degradation model. The differential equations represented in R code can be found in the character vector $diffs of the mkinmod object. If a C compiler (gcc) is installed and functional, the differential equation model will be compiled from auto-generated C code.

-
SFO_SFO <- mkinmod(parent = mkinsub("SFO", "m1"), m1 = mkinsub("SFO"))
+
SFO_SFO <- mkinmod(parent = mkinsub("SFO", "m1"), m1 = mkinsub("SFO"))
## Successfully compiled differential equation model from auto-generated C code.
-
print(SFO_SFO$diffs)
+
print(SFO_SFO$diffs)
##                                                       parent 
-## "d_parent = - k_parent_sink * parent - k_parent_m1 * parent" 
+## "d_parent = - k_parent_sink * parent - k_parent_m1 * parent" 
 ##                                                           m1 
-##             "d_m1 = + k_parent_m1 * parent - k_m1_sink * m1"
+## "d_m1 = + k_parent_m1 * parent - k_m1_sink * m1"

We do the fitting without progress report (quiet = TRUE).

-
fit <- mkinfit(SFO_SFO, FOCUS_2006_D, quiet = TRUE)
+
fit <- mkinfit(SFO_SFO, FOCUS_2006_D, quiet = TRUE)

A plot of the fit including a residual plot for both observed variables is obtained using the plot_sep method for mkinfit objects, which shows separate graphs for all compounds and their residuals.

-
plot_sep(fit, lpos = c("topright", "bottomright"))
-

+
plot_sep(fit, lpos = c("topright", "bottomright"))
+

Confidence intervals for the parameter estimates are obtained using the mkinparplot function.

- -

+
mkinparplot(fit)
+

A comprehensive report of the results is obtained using the summary method for mkinfit objects.

-
summary(fit)
+
summary(fit)
## mkin version:    0.9.45 
-## R version:       3.3.2 
-## Date of fit:     Thu Dec  8 09:39:23 2016 
-## Date of summary: Thu Dec  8 09:39:24 2016 
+## R version:       3.4.0 
+## Date of fit:     Fri May  5 12:14:00 2017 
+## Date of summary: Fri May  5 12:14:00 2017 
 ## 
 ## Equations:
 ## d_parent/dt = - k_parent_sink * parent - k_parent_m1 * parent
@@ -141,7 +207,7 @@
 ## 
 ## Model predictions using solution type deSolve 
 ## 
-## Fitted with method Port using 153 model solutions performed in 0.627 s
+## Fitted with method Port using 153 model solutions performed in 1.033 s
 ## 
 ## Weighting: none
 ## 
@@ -175,7 +241,7 @@
 ## parent_0           1.00000            0.6075        -0.06625       -0.1701
 ## log_k_parent_sink  0.60752            1.0000        -0.08740       -0.6253
 ## log_k_parent_m1   -0.06625           -0.0874         1.00000        0.4716
-## log_k_m1_sink     -0.17006           -0.6253         0.47163        1.0000
+## log_k_m1_sink     -0.17006           -0.6253         0.47164        1.0000
 ## 
 ## Residual standard error: 3.211 on 36 degrees of freedom
 ## 
@@ -252,23 +318,25 @@
 ##   100       m1    33.13 3.198e+01  1.148e+00
 ##   120       m1    25.15 2.879e+01 -3.640e+00
 ##   120       m1    33.31 2.879e+01  4.520e+00
-
-
- -
-
-
-

Site built with pkgdown.

-
+ -
- + + diff --git a/docs/articles/FOCUS_L.Rmd b/docs/articles/FOCUS_L.Rmd new file mode 100644 index 00000000..e017b674 --- /dev/null +++ b/docs/articles/FOCUS_L.Rmd @@ -0,0 +1,271 @@ +--- +title: "Example evaluation of FOCUS Laboratory Data L1 to L3" +author: "Johannes Ranke" +date: "`r Sys.Date()`" +output: + html_document: + toc: true + toc_float: + collapsed: false + mathjax: null + fig_retina: null +references: +- id: ranke2014 + title: Prüfung und Validierung von Modellierungssoftware als Alternative zu + ModelMaker 4.0 + author: + - family: Ranke + given: Johannes + type: report + issued: + year: 2014 + number: "Umweltbundesamt Projektnummer 27452" +vignette: > + %\VignetteIndexEntry{Example evaluation of FOCUS Laboratory Data L1 to L3} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r, include = FALSE} +library(knitr) +opts_chunk$set(tidy = FALSE, cache = FALSE) +``` + +# Laboratory Data L1 + +The following code defines example dataset L1 from the FOCUS kinetics +report, p. 284: + +```{r} +library("mkin", quietly = TRUE) +FOCUS_2006_L1 = data.frame( + t = rep(c(0, 1, 2, 3, 5, 7, 14, 21, 30), each = 2), + parent = c(88.3, 91.4, 85.6, 84.5, 78.9, 77.6, + 72.0, 71.9, 50.3, 59.4, 47.0, 45.1, + 27.7, 27.3, 10.0, 10.4, 2.9, 4.0)) +FOCUS_2006_L1_mkin <- mkin_wide_to_long(FOCUS_2006_L1) +``` + +Here we use the assumptions of simple first order (SFO), the case of declining +rate constant over time (FOMC) and the case of two different phases of the +kinetics (DFOP). For a more detailed discussion of the models, please see the +FOCUS kinetics report. + +Since mkin version 0.9-32 (July 2014), we can use shorthand notation like `"SFO"` +for parent only degradation models. The following two lines fit the model and +produce the summary report of the model fit. This covers the numerical analysis +given in the FOCUS report. + +```{r} +m.L1.SFO <- mkinfit("SFO", FOCUS_2006_L1_mkin, quiet = TRUE) +summary(m.L1.SFO) +``` + +A plot of the fit is obtained with the plot function for mkinfit objects. + +```{r fig.width = 6, fig.height = 5} +plot(m.L1.SFO, show_errmin = TRUE, main = "FOCUS L1 - SFO") +``` + +The residual plot can be easily obtained by + +```{r fig.width = 6, fig.height = 5} +mkinresplot(m.L1.SFO, ylab = "Observed", xlab = "Time") +``` + +For comparison, the FOMC model is fitted as well, and the $\chi^2$ error level +is checked. + +```{r fig.width = 6, fig.height = 5} +m.L1.FOMC <- mkinfit("FOMC", FOCUS_2006_L1_mkin, quiet=TRUE) +plot(m.L1.FOMC, show_errmin = TRUE, main = "FOCUS L1 - FOMC") +summary(m.L1.FOMC, data = FALSE) +``` + +We get a warning that the default optimisation algorithm `Port` did not converge, which +is an indication that the model is overparameterised, *i.e.* contains too many +parameters that are ill-defined as a consequence. + +And in fact, due to the higher number of parameters, and the lower number of +degrees of freedom of the fit, the $\chi^2$ error level is actually higher for +the FOMC model (3.6%) than for the SFO model (3.4%). Additionally, the +parameters `log_alpha` and `log_beta` internally fitted in the model have +excessive confidence intervals, that span more than 25 orders of magnitude (!) +when backtransformed to the scale of `alpha` and `beta`. Also, the t-test +for significant difference from zero does not indicate such a significant difference, +with p-values greater than 0.1, and finally, the parameter correlation of `log_alpha` +and `log_beta` is 1.000, clearly indicating that the model is overparameterised. + +The $\chi^2$ error levels reported in Appendix 3 and Appendix 7 to the FOCUS +kinetics report are rounded to integer percentages and partly deviate by one +percentage point from the results calculated by mkin. The reason for +this is not known. However, mkin gives the same $\chi^2$ error levels +as the kinfit package and the calculation routines of the kinfit package have +been extensively compared to the results obtained by the KinGUI +software, as documented in the kinfit package vignette. KinGUI was the first +widely used standard package in this field. Also, the calculation of +$\chi^2$ error levels was compared with KinGUII, CAKE and DegKin manager in +a project sponsored by the German Umweltbundesamt [@ranke2014]. + +# Laboratory Data L2 + +The following code defines example dataset L2 from the FOCUS kinetics +report, p. 287: + +```{r} +FOCUS_2006_L2 = data.frame( + t = rep(c(0, 1, 3, 7, 14, 28), each = 2), + parent = c(96.1, 91.8, 41.4, 38.7, + 19.3, 22.3, 4.6, 4.6, + 2.6, 1.2, 0.3, 0.6)) +FOCUS_2006_L2_mkin <- mkin_wide_to_long(FOCUS_2006_L2) +``` + +## SFO fit for L2 + +Again, the SFO model is fitted and the result is plotted. The residual plot +can be obtained simply by adding the argument `show_residuals` to the plot +command. + +```{r fig.width = 7, fig.height = 6} +m.L2.SFO <- mkinfit("SFO", FOCUS_2006_L2_mkin, quiet=TRUE) +plot(m.L2.SFO, show_residuals = TRUE, show_errmin = TRUE, + main = "FOCUS L2 - SFO") +``` + +The $\chi^2$ error level of 14% suggests that the model does not fit very well. +This is also obvious from the plots of the fit, in which we have included +the residual plot. + +In the FOCUS kinetics report, it is stated that there is no apparent systematic +error observed from the residual plot up to the measured DT90 (approximately at +day 5), and there is an underestimation beyond that point. + +We may add that it is difficult to judge the random nature of the residuals just +from the three samplings at days 0, 1 and 3. Also, it is not clear _a +priori_ why a consistent underestimation after the approximate DT90 should be +irrelevant. However, this can be rationalised by the fact that the FOCUS fate +models generally only implement SFO kinetics. + +## FOMC fit for L2 + +For comparison, the FOMC model is fitted as well, and the $\chi^2$ error level +is checked. + +```{r fig.width = 7, fig.height = 6} +m.L2.FOMC <- mkinfit("FOMC", FOCUS_2006_L2_mkin, quiet = TRUE) +plot(m.L2.FOMC, show_residuals = TRUE, + main = "FOCUS L2 - FOMC") +summary(m.L2.FOMC, data = FALSE) +``` + +The error level at which the $\chi^2$ test passes is much lower in this case. +Therefore, the FOMC model provides a better description of the data, as less +experimental error has to be assumed in order to explain the data. + +## DFOP fit for L2 + +Fitting the four parameter DFOP model further reduces the $\chi^2$ error level. + +```{r fig.width = 7, fig.height = 6} +m.L2.DFOP <- mkinfit("DFOP", FOCUS_2006_L2_mkin, quiet = TRUE) +plot(m.L2.DFOP, show_residuals = TRUE, show_errmin = TRUE, + main = "FOCUS L2 - DFOP") +summary(m.L2.DFOP, data = FALSE) +``` + +Here, the DFOP model is clearly the best-fit model for dataset L2 based on the +chi^2 error level criterion. However, the failure to calculate the covariance +matrix indicates that the parameter estimates correlate excessively. Therefore, +the FOMC model may be preferred for this dataset. + +# Laboratory Data L3 + +The following code defines example dataset L3 from the FOCUS kinetics report, +p. 290. + +```{r} +FOCUS_2006_L3 = data.frame( + t = c(0, 3, 7, 14, 30, 60, 91, 120), + parent = c(97.8, 60, 51, 43, 35, 22, 15, 12)) +FOCUS_2006_L3_mkin <- mkin_wide_to_long(FOCUS_2006_L3) +``` + +## Fit multiple models + +As of mkin version 0.9-39 (June 2015), we can fit several models to +one or more datasets in one call to the function `mmkin`. The datasets +have to be passed in a list, in this case a named list holding only +the L3 dataset prepared above. + +```{r fig.height = 8} +# Only use one core here, not to offend the CRAN checks +mm.L3 <- mmkin(c("SFO", "FOMC", "DFOP"), cores = 1, + list("FOCUS L3" = FOCUS_2006_L3_mkin), quiet = TRUE) +plot(mm.L3) +``` + +The $\chi^2$ error level of 21% as well as the plot suggest that the SFO model +does not fit very well. The FOMC model performs better, with an +error level at which the $\chi^2$ test passes of 7%. Fitting the four +parameter DFOP model further reduces the $\chi^2$ error level +considerably. + +## Accessing mmkin objects + +The objects returned by mmkin are arranged like a matrix, with +models as a row index and datasets as a column index. + +We can extract the summary and plot for *e.g.* the DFOP fit, +using square brackets for indexing which will result in the use of +the summary and plot functions working on mkinfit objects. + +```{r fig.height = 5} +summary(mm.L3[["DFOP", 1]]) +plot(mm.L3[["DFOP", 1]], show_errmin = TRUE) +``` + +Here, a look to the model plot, the confidence intervals of the parameters +and the correlation matrix suggest that the parameter estimates are reliable, and +the DFOP model can be used as the best-fit model based on the $\chi^2$ error +level criterion for laboratory data L3. + +This is also an example where the standard t-test for the parameter `g_ilr` is +misleading, as it tests for a significant difference from zero. In this case, +zero appears to be the correct value for this parameter, and the confidence +interval for the backtransformed parameter `g` is quite narrow. + +# Laboratory Data L4 + +The following code defines example dataset L4 from the FOCUS kinetics +report, p. 293: + +```{r} +FOCUS_2006_L4 = data.frame( + t = c(0, 3, 7, 14, 30, 60, 91, 120), + parent = c(96.6, 96.3, 94.3, 88.8, 74.9, 59.9, 53.5, 49.0)) +FOCUS_2006_L4_mkin <- mkin_wide_to_long(FOCUS_2006_L4) +``` + +Fits of the SFO and FOMC models, plots and summaries are produced below: + +```{r fig.height = 6} +# Only use one core here, not to offend the CRAN checks +mm.L4 <- mmkin(c("SFO", "FOMC"), cores = 1, + list("FOCUS L4" = FOCUS_2006_L4_mkin), + quiet = TRUE) +plot(mm.L4) +``` + +The $\chi^2$ error level of 3.3% as well as the plot suggest that the SFO model +fits very well. The error level at which the $\chi^2$ test passes is slightly +lower for the FOMC model. However, the difference appears negligible. + + +```{r fig.height = 8} +summary(mm.L4[["SFO", 1]], data = FALSE) +summary(mm.L4[["FOMC", 1]], data = FALSE) +``` + + +# References diff --git a/docs/articles/FOCUS_L.html b/docs/articles/FOCUS_L.html index 366a0e46..fd531133 100644 --- a/docs/articles/FOCUS_L.html +++ b/docs/articles/FOCUS_L.html @@ -1,93 +1,258 @@ -Example evaluation of FOCUS Laboratory Data L1 to L3 • mkin -
-
- - -
-
- - - - -
+ + + + + + + + + + + + + + +Example evaluation of FOCUS Laboratory Data L1 to L3 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + +
+
+
+
+
+ +
+ + + + + + +
-

Laboratory Data L1

-

The following code defines example dataset L1 from the FOCUS kinetics report, p. 284:

-
library("mkin", quietly = TRUE)
-FOCUS_2006_L1 = data.frame(
-  t = rep(c(0, 1, 2, 3, 5, 7, 14, 21, 30), each = 2),
-  parent = c(88.3, 91.4, 85.6, 84.5, 78.9, 77.6,
-             72.0, 71.9, 50.3, 59.4, 47.0, 45.1,
-             27.7, 27.3, 10.0, 10.4, 2.9, 4.0))
-FOCUS_2006_L1_mkin <- mkin_wide_to_long(FOCUS_2006_L1)
+

Laboratory Data L1

+

The following code defines example dataset L1 from the FOCUS kinetics report, p. 284:

+
library("mkin", quietly = TRUE)
+FOCUS_2006_L1 = data.frame(
+  t = rep(c(0, 1, 2, 3, 5, 7, 14, 21, 30), each = 2),
+  parent = c(88.3, 91.4, 85.6, 84.5, 78.9, 77.6,
+             72.0, 71.9, 50.3, 59.4, 47.0, 45.1,
+             27.7, 27.3, 10.0, 10.4, 2.9, 4.0))
+FOCUS_2006_L1_mkin <- mkin_wide_to_long(FOCUS_2006_L1)

Here we use the assumptions of simple first order (SFO), the case of declining rate constant over time (FOMC) and the case of two different phases of the kinetics (DFOP). For a more detailed discussion of the models, please see the FOCUS kinetics report.

-

Since mkin version 0.9-32 (July 2014), we can use shorthand notation like "SFO" for parent only degradation models. The following two lines fit the model and produce the summary report of the model fit. This covers the numerical analysis given in the FOCUS report.

-
m.L1.SFO <- mkinfit("SFO", FOCUS_2006_L1_mkin, quiet = TRUE)
-summary(m.L1.SFO)
+

Since mkin version 0.9-32 (July 2014), we can use shorthand notation like "SFO" for parent only degradation models. The following two lines fit the model and produce the summary report of the model fit. This covers the numerical analysis given in the FOCUS report.

+
m.L1.SFO <- mkinfit("SFO", FOCUS_2006_L1_mkin, quiet = TRUE)
+summary(m.L1.SFO)
## mkin version:    0.9.45 
-## R version:       3.3.2 
-## Date of fit:     Thu Dec  8 09:39:24 2016 
-## Date of summary: Thu Dec  8 09:39:24 2016 
+## R version:       3.4.0 
+## Date of fit:     Fri May  5 12:14:02 2017 
+## Date of summary: Fri May  5 12:14:02 2017 
 ## 
 ## Equations:
 ## d_parent/dt = - k_parent_sink * parent
 ## 
 ## Model predictions using solution type analytical 
 ## 
-## Fitted with method Port using 37 model solutions performed in 0.084 s
+## Fitted with method Port using 37 model solutions performed in 0.263 s
 ## 
 ## Weighting: none
 ## 
@@ -158,22 +323,22 @@ FOCUS_2006_L1_mkin <- 
plot(m.L1.SFO, show_errmin = TRUE, main = "FOCUS L1 - SFO")
-

+
plot(m.L1.SFO, show_errmin = TRUE, main = "FOCUS L1 - SFO")
+

The residual plot can be easily obtained by

-
mkinresplot(m.L1.SFO, ylab = "Observed", xlab = "Time")
-

-

For comparison, the FOMC model is fitted as well, and the \(\chi^2\) error level is checked.

-
m.L1.FOMC <- mkinfit("FOMC", FOCUS_2006_L1_mkin, quiet=TRUE)
-
## Warning in mkinfit("FOMC", FOCUS_2006_L1_mkin, quiet = TRUE): Optimisation by method Port did not converge.
+
mkinresplot(m.L1.SFO, ylab = "Observed", xlab = "Time")
+

+

For comparison, the FOMC model is fitted as well, and the χ2 error level is checked.

+
m.L1.FOMC <- mkinfit("FOMC", FOCUS_2006_L1_mkin, quiet=TRUE)
+
## Warning in mkinfit("FOMC", FOCUS_2006_L1_mkin, quiet = TRUE): Optimisation by method Port did not converge.
 ## Convergence code is 1
-
plot(m.L1.FOMC, show_errmin = TRUE, main = "FOCUS L1 - FOMC")
-

-
summary(m.L1.FOMC, data = FALSE)
+
plot(m.L1.FOMC, show_errmin = TRUE, main = "FOCUS L1 - FOMC")
+

+
summary(m.L1.FOMC, data = FALSE)
## mkin version:    0.9.45 
-## R version:       3.3.2 
-## Date of fit:     Thu Dec  8 09:39:25 2016 
-## Date of summary: Thu Dec  8 09:39:25 2016 
+## R version:       3.4.0 
+## Date of fit:     Fri May  5 12:14:03 2017 
+## Date of summary: Fri May  5 12:14:03 2017 
 ## 
 ## 
 ## Warning: Optimisation by method Port did not converge.
@@ -185,7 +350,7 @@ FOCUS_2006_L1_mkin <-    \(\chi^2\) error level is actually higher for the FOMC model (3.6%) than for the SFO model (3.4%). Additionally, the parameters log_alpha and log_beta internally fitted in the model have excessive confidence intervals, that span more than 25 orders of magnitude (!) when backtransformed to the scale of alpha and beta. Also, the t-test for significant difference from zero does not indicate such a significant difference, with p-values greater than 0.1, and finally, the parameter correlation of log_alpha and log_beta is 1.000, clearly indicating that the model is overparameterised.

-

The \(\chi^2\) error levels reported in Appendix 3 and Appendix 7 to the FOCUS kinetics report are rounded to integer percentages and partly deviate by one percentage point from the results calculated by mkin. The reason for this is not known. However, mkin gives the same \(\chi^2\) error levels as the kinfit package and the calculation routines of the kinfit package have been extensively compared to the results obtained by the KinGUI software, as documented in the kinfit package vignette. KinGUI was the first widely used standard package in this field. Also, the calculation of \(\chi^2\) error levels was compared with KinGUII, CAKE and DegKin manager in a project sponsored by the German Umweltbundesamt (Ranke 2014).

+

And in fact, due to the higher number of parameters, and the lower number of degrees of freedom of the fit, the χ2 error level is actually higher for the FOMC model (3.6%) than for the SFO model (3.4%). Additionally, the parameters log_alpha and log_beta internally fitted in the model have excessive confidence intervals, that span more than 25 orders of magnitude (!) when backtransformed to the scale of alpha and beta. Also, the t-test for significant difference from zero does not indicate such a significant difference, with p-values greater than 0.1, and finally, the parameter correlation of log_alpha and log_beta is 1.000, clearly indicating that the model is overparameterised.

+

The χ2 error levels reported in Appendix 3 and Appendix 7 to the FOCUS kinetics report are rounded to integer percentages and partly deviate by one percentage point from the results calculated by mkin. The reason for this is not known. However, mkin gives the same χ2 error levels as the kinfit package and the calculation routines of the kinfit package have been extensively compared to the results obtained by the KinGUI software, as documented in the kinfit package vignette. KinGUI was the first widely used standard package in this field. Also, the calculation of χ2 error levels was compared with KinGUII, CAKE and DegKin manager in a project sponsored by the German Umweltbundesamt (Ranke 2014).

-

Laboratory Data L2

-

The following code defines example dataset L2 from the FOCUS kinetics report, p. 287:

-
FOCUS_2006_L2 = data.frame(
-  t = rep(c(0, 1, 3, 7, 14, 28), each = 2),
-  parent = c(96.1, 91.8, 41.4, 38.7,
-             19.3, 22.3, 4.6, 4.6,
-             2.6, 1.2, 0.3, 0.6))
-FOCUS_2006_L2_mkin <- mkin_wide_to_long(FOCUS_2006_L2)
+

Laboratory Data L2

+

The following code defines example dataset L2 from the FOCUS kinetics report, p. 287:

+
FOCUS_2006_L2 = data.frame(
+  t = rep(c(0, 1, 3, 7, 14, 28), each = 2),
+  parent = c(96.1, 91.8, 41.4, 38.7,
+             19.3, 22.3, 4.6, 4.6,
+             2.6, 1.2, 0.3, 0.6))
+FOCUS_2006_L2_mkin <- mkin_wide_to_long(FOCUS_2006_L2)
-

SFO fit for L2

+

SFO fit for L2

Again, the SFO model is fitted and the result is plotted. The residual plot can be obtained simply by adding the argument show_residuals to the plot command.

-
m.L2.SFO <- mkinfit("SFO", FOCUS_2006_L2_mkin, quiet=TRUE)
-plot(m.L2.SFO, show_residuals = TRUE, show_errmin = TRUE,
-     main = "FOCUS L2 - SFO")
-

-

The \(\chi^2\) error level of 14% suggests that the model does not fit very well. This is also obvious from the plots of the fit, in which we have included the residual plot.

+
m.L2.SFO <- mkinfit("SFO", FOCUS_2006_L2_mkin, quiet=TRUE)
+plot(m.L2.SFO, show_residuals = TRUE, show_errmin = TRUE,
+     main = "FOCUS L2 - SFO")
+

+

The χ2 error level of 14% suggests that the model does not fit very well. This is also obvious from the plots of the fit, in which we have included the residual plot.

In the FOCUS kinetics report, it is stated that there is no apparent systematic error observed from the residual plot up to the measured DT90 (approximately at day 5), and there is an underestimation beyond that point.

We may add that it is difficult to judge the random nature of the residuals just from the three samplings at days 0, 1 and 3. Also, it is not clear a priori why a consistent underestimation after the approximate DT90 should be irrelevant. However, this can be rationalised by the fact that the FOCUS fate models generally only implement SFO kinetics.

-

FOMC fit for L2

-

For comparison, the FOMC model is fitted as well, and the \(\chi^2\) error level is checked.

-
m.L2.FOMC <- mkinfit("FOMC", FOCUS_2006_L2_mkin, quiet = TRUE)
-plot(m.L2.FOMC, show_residuals = TRUE,
-     main = "FOCUS L2 - FOMC")
-

-
summary(m.L2.FOMC, data = FALSE)
+

FOMC fit for L2

+

For comparison, the FOMC model is fitted as well, and the χ2 error level is checked.

+
m.L2.FOMC <- mkinfit("FOMC", FOCUS_2006_L2_mkin, quiet = TRUE)
+plot(m.L2.FOMC, show_residuals = TRUE,
+     main = "FOCUS L2 - FOMC")
+

+
summary(m.L2.FOMC, data = FALSE)
## mkin version:    0.9.45 
-## R version:       3.3.2 
-## Date of fit:     Thu Dec  8 09:39:26 2016 
-## Date of summary: Thu Dec  8 09:39:26 2016 
+## R version:       3.4.0 
+## Date of fit:     Fri May  5 12:14:04 2017 
+## Date of summary: Fri May  5 12:14:04 2017 
 ## 
 ## Equations:
 ## d_parent/dt = - (alpha/beta) * 1/((time/beta) + 1) * parent
 ## 
 ## Model predictions using solution type analytical 
 ## 
-## Fitted with method Port using 81 model solutions performed in 0.183 s
+## Fitted with method Port using 81 model solutions performed in 0.171 s
 ## 
 ## Weighting: none
 ## 
@@ -327,20 +492,20 @@ FOCUS_2006_L2_mkin <- \(\chi^2\) test passes is much lower in this case. Therefore, the FOMC model provides a better description of the data, as less experimental error has to be assumed in order to explain the data.

+

The error level at which the χ2 test passes is much lower in this case. Therefore, the FOMC model provides a better description of the data, as less experimental error has to be assumed in order to explain the data.

-

DFOP fit for L2

-

Fitting the four parameter DFOP model further reduces the \(\chi^2\) error level.

-
m.L2.DFOP <- mkinfit("DFOP", FOCUS_2006_L2_mkin, quiet = TRUE)
-plot(m.L2.DFOP, show_residuals = TRUE, show_errmin = TRUE,
-     main = "FOCUS L2 - DFOP")
-

-
summary(m.L2.DFOP, data = FALSE)
+

DFOP fit for L2

+

Fitting the four parameter DFOP model further reduces the χ2 error level.

+
m.L2.DFOP <- mkinfit("DFOP", FOCUS_2006_L2_mkin, quiet = TRUE)
+plot(m.L2.DFOP, show_residuals = TRUE, show_errmin = TRUE,
+     main = "FOCUS L2 - DFOP")
+

+
summary(m.L2.DFOP, data = FALSE)
## mkin version:    0.9.45 
-## R version:       3.3.2 
-## Date of fit:     Thu Dec  8 09:39:27 2016 
-## Date of summary: Thu Dec  8 09:39:27 2016 
+## R version:       3.4.0 
+## Date of fit:     Fri May  5 12:14:05 2017 
+## Date of summary: Fri May  5 12:14:05 2017 
 ## 
 ## Equations:
 ## d_parent/dt = - ((k1 * g * exp(-k1 * time) + k2 * (1 - g) *
@@ -349,7 +514,7 @@ FOCUS_2006_L2_mkin <-  
-

Laboratory Data L3

-

The following code defines example dataset L3 from the FOCUS kinetics report, p. 290.

-
FOCUS_2006_L3 = data.frame(
-  t = c(0, 3, 7, 14, 30, 60, 91, 120),
-  parent = c(97.8, 60, 51, 43, 35, 22, 15, 12))
-FOCUS_2006_L3_mkin <- mkin_wide_to_long(FOCUS_2006_L3)
+

Laboratory Data L3

+

The following code defines example dataset L3 from the FOCUS kinetics report, p. 290.

+
FOCUS_2006_L3 = data.frame(
+  t = c(0, 3, 7, 14, 30, 60, 91, 120),
+  parent = c(97.8, 60, 51, 43, 35, 22, 15, 12))
+FOCUS_2006_L3_mkin <- mkin_wide_to_long(FOCUS_2006_L3)
-

Fit multiple models

+

Fit multiple models

As of mkin version 0.9-39 (June 2015), we can fit several models to one or more datasets in one call to the function mmkin. The datasets have to be passed in a list, in this case a named list holding only the L3 dataset prepared above.

-
# Only use one core here, not to offend the CRAN checks
-mm.L3 <- mmkin(c("SFO", "FOMC", "DFOP"), cores = 1,
-               list("FOCUS L3" = FOCUS_2006_L3_mkin), quiet = TRUE)
-plot(mm.L3)
-

-

The \(\chi^2\) error level of 21% as well as the plot suggest that the SFO model does not fit very well. The FOMC model performs better, with an error level at which the \(\chi^2\) test passes of 7%. Fitting the four parameter DFOP model further reduces the \(\chi^2\) error level considerably.

+
# Only use one core here, not to offend the CRAN checks
+mm.L3 <- mmkin(c("SFO", "FOMC", "DFOP"), cores = 1,
+               list("FOCUS L3" = FOCUS_2006_L3_mkin), quiet = TRUE)
+plot(mm.L3)
+

+

The χ2 error level of 21% as well as the plot suggest that the SFO model does not fit very well. The FOMC model performs better, with an error level at which the χ2 test passes of 7%. Fitting the four parameter DFOP model further reduces the χ2 error level considerably.

-

Accessing mmkin objects

+

Accessing mmkin objects

The objects returned by mmkin are arranged like a matrix, with models as a row index and datasets as a column index.

We can extract the summary and plot for e.g. the DFOP fit, using square brackets for indexing which will result in the use of the summary and plot functions working on mkinfit objects.

-
summary(mm.L3[["DFOP", 1]])
+
summary(mm.L3[["DFOP", 1]])
## mkin version:    0.9.45 
-## R version:       3.3.2 
-## Date of fit:     Thu Dec  8 09:39:28 2016 
-## Date of summary: Thu Dec  8 09:39:28 2016 
+## R version:       3.4.0 
+## Date of fit:     Fri May  5 12:14:06 2017 
+## Date of summary: Fri May  5 12:14:06 2017 
 ## 
 ## Equations:
 ## d_parent/dt = - ((k1 * g * exp(-k1 * time) + k2 * (1 - g) *
@@ -438,7 +603,7 @@ mm.L3 <-  
plot(mm.L3[["DFOP", 1]], show_errmin = TRUE)
-

-

Here, a look to the model plot, the confidence intervals of the parameters and the correlation matrix suggest that the parameter estimates are reliable, and the DFOP model can be used as the best-fit model based on the \(\chi^2\) error level criterion for laboratory data L3.

+
plot(mm.L3[["DFOP", 1]], show_errmin = TRUE)
+

+

Here, a look to the model plot, the confidence intervals of the parameters and the correlation matrix suggest that the parameter estimates are reliable, and the DFOP model can be used as the best-fit model based on the χ2 error level criterion for laboratory data L3.

This is also an example where the standard t-test for the parameter g_ilr is misleading, as it tests for a significant difference from zero. In this case, zero appears to be the correct value for this parameter, and the confidence interval for the backtransformed parameter g is quite narrow.

-

Laboratory Data L4

-

The following code defines example dataset L4 from the FOCUS kinetics report, p. 293:

-
FOCUS_2006_L4 = data.frame(
-  t = c(0, 3, 7, 14, 30, 60, 91, 120),
-  parent = c(96.6, 96.3, 94.3, 88.8, 74.9, 59.9, 53.5, 49.0))
-FOCUS_2006_L4_mkin <- mkin_wide_to_long(FOCUS_2006_L4)
+

Laboratory Data L4

+

The following code defines example dataset L4 from the FOCUS kinetics report, p. 293:

+
FOCUS_2006_L4 = data.frame(
+  t = c(0, 3, 7, 14, 30, 60, 91, 120),
+  parent = c(96.6, 96.3, 94.3, 88.8, 74.9, 59.9, 53.5, 49.0))
+FOCUS_2006_L4_mkin <- mkin_wide_to_long(FOCUS_2006_L4)

Fits of the SFO and FOMC models, plots and summaries are produced below:

-
# Only use one core here, not to offend the CRAN checks
-mm.L4 <- mmkin(c("SFO", "FOMC"), cores = 1,
-               list("FOCUS L4" = FOCUS_2006_L4_mkin),
-               quiet = TRUE)
-plot(mm.L4)
-

-

The \(\chi^2\) error level of 3.3% as well as the plot suggest that the SFO model fits very well. The error level at which the \(\chi^2\) test passes is slightly lower for the FOMC model. However, the difference appears negligible.

-
summary(mm.L4[["SFO", 1]], data = FALSE)
+
# Only use one core here, not to offend the CRAN checks
+mm.L4 <- mmkin(c("SFO", "FOMC"), cores = 1,
+               list("FOCUS L4" = FOCUS_2006_L4_mkin),
+               quiet = TRUE)
+plot(mm.L4)
+

+

The χ2 error level of 3.3% as well as the plot suggest that the SFO model fits very well. The error level at which the χ2 test passes is slightly lower for the FOMC model. However, the difference appears negligible.

+
summary(mm.L4[["SFO", 1]], data = FALSE)
## mkin version:    0.9.45 
-## R version:       3.3.2 
-## Date of fit:     Thu Dec  8 09:39:28 2016 
-## Date of summary: Thu Dec  8 09:39:29 2016 
+## R version:       3.4.0 
+## Date of fit:     Fri May  5 12:14:06 2017 
+## Date of summary: Fri May  5 12:14:06 2017 
 ## 
 ## Equations:
 ## d_parent/dt = - k_parent_sink * parent
 ## 
 ## Model predictions using solution type analytical 
 ## 
-## Fitted with method Port using 46 model solutions performed in 0.107 s
+## Fitted with method Port using 46 model solutions performed in 0.096 s
 ## 
 ## Weighting: none
 ## 
@@ -585,18 +750,18 @@ mm.L4 <- 
summary(mm.L4[["FOMC", 1]], data = FALSE)
+
summary(mm.L4[["FOMC", 1]], data = FALSE)
## mkin version:    0.9.45 
-## R version:       3.3.2 
-## Date of fit:     Thu Dec  8 09:39:28 2016 
-## Date of summary: Thu Dec  8 09:39:29 2016 
+## R version:       3.4.0 
+## Date of fit:     Fri May  5 12:14:06 2017 
+## Date of summary: Fri May  5 12:14:06 2017 
 ## 
 ## Equations:
 ## d_parent/dt = - (alpha/beta) * 1/((time/beta) + 1) * parent
 ## 
 ## Model predictions using solution type analytical 
 ## 
-## Fitted with method Port using 66 model solutions performed in 0.145 s
+## Fitted with method Port using 66 model solutions performed in 0.138 s
 ## 
 ## Weighting: none
 ## 
@@ -648,43 +813,34 @@ mm.L4 <- 
-

References

+

References

-

Ranke, Johannes. 2014. “Prüfung und Validierung von Modellierungssoftware als Alternative zu ModelMaker 4.0.” Umweltbundesamt Projektnummer 27452.

+

Ranke, Johannes. 2014. “Prüfung und Validierung von Modellierungssoftware als Alternative zu ModelMaker 4.0.” Umweltbundesamt Projektnummer 27452.

-
-
- - -
-
-
-

Site built with pkgdown.

-
+ + - + + diff --git a/docs/articles/FOCUS_Z.R b/docs/articles/FOCUS_Z.R index 0d5f0940..5c70b57e 100644 --- a/docs/articles/FOCUS_Z.R +++ b/docs/articles/FOCUS_Z.R @@ -3,6 +3,23 @@ require(knitr) opts_chunk$set(engine='R', tidy = FALSE, cache = TRUE) options(width=70) +## ----FOCUS_2006_Z_data, echo=TRUE, eval=TRUE------------------------ +require(mkin) +LOD = 0.5 +FOCUS_2006_Z = data.frame( + t = c(0, 0.04, 0.125, 0.29, 0.54, 1, 2, 3, 4, 7, 10, 14, 21, + 42, 61, 96, 124), + Z0 = c(100, 81.7, 70.4, 51.1, 41.2, 6.6, 4.6, 3.9, 4.6, 4.3, 6.8, + 2.9, 3.5, 5.3, 4.4, 1.2, 0.7), + Z1 = c(0, 18.3, 29.6, 46.3, 55.1, 65.7, 39.1, 36, 15.3, 5.6, 1.1, + 1.6, 0.6, 0.5 * LOD, NA, NA, NA), + Z2 = c(0, NA, 0.5 * LOD, 2.6, 3.8, 15.3, 37.2, 31.7, 35.6, 14.5, + 0.8, 2.1, 1.9, 0.5 * LOD, NA, NA, NA), + Z3 = c(0, NA, NA, NA, NA, 0.5 * LOD, 9.2, 13.1, 22.3, 28.4, 32.5, + 25.2, 17.2, 4.8, 4.5, 2.8, 4.4)) + +FOCUS_2006_Z_mkin <- mkin_wide_to_long(FOCUS_2006_Z) + ## ----FOCUS_2006_Z_fits_1, echo=TRUE, fig.height=6------------------- Z.2a <- mkinmod(Z0 = mkinsub("SFO", "Z1"), Z1 = mkinsub("SFO")) diff --git a/docs/articles/FOCUS_Z.pdf b/docs/articles/FOCUS_Z.pdf index fd313555..8a87e874 100644 Binary files a/docs/articles/FOCUS_Z.pdf and b/docs/articles/FOCUS_Z.pdf differ diff --git a/docs/articles/compiled_models.R b/docs/articles/compiled_models.R index dd450e0d..25c39dac 100644 --- a/docs/articles/compiled_models.R +++ b/docs/articles/compiled_models.R @@ -12,40 +12,40 @@ SFO_SFO <- mkinmod( m1 = mkinsub("SFO")) ## ----benchmark_SFO_SFO, fig.height = 3----------------------------------- -library("microbenchmark") -library("ggplot2") -mb.1 <- microbenchmark( - "deSolve, not compiled" = mkinfit(SFO_SFO, FOCUS_2006_D, - solution_type = "deSolve", - use_compiled = FALSE, quiet = TRUE), - "Eigenvalue based" = mkinfit(SFO_SFO, FOCUS_2006_D, - solution_type = "eigen", quiet = TRUE), - "deSolve, compiled" = mkinfit(SFO_SFO, FOCUS_2006_D, - solution_type = "deSolve", quiet = TRUE), - times = 3, control = list(warmup = 0)) - -smb.1 <- summary(mb.1) -print(mb.1) -autoplot(mb.1) - -## ------------------------------------------------------------------------ -rownames(smb.1) <- smb.1$expr -smb.1["median"]/smb.1["deSolve, compiled", "median"] +if (require(rbenchmark)) { + b.1 <- benchmark( + "deSolve, not compiled" = mkinfit(SFO_SFO, FOCUS_2006_D, + solution_type = "deSolve", + use_compiled = FALSE, quiet = TRUE), + "Eigenvalue based" = mkinfit(SFO_SFO, FOCUS_2006_D, + solution_type = "eigen", quiet = TRUE), + "deSolve, compiled" = mkinfit(SFO_SFO, FOCUS_2006_D, + solution_type = "deSolve", quiet = TRUE), + replications = 3) + print(b.1) + factor_SFO_SFO <- round(b.1["1", "relative"]) +} else { + factor_SFO_SFO <- NA + print("R package benchmark is not available") +} ## ----benchmark_FOMC_SFO, fig.height = 3---------------------------------- -FOMC_SFO <- mkinmod( - parent = mkinsub("FOMC", "m1"), - m1 = mkinsub( "SFO")) - -mb.2 <- microbenchmark( - "deSolve, not compiled" = mkinfit(FOMC_SFO, FOCUS_2006_D, - use_compiled = FALSE, quiet = TRUE), - "deSolve, compiled" = mkinfit(FOMC_SFO, FOCUS_2006_D, quiet = TRUE), - times = 3, control = list(warmup = 0)) -smb.2 <- summary(mb.2) -print(mb.2) -smb.2["median"]/smb.2["deSolve, compiled", "median"] -autoplot(mb.2) +if (require(rbenchmark)) { + FOMC_SFO <- mkinmod( + parent = mkinsub("FOMC", "m1"), + m1 = mkinsub( "SFO")) + + b.2 <- benchmark( + "deSolve, not compiled" = mkinfit(FOMC_SFO, FOCUS_2006_D, + use_compiled = FALSE, quiet = TRUE), + "deSolve, compiled" = mkinfit(FOMC_SFO, FOCUS_2006_D, quiet = TRUE), + replications = 3) + print(b.2) + factor_FOMC_SFO <- round(b.2["1", "relative"]) +} else { + factor_FOMC_SFO <- NA + print("R package benchmark is not available") +} ## ----sessionInfo, echo = FALSE------------------------------------------- cat(capture.output(sessionInfo())[1:3], sep = "\n") diff --git a/docs/articles/compiled_models.Rmd b/docs/articles/compiled_models.Rmd new file mode 100644 index 00000000..8b5d731a --- /dev/null +++ b/docs/articles/compiled_models.Rmd @@ -0,0 +1,107 @@ +--- +title: "Performance benefit by using compiled model definitions in mkin" +author: "Johannes Ranke" +date: "`r Sys.Date()`" +output: + html_document: + theme: united + toc: true + toc_float: true + mathjax: null +vignette: > + %\VignetteIndexEntry{Performance benefit by using compiled model definitions in mkin} + %\VignetteEngine{knitr::rmarkdown} + %\VignetteEncoding{UTF-8} +--- + +```{r, include = FALSE} +library(knitr) +opts_chunk$set(tidy = FALSE, cache = FALSE) +``` + +## Model that can also be solved with Eigenvalues + +This evaluation is taken from the example section of mkinfit. When using an mkin version +equal to or greater than 0.9-36 and a C compiler (gcc) is available, you will see +a message that the model is being compiled from autogenerated C code when +defining a model using mkinmod. The `mkinmod()` function checks for presence of +the gcc compiler using + +```{r check_gcc} +Sys.which("gcc") +``` +First, we build a simple degradation model for a parent compound with one metabolite. + +```{r create_SFO_SFO} +library("mkin") +SFO_SFO <- mkinmod( + parent = mkinsub("SFO", "m1"), + m1 = mkinsub("SFO")) +``` + +We can compare the performance of the Eigenvalue based solution against the +compiled version and the R implementation of the differential equations using +the benchmark package. + + +```{r benchmark_SFO_SFO, fig.height = 3} +if (require(rbenchmark)) { + b.1 <- benchmark( + "deSolve, not compiled" = mkinfit(SFO_SFO, FOCUS_2006_D, + solution_type = "deSolve", + use_compiled = FALSE, quiet = TRUE), + "Eigenvalue based" = mkinfit(SFO_SFO, FOCUS_2006_D, + solution_type = "eigen", quiet = TRUE), + "deSolve, compiled" = mkinfit(SFO_SFO, FOCUS_2006_D, + solution_type = "deSolve", quiet = TRUE), + replications = 3) + print(b.1) + factor_SFO_SFO <- round(b.1["1", "relative"]) +} else { + factor_SFO_SFO <- NA + print("R package benchmark is not available") +} +``` + +We see that using the compiled model is by a factor of around +`r factor_SFO_SFO` +faster than using the R version with the default ode solver, and it is even +faster than the Eigenvalue based solution implemented in R which does not need +iterative solution of the ODEs. + + +## Model that can not be solved with Eigenvalues + +This evaluation is also taken from the example section of mkinfit. + +```{r benchmark_FOMC_SFO, fig.height = 3} +if (require(rbenchmark)) { + FOMC_SFO <- mkinmod( + parent = mkinsub("FOMC", "m1"), + m1 = mkinsub( "SFO")) + + b.2 <- benchmark( + "deSolve, not compiled" = mkinfit(FOMC_SFO, FOCUS_2006_D, + use_compiled = FALSE, quiet = TRUE), + "deSolve, compiled" = mkinfit(FOMC_SFO, FOCUS_2006_D, quiet = TRUE), + replications = 3) + print(b.2) + factor_FOMC_SFO <- round(b.2["1", "relative"]) +} else { + factor_FOMC_SFO <- NA + print("R package benchmark is not available") +} +``` + +Here we get a performance benefit of a factor of +`r factor_FOMC_SFO` +using the version of the differential equation model compiled from C code! + +This vignette was built with mkin `r packageVersion("mkin")` on + +```{r sessionInfo, echo = FALSE} +cat(capture.output(sessionInfo())[1:3], sep = "\n") +if(!inherits(try(cpuinfo <- readLines("/proc/cpuinfo")), "try-error")) { + cat(gsub("model name\t: ", "CPU model: ", cpuinfo[grep("model name", cpuinfo)[1]])) +} +``` diff --git a/docs/articles/compiled_models.html b/docs/articles/compiled_models.html index 457f5a1d..67d0f658 100644 --- a/docs/articles/compiled_models.html +++ b/docs/articles/compiled_models.html @@ -1,180 +1,326 @@ -Performance benefit by using compiled model definitions in mkin • mkin -
-
- - -
-
- - - - -
+ + + + + + + + + + + + + + +Performance benefit by using compiled model definitions in mkin + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + +
+
+
+
+
+ +
+ + + + + + +
-

Model that can also be solved with Eigenvalues

-

This evaluation is taken from the example section of mkinfit. When using an mkin version equal to or greater than 0.9-36 and a C compiler (gcc) is available, you will see a message that the model is being compiled from autogenerated C code when defining a model using mkinmod. The mkinmod() function checks for presence of the gcc compiler using

-
Sys.which("gcc")
+

Model that can also be solved with Eigenvalues

+

This evaluation is taken from the example section of mkinfit. When using an mkin version equal to or greater than 0.9-36 and a C compiler (gcc) is available, you will see a message that the model is being compiled from autogenerated C code when defining a model using mkinmod. The mkinmod() function checks for presence of the gcc compiler using

+
Sys.which("gcc")
##            gcc 
-## "/usr/bin/gcc"
+## "/usr/bin/gcc"

First, we build a simple degradation model for a parent compound with one metabolite.

-
library("mkin")
+
library("mkin")
## Loading required package: minpack.lm
## Loading required package: rootSolve
## Loading required package: inline
## Loading required package: methods
## Loading required package: parallel
-
SFO_SFO <- mkinmod(
-  parent = mkinsub("SFO", "m1"),
-  m1 = mkinsub("SFO"))
+
SFO_SFO <- mkinmod(
+  parent = mkinsub("SFO", "m1"),
+  m1 = mkinsub("SFO"))
## Successfully compiled differential equation model from auto-generated C code.
-

We can compare the performance of the Eigenvalue based solution against the compiled version and the R implementation of the differential equations using the microbenchmark package.

-
library("microbenchmark")
-library("ggplot2")
-mb.1 <- microbenchmark(
-  "deSolve, not compiled" = mkinfit(SFO_SFO, FOCUS_2006_D,
-                                    solution_type = "deSolve",
-                                    use_compiled = FALSE, quiet = TRUE),
-  "Eigenvalue based" = mkinfit(SFO_SFO, FOCUS_2006_D,
-                               solution_type = "eigen", quiet = TRUE),
-  "deSolve, compiled" = mkinfit(SFO_SFO, FOCUS_2006_D,
-                                solution_type = "deSolve", quiet = TRUE),
-  times = 3, control = list(warmup = 0))
-
## Warning in microbenchmark(`deSolve, not compiled` = mkinfit(SFO_SFO,
-## FOCUS_2006_D, : Could not measure overhead. Your clock might lack
-## precision.
-
smb.1 <- summary(mb.1)
-print(mb.1)
-
## Unit: milliseconds
-##                   expr       min        lq      mean    median        uq
-##  deSolve, not compiled 5185.0893 5231.5690 5266.8769 5278.0487 5307.7706
-##       Eigenvalue based  843.3153  847.1503  876.5398  850.9853  893.1520
-##      deSolve, compiled  723.0636  740.5682  755.9995  758.0729  772.4674
-##        max neval cld
-##  5337.4926     3   b
-##   935.3187     3  a 
-##   786.8620     3  a
-
autoplot(mb.1)
-

-

We see that using the compiled model is by a factor of 7 faster than using the R version with the default ode solver, and it is even faster than the Eigenvalue based solution implemented in R which does not need iterative solution of the ODEs:

-
rownames(smb.1) <- smb.1$expr
-smb.1["median"]/smb.1["deSolve, compiled", "median"]
-
##                         median
-## deSolve, not compiled 6.962456
-## Eigenvalue based      1.122564
-## deSolve, compiled     1.000000
+

We can compare the performance of the Eigenvalue based solution against the compiled version and the R implementation of the differential equations using the benchmark package.

+
if (require(rbenchmark)) {
+  b.1 <- benchmark(
+    "deSolve, not compiled" = mkinfit(SFO_SFO, FOCUS_2006_D,
+                                      solution_type = "deSolve",
+                                      use_compiled = FALSE, quiet = TRUE),
+    "Eigenvalue based" = mkinfit(SFO_SFO, FOCUS_2006_D,
+                                 solution_type = "eigen", quiet = TRUE),
+    "deSolve, compiled" = mkinfit(SFO_SFO, FOCUS_2006_D,
+                                  solution_type = "deSolve", quiet = TRUE),
+    replications = 3)
+  print(b.1)
+  factor_SFO_SFO <- round(b.1["1", "relative"])
+} else {
+  factor_SFO_SFO <- NA
+  print("R package benchmark is not available")
+}
+
## Loading required package: rbenchmark
+
##                    test replications elapsed relative user.self sys.self
+## 3     deSolve, compiled            3   2.040    1.000     2.040        0
+## 1 deSolve, not compiled            3  14.622    7.168    14.624        0
+## 2      Eigenvalue based            3   2.478    1.215     2.480        0
+##   user.child sys.child
+## 3          0         0
+## 1          0         0
+## 2          0         0
+

We see that using the compiled model is by a factor of around 7 faster than using the R version with the default ode solver, and it is even faster than the Eigenvalue based solution implemented in R which does not need iterative solution of the ODEs.

-

Model that can not be solved with Eigenvalues

+

Model that can not be solved with Eigenvalues

This evaluation is also taken from the example section of mkinfit.

-
FOMC_SFO <- mkinmod(
-  parent = mkinsub("FOMC", "m1"),
-  m1 = mkinsub( "SFO"))
+
if (require(rbenchmark)) {
+  FOMC_SFO <- mkinmod(
+    parent = mkinsub("FOMC", "m1"),
+    m1 = mkinsub( "SFO"))
+
+  b.2 <- benchmark(
+    "deSolve, not compiled" = mkinfit(FOMC_SFO, FOCUS_2006_D,
+                                      use_compiled = FALSE, quiet = TRUE),
+    "deSolve, compiled" = mkinfit(FOMC_SFO, FOCUS_2006_D, quiet = TRUE),
+    replications = 3)
+  print(b.2)
+  factor_FOMC_SFO <- round(b.2["1", "relative"])
+} else {
+  factor_FOMC_SFO <- NA
+  print("R package benchmark is not available")
+}
## Successfully compiled differential equation model from auto-generated C code.
-
mb.2 <- microbenchmark(
-  "deSolve, not compiled" = mkinfit(FOMC_SFO, FOCUS_2006_D,
-                                    use_compiled = FALSE, quiet = TRUE),
-  "deSolve, compiled" = mkinfit(FOMC_SFO, FOCUS_2006_D, quiet = TRUE),
-  times = 3, control = list(warmup = 0))
-
## Warning in microbenchmark(`deSolve, not compiled` = mkinfit(FOMC_SFO,
-## FOCUS_2006_D, : Could not measure overhead. Your clock might lack
-## precision.
-
smb.2 <- summary(mb.2)
-print(mb.2)
-
## Unit: seconds
-##                   expr       min        lq      mean   median        uq
-##  deSolve, not compiled 10.963655 10.992677 11.033360 11.02170 11.068212
-##      deSolve, compiled  1.287898  1.309754  1.322972  1.33161  1.340509
-##        max neval cld
-##  11.114726     3   b
-##   1.349408     3  a
-
smb.2["median"]/smb.2["deSolve, compiled", "median"]
-
##   median
-## 1     NA
-## 2     NA
-
autoplot(mb.2)
-

-

Here we get a performance benefit of a factor of 8.3 using the version of the differential equation model compiled from C code!

+
##                    test replications elapsed relative user.self sys.self
+## 2     deSolve, compiled            3   3.500    1.000     3.500        0
+## 1 deSolve, not compiled            3  29.932    8.552    29.932        0
+##   user.child sys.child
+## 2          0         0
+## 1          0         0
+

Here we get a performance benefit of a factor of 9 using the version of the differential equation model compiled from C code!

This vignette was built with mkin 0.9.45 on

-
## R version 3.3.2 (2016-10-31)
+
## R version 3.4.0 (2017-04-21)
 ## Platform: x86_64-pc-linux-gnu (64-bit)
 ## Running under: Debian GNU/Linux 8 (jessie)
## CPU model: Intel(R) Core(TM) i7-4710MQ CPU @ 2.50GHz
-
-
- - -
-
-
-

Site built with pkgdown.

-
+ + - + + diff --git a/docs/articles/mkin.Rmd b/docs/articles/mkin.Rmd new file mode 100644 index 00000000..6e579ff1 --- /dev/null +++ b/docs/articles/mkin.Rmd @@ -0,0 +1,223 @@ +--- +title: Introduction to mkin +author: Johannes Ranke +date: "`r Sys.Date()`" +output: + html_document: + toc: true + toc_float: true + code_folding: hide + fig_retina: null +bibliography: references.bib +vignette: > + %\VignetteEngine{knitr::rmarkdown} + %\VignetteIndexEntry{mkin - Kinetic evaluation of chemical degradation data} + %\VignetteEncoding{UTF-8} +--- + +[Wissenschaftlicher Berater, Kronacher Str. 8, 79639 Grenzach-Wyhlen, Germany](http://www.jrwb.de)
+[Privatdozent at the University of Bremen](http://chem.uft.uni-bremen.de/ranke) + +```{r, include = FALSE} +require(knitr) +opts_chunk$set(engine='R', tidy=FALSE) +``` + +# Abstract +In the regulatory evaluation of chemical substances like plant protection +products (pesticides), biocides and other chemicals, degradation data play an +important role. For the evaluation of pesticide degradation experiments, +detailed guidance has been developed, based on nonlinear optimisation. +The `R` add-on package `mkin` [@pkg:mkin] implements fitting some of the models +recommended in this guidance from within R and calculates some statistical +measures for data series within one or more compartments, for parent and +metabolites. + +```{r, echo = TRUE, cache = TRUE, fig = TRUE, fig.width = 8, fig.height = 7} +library(mkin) +# Define the kinetic model +m_SFO_SFO_SFO <- mkinmod(parent = mkinsub("SFO", "M1"), + M1 = mkinsub("SFO", "M2"), + M2 = mkinsub("SFO"), + use_of_ff = "max", quiet = TRUE) + + +# Produce model predictions using some arbitrary parameters +sampling_times = c(0, 1, 3, 7, 14, 28, 60, 90, 120) +d_SFO_SFO_SFO <- mkinpredict(m_SFO_SFO_SFO, + c(k_parent = 0.03, + f_parent_to_M1 = 0.5, k_M1 = log(2)/100, + f_M1_to_M2 = 0.9, k_M2 = log(2)/50), + c(parent = 100, M1 = 0, M2 = 0), + sampling_times) + +# Generate a dataset by adding normally distributed errors with +# standard deviation 3, for two replicates at each sampling time +d_SFO_SFO_SFO_err <- add_err(d_SFO_SFO_SFO, reps = 2, + sdfunc = function(x) 3, + n = 1, seed = 123456789 ) + +# Fit the model to the dataset +f_SFO_SFO_SFO <- mkinfit(m_SFO_SFO_SFO, d_SFO_SFO_SFO_err[[1]], quiet = TRUE) + +# Plot the results separately for parent and metabolites +plot_sep(f_SFO_SFO_SFO, lpos = c("topright", "bottomright", "bottomright")) +``` + +# Background + +Many approaches are possible regarding the evaluation of chemical degradation +data. + +The now deprecated `kinfit` package [@pkg:kinfit] in `R` [@rcore2016] +implements the approach recommended in the kinetics report provided by the +FOrum for Co-ordination of pesticide fate models and their USe [@FOCUS2006; +-@FOCUSkinetics2014] for simple data series for one parent compound in one +compartment. + +The `mkin` package [@pkg:mkin] extends this approach to data series with +transformation products, commonly termed metabolites, and to more than one +compartment. It is also possible to include back reactions, so equilibrium +reactions and equilibrium partitioning can be specified, although this +oftentimes leads to an overparameterisation of the model. + +When the first `mkin` code was published in 2010, the most commonly used tools for +fitting more complex kinetic degradation models to experimental data were +KinGUI [@schaefer2007], a MATLAB based tool with a graphical user +interface that was specifically tailored to the task and included some output +as proposed by the FOCUS Kinetics Workgroup, and ModelMaker, a general purpose +compartment based tool providing infrastructure for fitting dynamic simulation +models based on differential equations to data. + +The code was first uploaded to the BerliOS platform. When this was taken down, +the version control history was imported into the R-Forge site, where the code +is still mirrored today (see *e.g.* +[the initial commit on 11 May 2010](http://cgit.jrwb.de/mkin/commit/?id=30cbb4092f6d2d3beff5800603374a0d009ad770)). + +At that time, the R package `FME` (Flexible Modelling Environment) +[@soetaert2010] was already available, and provided a good basis for +developing a package specifically tailored to the task. The remaining challenge +was to make it as easy as possible for the users (including the author of this +vignette) to specify the system of differential equations and to include the +output requested by the FOCUS guidance, such as the relative standard deviation +that has to be assumed for the residuals, such that the $\chi^2$ +goodness-of-fit test as defined by the FOCUS kinetics workgroup would pass +using an significance level $\alpha$ of 0.05. + +Also, `mkin` introduced using analytical solutions for parent only kinetics for +improved optimization speed. Later, Eigenvalue based solutions were +introduced to `mkin` for the case of linear differential equations (*i.e.* +where the FOMC or DFOP models were not used for the parent compound), greatly +improving the optimization speed for these cases. + +The possibility to specify back-reactions and a biphasic model (SFORB) for +metabolites were present in `mkin` from the very beginning. + +## Derived software tools + +Soon after the publication of `mkin`, two derived tools were published, namely +KinGUII (available from Bayer Crop Science) and CAKE (commissioned to Tessella +by Syngenta), which added a graphical user interface (GUI), and added fitting by +iteratively reweighted least squares (IRLS) and characterisation of likely +parameter distributions by Markov Chain Monte Carlo (MCMC) sampling. + +CAKE focuses on a smooth use experience, sacrificing some flexibility in the model +definition, originally allowing only two primary metabolites in parallel. +The current version 3.2 of CAKE release in March 2016 uses a basic scheme for +up to six metabolites in a flexible arrangement, but does not support +back-reactions (non-instantaneous equilibria) or biphasic kinetics for metabolites. + +KinGUI offers an even more flexible widget for specifying complex kinetic +models. Back-reactions (non-instanteneous equilibria) were supported early on, +but until 2014, only simple first-order models could be specified for +transformation products. Starting with KinGUII version 2.1, biphasic modelling +of metabolites was also available in KinGUII. + +A further graphical user interface (GUI) that has recently been brought to a decent +degree of maturity is the browser based GUI named `gmkin`. Please see its +[documentation page](http://kinfit.r-forge.r-project.org/gmkin_static) and +[manual](http://kinfit.r-forge.r-project.org/gmkin_static/vignettes/gmkin_manual.html) +for further information. + +## Recent developments + +Currently (June 2016), the main features available in `mkin` which are +not present in KinGUII or CAKE, are the speed increase by using compiled code when +a compiler is present, parallel model fitting on multicore machines using the +`mmkin` function, and the estimation of parameter confidence intervals based on +transformed parameters. These are explained in more detail below. + +# Internal parameter transformations + +For rate constants, the log +transformation is used, as proposed by Bates and Watts [-@bates1988, p. 77, +149]. Approximate intervals are constructed for the transformed rate +constants [compare @bates1988, p. 135], *i.e.* for their logarithms. +Confidence intervals for the rate constants are then obtained using the +appropriate backtransformation using the exponential function. + +In the first version of `mkin` allowing for specifying models using +formation fractions, a home-made reparameterisation was used in order to ensure +that the sum of formation fractions would not exceed unity. + +This method is still used in the current version of KinGUII (v2.1 from April +2014), with a modification that allows for fixing the pathway to sink to zero. +CAKE uses penalties in the objective function in order to enforce this +constraint. + +In 2012, an alternative reparameterisation of the formation fractions was +proposed together with René Lehmann [@ranke2012], based on isometric +logratio transformation (ILR). The aim was to improve the validity of the +linear approximation of the objective function during the parameter +estimation procedure as well as in the subsequent calculation of parameter +confidence intervals. + +## Confidence intervals based on transformed parameters + +In the first attempt at providing improved parameter confidence intervals +introduced to `mkin` in 2013, confidence intervals obtained from +FME on the transformed parameters were simply all backtransformed one by one +to yield asymetric confidence intervals for the backtransformed parameters. + +However, while there is a 1:1 relation between the rate constants in the model +and the transformed parameters fitted in the model, the parameters obtained by the +isometric logratio transformation are calculated from the set of formation +fractions that quantify the paths to each of the compounds formed from a +specific parent compound, and no such 1:1 relation exists. + +Therefore, parameter confidence intervals for formation fractions obtained with +this method only appear valid for the case of a single transformation product, where +only one formation fraction is to be estimated, directly corresponding to one +component of the ilr transformed parameter. + +The confidence intervals obtained by backtransformation for the cases where a +1:1 relation between transformed and original parameter exist are considered by +the author of this vignette to be more accurate than those obtained using a +re-estimation of the Hessian matrix after backtransformation, as implemented +in the FME package. + +## Parameter t-test based on untransformed parameters + +The standard output of many nonlinear regression software packages includes +the results from a test for significant difference from zero for all parameters. +Such a test is also recommended to check the validity of rate constants in the +FOCUS guidance [@FOCUSkinetics2014, p. 96ff]. + +It has been argued that the precondition for this test, *i.e.* normal distribution +of the estimator for the parameters, is not fulfilled in the case of nonlinear regression +[@ranke2015]. However, this test is commonly used by industry, consultants and +national authorities in order to decide on the reliability of parameter estimates, based +on the FOCUS guidance mentioned above. Therefore, the results of this one-sided +t-test are included in the summary output from `mkin`. + +As it is not reasonable to test for significant difference of the transformed +parameters (*e.g.* $log(k)$) from zero, the t-test is calculated based on the +model definition before parameter transformation, *i.e.* in a similar way as in +packages that do not apply such an internal parameter transformation. A note +is included in the `mkin` output, pointing to the fact that the t-test is based +on the unjustified assumption of normal distribution of the parameter +estimators. + +# References + + diff --git a/docs/articles/mkin.html b/docs/articles/mkin.html index 4159c427..76611010 100644 --- a/docs/articles/mkin.html +++ b/docs/articles/mkin.html @@ -1,146 +1,322 @@ -Introduction to mkin • mkin -
-
- - -
-
- - - - -
-

Wissenschaftlicher Berater, Kronacher Str. 8, 79639 Grenzach-Wyhlen, Germany
Privatdozent at the University of Bremen

+ + + + + + + + + + + + + + +Introduction to mkin + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + + + + + + + + + + + + + + + +
+
+
+
+
+ +
+ + + + + + + +

Wissenschaftlicher Berater, Kronacher Str. 8, 79639 Grenzach-Wyhlen, Germany
Privatdozent at the University of Bremen

-

Abstract

+

Abstract

In the regulatory evaluation of chemical substances like plant protection products (pesticides), biocides and other chemicals, degradation data play an important role. For the evaluation of pesticide degradation experiments, detailed guidance has been developed, based on nonlinear optimisation. The R add-on package mkin (Ranke 2016) implements fitting some of the models recommended in this guidance from within R and calculates some statistical measures for data series within one or more compartments, for parent and metabolites.

-
library(mkin)
-# Define the kinetic model
-m_SFO_SFO_SFO <- mkinmod(parent = mkinsub("SFO", "M1"),
-                         M1 = mkinsub("SFO", "M2"),
-                         M2 = mkinsub("SFO"),
-                         use_of_ff = "max", quiet = TRUE)
-
-
-# Produce model predictions using some arbitrary parameters
-sampling_times = c(0, 1, 3, 7, 14, 28, 60, 90, 120)
-d_SFO_SFO_SFO <- mkinpredict(m_SFO_SFO_SFO,
-  c(k_parent = 0.03,
-    f_parent_to_M1 = 0.5, k_M1 = log(2)/100,
-    f_M1_to_M2 = 0.9, k_M2 = log(2)/50),
-  c(parent = 100, M1 = 0, M2 = 0),
+
library(mkin)
+# Define the kinetic model
+m_SFO_SFO_SFO <- mkinmod(parent = mkinsub("SFO", "M1"),
+                         M1 = mkinsub("SFO", "M2"),
+                         M2 = mkinsub("SFO"),
+                         use_of_ff = "max", quiet = TRUE)
+
+
+# Produce model predictions using some arbitrary parameters
+sampling_times = c(0, 1, 3, 7, 14, 28, 60, 90, 120)
+d_SFO_SFO_SFO <- mkinpredict(m_SFO_SFO_SFO,
+  c(k_parent = 0.03,
+    f_parent_to_M1 = 0.5, k_M1 = log(2)/100,
+    f_M1_to_M2 = 0.9, k_M2 = log(2)/50),
+  c(parent = 100, M1 = 0, M2 = 0),
   sampling_times)
 
-# Generate a dataset by adding normally distributed errors with
-# standard deviation 3, for two replicates at each sampling time
-d_SFO_SFO_SFO_err <- add_err(d_SFO_SFO_SFO, reps = 2,
-                             sdfunc = function(x) 3,
-                             n = 1, seed = 123456789 )
+# Generate a dataset by adding normally distributed errors with
+# standard deviation 3, for two replicates at each sampling time
+d_SFO_SFO_SFO_err <- add_err(d_SFO_SFO_SFO, reps = 2,
+                             sdfunc = function(x) 3,
+                             n = 1, seed = 123456789 )
 
-# Fit the model to the dataset
-f_SFO_SFO_SFO <- mkinfit(m_SFO_SFO_SFO, d_SFO_SFO_SFO_err[[1]], quiet = TRUE)
+# Fit the model to the dataset
+f_SFO_SFO_SFO <- mkinfit(m_SFO_SFO_SFO, d_SFO_SFO_SFO_err[[1]], quiet = TRUE)
 
-# Plot the results separately for parent and metabolites
-plot_sep(f_SFO_SFO_SFO, lpos = c("topright", "bottomright", "bottomright"))
-

+# Plot the results separately for parent and metabolites +plot_sep(f_SFO_SFO_SFO, lpos = c("topright", "bottomright", "bottomright"))
+

-

Background

+

Background

Many approaches are possible regarding the evaluation of chemical degradation data.

The now deprecated kinfit package (Ranke 2015) in R (R Development Core Team 2016) implements the approach recommended in the kinetics report provided by the FOrum for Co-ordination of pesticide fate models and their USe (FOCUS Work Group on Degradation Kinetics 2006; 2014) for simple data series for one parent compound in one compartment.

The mkin package (Ranke 2016) extends this approach to data series with transformation products, commonly termed metabolites, and to more than one compartment. It is also possible to include back reactions, so equilibrium reactions and equilibrium partitioning can be specified, although this oftentimes leads to an overparameterisation of the model.

-

When the first mkin code was published in 2010, the most commonly used tools for fitting more complex kinetic degradation models to experimental data were KinGUI (Schäfer et al. 2007), a MATLAB based tool with a graphical user interface that was specifically tailored to the task and included some output as proposed by the FOCUS Kinetics Workgroup, and ModelMaker, a general purpose compartment based tool providing infrastructure for fitting dynamic simulation models based on differential equations to data.

+

When the first mkin code was published in 2010, the most commonly used tools for fitting more complex kinetic degradation models to experimental data were KinGUI (Schäfer et al. 2007), a MATLAB based tool with a graphical user interface that was specifically tailored to the task and included some output as proposed by the FOCUS Kinetics Workgroup, and ModelMaker, a general purpose compartment based tool providing infrastructure for fitting dynamic simulation models based on differential equations to data.

The code was first uploaded to the BerliOS platform. When this was taken down, the version control history was imported into the R-Forge site, where the code is still mirrored today (see e.g. the initial commit on 11 May 2010).

At that time, the R package FME (Flexible Modelling Environment) (Soetaert and Petzoldt 2010) was already available, and provided a good basis for developing a package specifically tailored to the task. The remaining challenge was to make it as easy as possible for the users (including the author of this vignette) to specify the system of differential equations and to include the output requested by the FOCUS guidance, such as the relative standard deviation that has to be assumed for the residuals, such that the \(\chi^2\) goodness-of-fit test as defined by the FOCUS kinetics workgroup would pass using an significance level \(\alpha\) of 0.05.

Also, mkin introduced using analytical solutions for parent only kinetics for improved optimization speed. Later, Eigenvalue based solutions were introduced to mkin for the case of linear differential equations (i.e. where the FOMC or DFOP models were not used for the parent compound), greatly improving the optimization speed for these cases.

The possibility to specify back-reactions and a biphasic model (SFORB) for metabolites were present in mkin from the very beginning.

-

Derived software tools

+

Derived software tools

Soon after the publication of mkin, two derived tools were published, namely KinGUII (available from Bayer Crop Science) and CAKE (commissioned to Tessella by Syngenta), which added a graphical user interface (GUI), and added fitting by iteratively reweighted least squares (IRLS) and characterisation of likely parameter distributions by Markov Chain Monte Carlo (MCMC) sampling.

CAKE focuses on a smooth use experience, sacrificing some flexibility in the model definition, originally allowing only two primary metabolites in parallel. The current version 3.2 of CAKE release in March 2016 uses a basic scheme for up to six metabolites in a flexible arrangement, but does not support back-reactions (non-instantaneous equilibria) or biphasic kinetics for metabolites.

KinGUI offers an even more flexible widget for specifying complex kinetic models. Back-reactions (non-instanteneous equilibria) were supported early on, but until 2014, only simple first-order models could be specified for transformation products. Starting with KinGUII version 2.1, biphasic modelling of metabolites was also available in KinGUII.

A further graphical user interface (GUI) that has recently been brought to a decent degree of maturity is the browser based GUI named gmkin. Please see its documentation page and manual for further information.

-

Recent developments

+

Recent developments

Currently (June 2016), the main features available in mkin which are not present in KinGUII or CAKE, are the speed increase by using compiled code when a compiler is present, parallel model fitting on multicore machines using the mmkin function, and the estimation of parameter confidence intervals based on transformed parameters. These are explained in more detail below.

-

Internal parameter transformations

+

Internal parameter transformations

For rate constants, the log transformation is used, as proposed by Bates and Watts (1988, 77, 149). Approximate intervals are constructed for the transformed rate constants (compare Bates and Watts 1988, 135), i.e. for their logarithms. Confidence intervals for the rate constants are then obtained using the appropriate backtransformation using the exponential function.

In the first version of mkin allowing for specifying models using formation fractions, a home-made reparameterisation was used in order to ensure that the sum of formation fractions would not exceed unity.

This method is still used in the current version of KinGUII (v2.1 from April 2014), with a modification that allows for fixing the pathway to sink to zero. CAKE uses penalties in the objective function in order to enforce this constraint.

-

In 2012, an alternative reparameterisation of the formation fractions was proposed together with René Lehmann (Ranke and Lehmann 2012), based on isometric logratio transformation (ILR). The aim was to improve the validity of the linear approximation of the objective function during the parameter estimation procedure as well as in the subsequent calculation of parameter confidence intervals.

+

In 2012, an alternative reparameterisation of the formation fractions was proposed together with René Lehmann (Ranke and Lehmann 2012), based on isometric logratio transformation (ILR). The aim was to improve the validity of the linear approximation of the objective function during the parameter estimation procedure as well as in the subsequent calculation of parameter confidence intervals.

-

Confidence intervals based on transformed parameters

+

Confidence intervals based on transformed parameters

In the first attempt at providing improved parameter confidence intervals introduced to mkin in 2013, confidence intervals obtained from FME on the transformed parameters were simply all backtransformed one by one to yield asymetric confidence intervals for the backtransformed parameters.

However, while there is a 1:1 relation between the rate constants in the model and the transformed parameters fitted in the model, the parameters obtained by the isometric logratio transformation are calculated from the set of formation fractions that quantify the paths to each of the compounds formed from a specific parent compound, and no such 1:1 relation exists.

Therefore, parameter confidence intervals for formation fractions obtained with this method only appear valid for the case of a single transformation product, where only one formation fraction is to be estimated, directly corresponding to one component of the ilr transformed parameter.

The confidence intervals obtained by backtransformation for the cases where a 1:1 relation between transformed and original parameter exist are considered by the author of this vignette to be more accurate than those obtained using a re-estimation of the Hessian matrix after backtransformation, as implemented in the FME package.

-

Parameter t-test based on untransformed parameters

+

Parameter t-test based on untransformed parameters

The standard output of many nonlinear regression software packages includes the results from a test for significant difference from zero for all parameters. Such a test is also recommended to check the validity of rate constants in the FOCUS guidance (FOCUS Work Group on Degradation Kinetics 2014, 96ff).

It has been argued that the precondition for this test, i.e. normal distribution of the estimator for the parameters, is not fulfilled in the case of nonlinear regression (Ranke and Lehmann 2015). However, this test is commonly used by industry, consultants and national authorities in order to decide on the reliability of parameter estimates, based on the FOCUS guidance mentioned above. Therefore, the results of this one-sided t-test are included in the summary output from mkin.

As it is not reasonable to test for significant difference of the transformed parameters (e.g. \(log(k)\)) from zero, the t-test is calculated based on the model definition before parameter transformation, i.e. in a similar way as in packages that do not apply such an internal parameter transformation. A note is included in the mkin output, pointing to the fact that the t-test is based on the unjustified assumption of normal distribution of the parameter estimators.

-

References

+

References

@@ -150,59 +326,61 @@ f_SFO_SFO_SFO <- http://focus.jrc.ec.europa.eu/dk.

-

———. 2014. Generic Guidance for Estimating Persistence and Degradation Kinetics from Environmental Fate Studies on Pesticides in Eu Registration. 1.1 ed. http://focus.jrc.ec.europa.eu/dk.

+

———. 2014. Generic Guidance for Estimating Persistence and Degradation Kinetics from Environmental Fate Studies on Pesticides in Eu Registration. 1.1 ed. http://focus.jrc.ec.europa.eu/dk.

R Development Core Team. 2016. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org.

-

Ranke, J. 2015. ‘Kinfit‘: Routines for Fitting Simple Kinetic Models to Chemical Degradation Data. https://CRAN.R-project.org/package=kinfit.

+

Ranke, J. 2015. ‘Kinfit‘: Routines for Fitting Simple Kinetic Models to Chemical Degradation Data. https://CRAN.R-project.org/package=kinfit.

-

———. 2016. ‘Mkin‘: Kinetic Evaluation of Chemical Degradation Data. https://CRAN.R-project.org/package=mkin.

+

———. 2016. ‘Mkin‘: Kinetic Evaluation of Chemical Degradation Data. https://CRAN.R-project.org/package=mkin.

-

Ranke, J., and R. Lehmann. 2012. “Parameter Reliability in Kinetic Evaluation of Environmental Metabolism Data - Assessment and the Influence of Model Specification.” In SETAC World 20-24 May. Berlin.

+

Ranke, J., and R. Lehmann. 2012. “Parameter Reliability in Kinetic Evaluation of Environmental Metabolism Data - Assessment and the Influence of Model Specification.” In SETAC World 20-24 May. Berlin.

-

———. 2015. “To T-Test or Not to T-Test, That Is the Question.” In XV Symposium on Pesticide Chemistry 2-4 September 2015. Piacenza. http://chem.uft.uni-bremen.de/ranke/posters/piacenza_2015.pdf.

+

———. 2015. “To T-Test or Not to T-Test, That Is the Question.” In XV Symposium on Pesticide Chemistry 2-4 September 2015. Piacenza. http://chem.uft.uni-bremen.de/ranke/posters/piacenza_2015.pdf.

-

Schäfer, D., B. Mikolasch, P. Rainbird, and B. Harvey. 2007. “KinGUI: A New Kinetic Software Tool for Evaluations According to FOCUS Degradation Kinetics.” In Proceedings of the Xiii Symposium Pesticide Chemistry, edited by Del Re A. A. M., Capri E., Fragoulis G., and Trevisan M., 916–23. Piacenza.

+

Schäfer, D., B. Mikolasch, P. Rainbird, and B. Harvey. 2007. “KinGUI: A New Kinetic Software Tool for Evaluations According to FOCUS Degradation Kinetics.” In Proceedings of the Xiii Symposium Pesticide Chemistry, edited by Del Re A. A. M., Capri E., Fragoulis G., and Trevisan M., 916–23. Piacenza.

-

Soetaert, Karline, and Thomas Petzoldt. 2010. “Inverse Modelling, Sensitivity and Monte Carlo Analysis in R Using Package FME.” Journal of Statistical Software 33 (3): 1–28. http://www.jstatsoft.org/v33/i03/.

+

Soetaert, Karline, and Thomas Petzoldt. 2010. “Inverse Modelling, Sensitivity and Monte Carlo Analysis in R Using Package FME.” Journal of Statistical Software 33 (3): 1–28. http://www.jstatsoft.org/v33/i03/.

-
-
- - -
-
-
-

Site built with pkgdown.

-
+ + + + - + + diff --git a/docs/articles/twa.R b/docs/articles/twa.R new file mode 100644 index 00000000..8b137891 --- /dev/null +++ b/docs/articles/twa.R @@ -0,0 +1 @@ + diff --git a/docs/articles/twa.Rmd b/docs/articles/twa.Rmd new file mode 100644 index 00000000..60188223 --- /dev/null +++ b/docs/articles/twa.Rmd @@ -0,0 +1,57 @@ +--- +title: Calculation of time weighted average concentrations with mkin +author: Johannes Ranke +date: "`r Sys.Date()`" +bibliography: references.bib +vignette: > + %\VignetteEngine{knitr::rmarkdown} + %\VignetteIndexEntry{Calculation of time weighted average concentrations with mkin} + %\VignetteEncoding{UTF-8} +--- + +Since version 0.9.45.1 of the 'mkin' package, a function for calculating +time weighted average concentrations for decline kinetics (*i.e.* only +for the compound applied in the experiment) is included. + +Time weighted average concentrations for the DFOP model are calculated using +the formulas given in the FOCUS kinetics guidance [@FOCUSkinetics2014, p. 251]: + +SFO: + +$$c_\textrm{twa} = c_0 \frac{\left( 1 - e^{- k t} \right)}{ k t} $$ + +FOMC: + +$$c_\textrm{twa} = c_0 \frac{\beta}{t (1 - \alpha)} + \left( \left(\frac{t}{\beta} + 1 \right)^{1 - \alpha} - 1 \right) $$ + +DFOP: + +$$c_\textrm{twa} = \frac{c_0}{t} \left( + \frac{g}{k_1} \left( 1 - e^{- k_1 t} \right) + + \frac{1-g}{k_2} \left( 1 - e^{- k_2 t} \right) \right) $$ + +Often, the ratio between the time weighted average concentration $c_\textrm{twa}$ +and the initial concentration $c_0$ + +$$f_\textrm{twa} = \frac{c_\textrm{twa}}{c_0}$$ + +is needed. This can be calculated from the fitted initial concentration $c_0$ and +the time weighted average concentration $c_\textrm{twa}$, or directly from +the model parameters using the following formulas: + +SFO: + +$$f_\textrm{twa} = \frac{\left( 1 - e^{- k t} \right)}{k t} $$ + +FOMC: + +$$f_\textrm{twa} = \frac{\beta}{t (1 - \alpha)} + \left( \left(\frac{t}{\beta} + 1 \right)^{1 - \alpha} - 1 \right) $$ + +DFOP: + +$$f_\textrm{twa} = \frac{1}{t} \left( + \frac{g}{k_1} \left( 1 - e^{- k_1 t} \right) + + \frac{1-g}{k_2} \left( 1 - e^{- k_2 t} \right) \right) $$ + diff --git a/docs/reference/index.html b/docs/reference/index.html index e995a75b..2d58eeeb 100644 --- a/docs/reference/index.html +++ b/docs/reference/index.html @@ -282,15 +282,9 @@ -

mkin_wide_to_long

- -

Convert a dataframe with observations over time into long format

- - - -

mkin_long_to_wide

+

geometric_mean

-

Convert a dataframe from long to wide format

+

Calculate the geometric mean

@@ -300,9 +294,9 @@ -

print

+

max_twa_parent

-

Print mkinmod objects

+

Function to calculate maximum time weighted average concentrations from kinetic models fitted with mkinfit

@@ -312,21 +306,33 @@ -

transform_odeparms backtransform_odeparms

+

mkin_wide_to_long

-

Functions to transform and backtransform kinetic parameters for fitting

+

Convert a dataframe with observations over time into long format

-

ilr invilr

+

mkin_long_to_wide

-

Function to perform isometric log-ratio transformation

+

Convert a dataframe from long to wide format

-

geometric_mean

+

print

-

Calculate the geometric mean

+

Print mkinmod objects

+ + + +

transform_odeparms backtransform_odeparms

+ +

Functions to transform and backtransform kinetic parameters for fitting

+ + + +

ilr invilr

+ +

Function to perform isometric log-ratio transformation

diff --git a/docs/reference/max_twa_parent.html b/docs/reference/max_twa_parent.html new file mode 100644 index 00000000..aeaf65ce --- /dev/null +++ b/docs/reference/max_twa_parent.html @@ -0,0 +1,179 @@ + + + + + + + + +Function to calculate maximum time weighted average concentrations from kinetic models fitted with mkinfit — max_twa_parent • mkin + + + + + + + + + + + + + + + + + + + + + + + + + + +
+
+ + + +
+ +
+
+ + + +

This function calculates maximum moving window time weighted average concentrations +(TWAs) for kinetic models fitted with mkinfit. Currently, only +calculations for the parent are implemented for the SFO, FOMC and DFOP models, +using the analytical formulas given in the PEC soil section of the FOCUS +guidance.

+ + +
max_twa_parent(fit, windows)
+ +

Arguments

+ + + + + + + + + + +
fit

An object of class mkinfit.

windows

The width of the time windows for which the TWAs should be calculated.

+ +

Value

+ +

A numeric vector, named using the windows argument.

+ +

References

+ +

FOCUS (2006) “Guidance Document on Estimating Persistence and + Degradation Kinetics from Environmental Fate Studies on Pesticides in EU + Registration” Report of the FOCUS Work Group on Degradation Kinetics, + EC Document Reference Sanco/10058/2005 version 2.0, 434 pp, + http://esdac.jrc.ec.europa.eu/projects/degradation-kinetics

+ + +

Examples

+
fit <- mkinfit("FOMC", FOCUS_2006_C, quiet = TRUE) + max_twa_parent(fit, c(7, 21))
#> 7 21 +#> 34.71343 18.22124
+
+ +
+ + +
+ + + -- cgit v1.2.1