Date: Wed, 28 Sep 2016 08:50:17 +0200 Subject: Add drat target and rebuild for drat --- GNUmakefile | 3 + build.log | 3 +- vignettes/FOCUS_D.html | 337 +++++++++++++++++ vignettes/FOCUS_L.html | 830 +++++++++++++++++++++++++++++++++++++++++ vignettes/FOCUS_Z.pdf | Bin 0 -> 239565 bytes vignettes/compiled_models.html | 332 +++++++++++++++++ vignettes/mkin.html | 355 ++++++++++++++++++ 7 files changed, 1859 insertions(+), 1 deletion(-) create mode 100644 vignettes/FOCUS_D.html create mode 100644 vignettes/FOCUS_L.html create mode 100644 vignettes/FOCUS_Z.pdf create mode 100644 vignettes/compiled_models.html create mode 100644 vignettes/mkin.html diff --git a/GNUmakefile b/GNUmakefile index 062f4ead..70ce6f68 100644 --- a/GNUmakefile +++ b/GNUmakefile @@ -106,6 +106,9 @@ winbuilder: build @echo "Uploading to R-devel on win-builder" curl -T $(TGZ) ftp://anonymous@win-builder.r-project.org/R-devel/ +drat: build + "$(RBIN)/Rscript" -e "drat::insertPackage('$(TGZ)', commit = TRUE)" + submit: @echo "\nHow about make test, make check, make winbuilder" @echo "\nIs the DESCRIPTION file up to date?" diff --git a/build.log b/build.log index a8446671..fcbd6702 100644 --- a/build.log +++ b/build.log @@ -5,6 +5,7 @@ * creating vignettes ... OK * checking for LF line-endings in source and make files * checking for empty or unneeded directories +Removed empty directory ‘mkin/docs/icons’ * looking to see if a ‘data/datalist’ file should be added -* building ‘mkin_0.9.44.tar.gz’ +* building ‘mkin_0.9.44.9000.tar.gz’ diff --git a/vignettes/FOCUS_D.html b/vignettes/FOCUS_D.html new file mode 100644 index 00000000..af9b2e9f --- /dev/null +++ b/vignettes/FOCUS_D.html @@ -0,0 +1,337 @@ + + + + + + + + + + + + + + + +Example evaluation of FOCUS Example Dataset D + + + + + + + + + + + + + + + + + + + + + + + + + + +

+ + + + + + + + + + + + + + + +

This is just a very simple vignette showing how to fit a degradation model for a parent compound with one transformation product using mkin. After loading the library we look a the data. We have observed concentrations in the column named value at the times specified in column time for the two observed variables named parent and m1.

library("mkin")

## Loading required package: minpack.lm

## Loading required package: rootSolve

## Loading required package: inline

## Loading required package: methods

## Loading required package: parallel

print(FOCUS_2006_D)

##      name time  value
+## 1  parent    0  99.46
+## 2  parent    0 102.04
+## 3  parent    1  93.50
+## 4  parent    1  92.50
+## 5  parent    3  63.23
+## 6  parent    3  68.99
+## 7  parent    7  52.32
+## 8  parent    7  55.13
+## 9  parent   14  27.27
+## 10 parent   14  26.64
+## 11 parent   21  11.50
+## 12 parent   21  11.64
+## 13 parent   35   2.85
+## 14 parent   35   2.91
+## 15 parent   50   0.69
+## 16 parent   50   0.63
+## 17 parent   75   0.05
+## 18 parent   75   0.06
+## 19 parent  100     NA
+## 20 parent  100     NA
+## 21 parent  120     NA
+## 22 parent  120     NA
+## 23     m1    0   0.00
+## 24     m1    0   0.00
+## 25     m1    1   4.84
+## 26     m1    1   5.64
+## 27     m1    3  12.91
+## 28     m1    3  12.96
+## 29     m1    7  22.97
+## 30     m1    7  24.47
+## 31     m1   14  41.69
+## 32     m1   14  33.21
+## 33     m1   21  44.37
+## 34     m1   21  46.44
+## 35     m1   35  41.22
+## 36     m1   35  37.95
+## 37     m1   50  41.19
+## 38     m1   50  40.01
+## 39     m1   75  40.09
+## 40     m1   75  33.85
+## 41     m1  100  31.04
+## 42     m1  100  33.13
+## 43     m1  120  25.15
+## 44     m1  120  33.31

Next we specify the degradation model: The parent compound degrades with simple first-order kinetics (SFO) to one metabolite named m1, which also degrades with SFO kinetics.

The call to mkinmod returns a degradation model. The differential equations represented in R code can be found in the character vector $diffs of the mkinmod object. If a C compiler (gcc) is installed and functional, the differential equation model will be compiled from auto-generated C code.

SFO_SFO <- mkinmod(parent = mkinsub("SFO", "m1"), m1 = mkinsub("SFO"))

## Successfully compiled differential equation model from auto-generated C code.

print(SFO_SFO$diffs)

##                                                       parent 
+## "d_parent = - k_parent_sink * parent - k_parent_m1 * parent" 
+##                                                           m1 
+##             "d_m1 = + k_parent_m1 * parent - k_m1_sink * m1"

We do the fitting without progress report (quiet = TRUE).

fit <- mkinfit(SFO_SFO, FOCUS_2006_D, quiet = TRUE)

A plot of the fit including a residual plot for both observed variables is obtained using the plot method for mkinfit objects.

plot(fit, show_residuals = TRUE)

Confidence intervals for the parameter estimates are obtained using the mkinparplot function.

mkinparplot(fit)

A comprehensive report of the results is obtained using the summary method for mkinfit objects.

summary(fit)

## mkin version:    0.9.44.9000 
+## R version:       3.3.1 
+## Date of fit:     Wed Sep 28 08:12:47 2016 
+## Date of summary: Wed Sep 28 08:12:48 2016 
+## 
+## Equations:
+## d_parent = - k_parent_sink * parent - k_parent_m1 * parent
+## d_m1 = + k_parent_m1 * parent - k_m1_sink * m1
+## 
+## Model predictions using solution type deSolve 
+## 
+## Fitted with method Port using 153 model solutions performed in 0.641 s
+## 
+## Weighting: none
+## 
+## Starting values for parameters to be optimised:
+##                  value   type
+## parent_0      100.7500  state
+## k_parent_sink   0.1000 deparm
+## k_parent_m1     0.1001 deparm
+## k_m1_sink       0.1002 deparm
+## 
+## Starting values for the transformed parameters actually optimised:
+##                        value lower upper
+## parent_0          100.750000  -Inf   Inf
+## log_k_parent_sink  -2.302585  -Inf   Inf
+## log_k_parent_m1    -2.301586  -Inf   Inf
+## log_k_m1_sink      -2.300587  -Inf   Inf
+## 
+## Fixed parameter values:
+##      value  type
+## m1_0     0 state
+## 
+## Optimised, transformed parameters with symmetric confidence intervals:
+##                   Estimate Std. Error  Lower   Upper
+## parent_0            99.600    1.61400 96.330 102.900
+## log_k_parent_sink   -3.038    0.07826 -3.197  -2.879
+## log_k_parent_m1     -2.980    0.04124 -3.064  -2.897
+## log_k_m1_sink       -5.248    0.13610 -5.523  -4.972
+## 
+## Parameter correlation:
+##                   parent_0 log_k_parent_sink log_k_parent_m1 log_k_m1_sink
+## parent_0           1.00000            0.6075        -0.06625       -0.1701
+## log_k_parent_sink  0.60752            1.0000        -0.08740       -0.6253
+## log_k_parent_m1   -0.06625           -0.0874         1.00000        0.4716
+## log_k_m1_sink     -0.17006           -0.6253         0.47163        1.0000
+## 
+## Residual standard error: 3.211 on 36 degrees of freedom
+## 
+## Backtransformed parameters:
+## Confidence intervals for internally transformed parameters are asymmetric.
+## t-test (unrealistically) based on the assumption of normal distribution
+## for estimators of untransformed parameters.
+##                Estimate t value    Pr(>t)     Lower     Upper
+## parent_0      99.600000  61.720 2.024e-38 96.330000 1.029e+02
+## k_parent_sink  0.047920  12.780 3.050e-15  0.040890 5.616e-02
+## k_parent_m1    0.050780  24.250 3.407e-24  0.046700 5.521e-02
+## k_m1_sink      0.005261   7.349 5.758e-09  0.003992 6.933e-03
+## 
+## Chi2 error levels in percent:
+##          err.min n.optim df
+## All data   6.398       4 15
+## parent     6.827       3  6
+## m1         4.490       1  9
+## 
+## Resulting formation fractions:
+##                 ff
+## parent_sink 0.4855
+## parent_m1   0.5145
+## m1_sink     1.0000
+## 
+## Estimated disappearance times:
+##           DT50   DT90
+## parent   7.023  23.33
+## m1     131.761 437.70
+## 
+## Data:
+##  time variable observed predicted   residual
+##     0   parent    99.46 9.960e+01 -1.385e-01
+##     0   parent   102.04 9.960e+01  2.442e+00
+##     1   parent    93.50 9.024e+01  3.262e+00
+##     1   parent    92.50 9.024e+01  2.262e+00
+##     3   parent    63.23 7.407e+01 -1.084e+01
+##     3   parent    68.99 7.407e+01 -5.083e+00
+##     7   parent    52.32 4.991e+01  2.408e+00
+##     7   parent    55.13 4.991e+01  5.218e+00
+##    14   parent    27.27 2.501e+01  2.257e+00
+##    14   parent    26.64 2.501e+01  1.627e+00
+##    21   parent    11.50 1.253e+01 -1.035e+00
+##    21   parent    11.64 1.253e+01 -8.946e-01
+##    35   parent     2.85 3.148e+00 -2.979e-01
+##    35   parent     2.91 3.148e+00 -2.379e-01
+##    50   parent     0.69 7.162e-01 -2.624e-02
+##    50   parent     0.63 7.162e-01 -8.624e-02
+##    75   parent     0.05 6.074e-02 -1.074e-02
+##    75   parent     0.06 6.074e-02 -7.382e-04
+##   100   parent       NA 5.151e-03         NA
+##   100   parent       NA 5.151e-03         NA
+##   120   parent       NA 7.155e-04         NA
+##   120   parent       NA 7.155e-04         NA
+##     0       m1     0.00 0.000e+00  0.000e+00
+##     0       m1     0.00 0.000e+00  0.000e+00
+##     1       m1     4.84 4.803e+00  3.704e-02
+##     1       m1     5.64 4.803e+00  8.370e-01
+##     3       m1    12.91 1.302e+01 -1.140e-01
+##     3       m1    12.96 1.302e+01 -6.400e-02
+##     7       m1    22.97 2.504e+01 -2.075e+00
+##     7       m1    24.47 2.504e+01 -5.748e-01
+##    14       m1    41.69 3.669e+01  5.000e+00
+##    14       m1    33.21 3.669e+01 -3.480e+00
+##    21       m1    44.37 4.165e+01  2.717e+00
+##    21       m1    46.44 4.165e+01  4.787e+00
+##    35       m1    41.22 4.331e+01 -2.093e+00
+##    35       m1    37.95 4.331e+01 -5.363e+00
+##    50       m1    41.19 4.122e+01 -2.831e-02
+##    50       m1    40.01 4.122e+01 -1.208e+00
+##    75       m1    40.09 3.645e+01  3.643e+00
+##    75       m1    33.85 3.645e+01 -2.597e+00
+##   100       m1    31.04 3.198e+01 -9.416e-01
+##   100       m1    33.13 3.198e+01  1.148e+00
+##   120       m1    25.15 2.879e+01 -3.640e+00
+##   120       m1    33.31 2.879e+01  4.520e+00

+ + + + +

+ + + + + + diff --git a/vignettes/FOCUS_L.html b/vignettes/FOCUS_L.html new file mode 100644 index 00000000..16fa2ac0 --- /dev/null +++ b/vignettes/FOCUS_L.html @@ -0,0 +1,830 @@ + + + + + + + + + + + + + + + +Example evaluation of FOCUS Laboratory Data L1 to L3 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+ + + + + + + + + + + + + + + +

+ +

+ + + + + + + +

Laboratory Data L1

The following code defines example dataset L1 from the FOCUS kinetics report, p. 284:

library("mkin", quietly = TRUE)
+FOCUS_2006_L1 = data.frame(
+  t = rep(c(0, 1, 2, 3, 5, 7, 14, 21, 30), each = 2),
+  parent = c(88.3, 91.4, 85.6, 84.5, 78.9, 77.6, 
+             72.0, 71.9, 50.3, 59.4, 47.0, 45.1,
+             27.7, 27.3, 10.0, 10.4, 2.9, 4.0))
+FOCUS_2006_L1_mkin <- mkin_wide_to_long(FOCUS_2006_L1)

Here we use the assumptions of simple first order (SFO), the case of declining rate constant over time (FOMC) and the case of two different phases of the kinetics (DFOP). For a more detailed discussion of the models, please see the FOCUS kinetics report.

Since mkin version 0.9-32 (July 2014), we can use shorthand notation like "SFO" for parent only degradation models. The following two lines fit the model and produce the summary report of the model fit. This covers the numerical analysis given in the FOCUS report.

m.L1.SFO <- mkinfit("SFO", FOCUS_2006_L1_mkin, quiet = TRUE)
+summary(m.L1.SFO)

## mkin version:    0.9.44.9000 
+## R version:       3.3.1 
+## Date of fit:     Wed Sep 28 08:12:49 2016 
+## Date of summary: Wed Sep 28 08:12:49 2016 
+## 
+## Equations:
+## d_parent = - k_parent_sink * parent
+## 
+## Model predictions using solution type analytical 
+## 
+## Fitted with method Port using 37 model solutions performed in 0.092 s
+## 
+## Weighting: none
+## 
+## Starting values for parameters to be optimised:
+##               value   type
+## parent_0      89.85  state
+## k_parent_sink  0.10 deparm
+## 
+## Starting values for the transformed parameters actually optimised:
+##                       value lower upper
+## parent_0          89.850000  -Inf   Inf
+## log_k_parent_sink -2.302585  -Inf   Inf
+## 
+## Fixed parameter values:
+## None
+## 
+## Optimised, transformed parameters with symmetric confidence intervals:
+##                   Estimate Std. Error  Lower  Upper
+## parent_0            92.470    1.36800 89.570 95.370
+## log_k_parent_sink   -2.347    0.04057 -2.433 -2.261
+## 
+## Parameter correlation:
+##                   parent_0 log_k_parent_sink
+## parent_0            1.0000            0.6248
+## log_k_parent_sink   0.6248            1.0000
+## 
+## Residual standard error: 2.948 on 16 degrees of freedom
+## 
+## Backtransformed parameters:
+## Confidence intervals for internally transformed parameters are asymmetric.
+## t-test (unrealistically) based on the assumption of normal distribution
+## for estimators of untransformed parameters.
+##               Estimate t value    Pr(>t)    Lower   Upper
+## parent_0      92.47000   67.58 2.170e-21 89.57000 95.3700
+## k_parent_sink  0.09561   24.65 1.867e-14  0.08773  0.1042
+## 
+## Chi2 error levels in percent:
+##          err.min n.optim df
+## All data   3.424       2  7
+## parent     3.424       2  7
+## 
+## Resulting formation fractions:
+##             ff
+## parent_sink  1
+## 
+## Estimated disappearance times:
+##         DT50  DT90
+## parent 7.249 24.08
+## 
+## Data:
+##  time variable observed predicted residual
+##     0   parent     88.3    92.471  -4.1710
+##     0   parent     91.4    92.471  -1.0710
+##     1   parent     85.6    84.039   1.5610
+##     1   parent     84.5    84.039   0.4610
+##     2   parent     78.9    76.376   2.5241
+##     2   parent     77.6    76.376   1.2241
+##     3   parent     72.0    69.412   2.5884
+##     3   parent     71.9    69.412   2.4884
+##     5   parent     50.3    57.330  -7.0301
+##     5   parent     59.4    57.330   2.0699
+##     7   parent     47.0    47.352  -0.3515
+##     7   parent     45.1    47.352  -2.2515
+##    14   parent     27.7    24.247   3.4528
+##    14   parent     27.3    24.247   3.0528
+##    21   parent     10.0    12.416  -2.4163
+##    21   parent     10.4    12.416  -2.0163
+##    30   parent      2.9     5.251  -2.3513
+##    30   parent      4.0     5.251  -1.2513

A plot of the fit is obtained with the plot function for mkinfit objects.

plot(m.L1.SFO, show_errmin = TRUE, main = "FOCUS L1 - SFO")

The residual plot can be easily obtained by

mkinresplot(m.L1.SFO, ylab = "Observed", xlab = "Time")

For comparison, the FOMC model is fitted as well, and the χ² error level is checked.

m.L1.FOMC <- mkinfit("FOMC", FOCUS_2006_L1_mkin, quiet=TRUE)

## Warning in mkinfit("FOMC", FOCUS_2006_L1_mkin, quiet = TRUE): Optimisation by method Port did not converge.
+## Convergence code is 1

plot(m.L1.FOMC, show_errmin = TRUE, main = "FOCUS L1 - FOMC")

summary(m.L1.FOMC, data = FALSE)

## mkin version:    0.9.44.9000 
+## R version:       3.3.1 
+## Date of fit:     Wed Sep 28 08:12:49 2016 
+## Date of summary: Wed Sep 28 08:12:49 2016 
+## 
+## 
+## Warning: Optimisation by method Port did not converge.
+## Convergence code is 1 
+## 
+## 
+## Equations:
+## d_parent = - (alpha/beta) * 1/((time/beta) + 1) * parent
+## 
+## Model predictions using solution type analytical 
+## 
+## Fitted with method Port using 188 model solutions performed in 0.43 s
+## 
+## Weighting: none
+## 
+## Starting values for parameters to be optimised:
+##          value   type
+## parent_0 89.85  state
+## alpha     1.00 deparm
+## beta     10.00 deparm
+## 
+## Starting values for the transformed parameters actually optimised:
+##               value lower upper
+## parent_0  89.850000  -Inf   Inf
+## log_alpha  0.000000  -Inf   Inf
+## log_beta   2.302585  -Inf   Inf
+## 
+## Fixed parameter values:
+## None
+## 
+## Optimised, transformed parameters with symmetric confidence intervals:
+##           Estimate Std. Error  Lower Upper
+## parent_0     92.47      1.422  89.44 95.50
+## log_alpha    15.43     15.080 -16.71 47.58
+## log_beta     17.78     15.090 -14.37 49.93
+## 
+## Parameter correlation:
+##           parent_0 log_alpha log_beta
+## parent_0    1.0000    0.1129   0.1112
+## log_alpha   0.1129    1.0000   1.0000
+## log_beta    0.1112    1.0000   1.0000
+## 
+## Residual standard error: 3.045 on 15 degrees of freedom
+## 
+## Backtransformed parameters:
+## Confidence intervals for internally transformed parameters are asymmetric.
+## t-test (unrealistically) based on the assumption of normal distribution
+## for estimators of untransformed parameters.
+##           Estimate t value    Pr(>t)     Lower     Upper
+## parent_0 9.247e+01  65.150 4.044e-20 8.944e+01 9.550e+01
+## alpha    5.044e+06   1.271 1.115e-01 5.510e-08 4.618e+20
+## beta     5.276e+07   1.259 1.137e-01 5.732e-07 4.857e+21
+## 
+## Chi2 error levels in percent:
+##          err.min n.optim df
+## All data   3.619       3  6
+## parent     3.619       3  6
+## 
+## Estimated disappearance times:
+##        DT50  DT90 DT50back
+## parent 7.25 24.08     7.25

We get a warning that the default optimisation algorithm Port did not converge, which is an indication that the model is overparameterised, i.e. contains too many parameters that are ill-defined as a consequence.

And in fact, due to the higher number of parameters, and the lower number of degrees of freedom of the fit, the χ² error level is actually higher for the FOMC model (3.6%) than for the SFO model (3.4%). Additionally, the parameters log_alpha and log_beta internally fitted in the model have excessive confidence intervals, that span more than 25 orders of magnitude (!) when backtransformed to the scale of alpha and beta. Also, the t-test for significant difference from zero does not indicate such a significant difference, with p-values greater than 0.1, and finally, the parameter correlation of log_alpha and log_beta is 1.000, clearly indicating that the model is overparameterised.

The χ² error levels reported in Appendix 3 and Appendix 7 to the FOCUS kinetics report are rounded to integer percentages and partly deviate by one percentage point from the results calculated by mkin. The reason for this is not known. However, mkin gives the same χ² error levels as the kinfit package and the calculation routines of the kinfit package have been extensively compared to the results obtained by the KinGUI software, as documented in the kinfit package vignette. KinGUI was the first widely used standard package in this field. Also, the calculation of χ² error levels was compared with KinGUII, CAKE and DegKin manager in a project sponsored by the German Umweltbundesamt (Ranke).

Laboratory Data L2

The following code defines example dataset L2 from the FOCUS kinetics report, p. 287:

FOCUS_2006_L2 = data.frame(
+  t = rep(c(0, 1, 3, 7, 14, 28), each = 2),
+  parent = c(96.1, 91.8, 41.4, 38.7,
+             19.3, 22.3, 4.6, 4.6,
+             2.6, 1.2, 0.3, 0.6))
+FOCUS_2006_L2_mkin <- mkin_wide_to_long(FOCUS_2006_L2)

SFO fit for L2

Again, the SFO model is fitted and the result is plotted. The residual plot can be obtained simply by adding the argument show_residuals to the plot command.

m.L2.SFO <- mkinfit("SFO", FOCUS_2006_L2_mkin, quiet=TRUE)
+plot(m.L2.SFO, show_residuals = TRUE, show_errmin = TRUE, 
+     main = "FOCUS L2 - SFO")

The χ² error level of 14% suggests that the model does not fit very well. This is also obvious from the plots of the fit, in which we have included the residual plot.

In the FOCUS kinetics report, it is stated that there is no apparent systematic error observed from the residual plot up to the measured DT90 (approximately at day 5), and there is an underestimation beyond that point.

We may add that it is difficult to judge the random nature of the residuals just from the three samplings at days 0, 1 and 3. Also, it is not clear a priori why a consistent underestimation after the approximate DT90 should be irrelevant. However, this can be rationalised by the fact that the FOCUS fate models generally only implement SFO kinetics.

FOMC fit for L2

For comparison, the FOMC model is fitted as well, and the χ² error level is checked.

m.L2.FOMC <- mkinfit("FOMC", FOCUS_2006_L2_mkin, quiet = TRUE)
+plot(m.L2.FOMC, show_residuals = TRUE,
+     main = "FOCUS L2 - FOMC")

summary(m.L2.FOMC, data = FALSE)

## mkin version:    0.9.44.9000 
+## R version:       3.3.1 
+## Date of fit:     Wed Sep 28 08:12:50 2016 
+## Date of summary: Wed Sep 28 08:12:50 2016 
+## 
+## Equations:
+## d_parent = - (alpha/beta) * 1/((time/beta) + 1) * parent
+## 
+## Model predictions using solution type analytical 
+## 
+## Fitted with method Port using 81 model solutions performed in 0.177 s
+## 
+## Weighting: none
+## 
+## Starting values for parameters to be optimised:
+##          value   type
+## parent_0 93.95  state
+## alpha     1.00 deparm
+## beta     10.00 deparm
+## 
+## Starting values for the transformed parameters actually optimised:
+##               value lower upper
+## parent_0  93.950000  -Inf   Inf
+## log_alpha  0.000000  -Inf   Inf
+## log_beta   2.302585  -Inf   Inf
+## 
+## Fixed parameter values:
+## None
+## 
+## Optimised, transformed parameters with symmetric confidence intervals:
+##           Estimate Std. Error   Lower   Upper
+## parent_0   93.7700     1.8560 89.5700 97.9700
+## log_alpha   0.3180     0.1867 -0.1044  0.7405
+## log_beta    0.2102     0.2943 -0.4555  0.8759
+## 
+## Parameter correlation:
+##           parent_0 log_alpha log_beta
+## parent_0   1.00000  -0.09553  -0.1863
+## log_alpha -0.09553   1.00000   0.9757
+## log_beta  -0.18628   0.97567   1.0000
+## 
+## Residual standard error: 2.628 on 9 degrees of freedom
+## 
+## Backtransformed parameters:
+## Confidence intervals for internally transformed parameters are asymmetric.
+## t-test (unrealistically) based on the assumption of normal distribution
+## for estimators of untransformed parameters.
+##          Estimate t value    Pr(>t)   Lower  Upper
+## parent_0   93.770  50.510 1.173e-12 89.5700 97.970
+## alpha       1.374   5.355 2.296e-04  0.9009  2.097
+## beta        1.234   3.398 3.949e-03  0.6341  2.401
+## 
+## Chi2 error levels in percent:
+##          err.min n.optim df
+## All data   6.205       3  3
+## parent     6.205       3  3
+## 
+## Estimated disappearance times:
+##          DT50  DT90 DT50back
+## parent 0.8092 5.356    1.612

The error level at which the χ² test passes is much lower in this case. Therefore, the FOMC model provides a better description of the data, as less experimental error has to be assumed in order to explain the data.

DFOP fit for L2

Fitting the four parameter DFOP model further reduces the χ² error level.

m.L2.DFOP <- mkinfit("DFOP", FOCUS_2006_L2_mkin, quiet = TRUE)
+plot(m.L2.DFOP, show_residuals = TRUE, show_errmin = TRUE,
+     main = "FOCUS L2 - DFOP")

summary(m.L2.DFOP, data = FALSE)

## mkin version:    0.9.44.9000 
+## R version:       3.3.1 
+## Date of fit:     Wed Sep 28 08:12:51 2016 
+## Date of summary: Wed Sep 28 08:12:51 2016 
+## 
+## Equations:
+## d_parent = - ((k1 * g * exp(-k1 * time) + k2 * (1 - g) * exp(-k2 *
+##            time)) / (g * exp(-k1 * time) + (1 - g) * exp(-k2 *
+##            time))) * parent
+## 
+## Model predictions using solution type analytical 
+## 
+## Fitted with method Port using 336 model solutions performed in 0.79 s
+## 
+## Weighting: none
+## 
+## Starting values for parameters to be optimised:
+##          value   type
+## parent_0 93.95  state
+## k1        0.10 deparm
+## k2        0.01 deparm
+## g         0.50 deparm
+## 
+## Starting values for the transformed parameters actually optimised:
+##              value lower upper
+## parent_0 93.950000  -Inf   Inf
+## log_k1   -2.302585  -Inf   Inf
+## log_k2   -4.605170  -Inf   Inf
+## g_ilr     0.000000  -Inf   Inf
+## 
+## Fixed parameter values:
+## None
+## 
+## Optimised, transformed parameters with symmetric confidence intervals:
+##          Estimate Std. Error Lower Upper
+## parent_0  93.9500         NA    NA    NA
+## log_k1     3.1370         NA    NA    NA
+## log_k2    -1.0880         NA    NA    NA
+## g_ilr     -0.2821         NA    NA    NA
+## 
+## Parameter correlation:

## Warning in print.summary.mkinfit(x): Could not estimate covariance matrix; singular system:

## Could not estimate covariance matrix; singular system:
+## 
+## Residual standard error: 1.732 on 8 degrees of freedom
+## 
+## Backtransformed parameters:
+## Confidence intervals for internally transformed parameters are asymmetric.
+## t-test (unrealistically) based on the assumption of normal distribution
+## for estimators of untransformed parameters.
+##          Estimate t value Pr(>t) Lower Upper
+## parent_0  93.9500      NA     NA    NA    NA
+## k1        23.0400      NA     NA    NA    NA
+## k2         0.3369      NA     NA    NA    NA
+## g          0.4016      NA     NA    NA    NA
+## 
+## Chi2 error levels in percent:
+##          err.min n.optim df
+## All data    2.53       4  2
+## parent      2.53       4  2
+## 
+## Estimated disappearance times:
+##          DT50  DT90 DT50_k1 DT50_k2
+## parent 0.5335 5.311 0.03009   2.058

Here, the DFOP model is clearly the best-fit model for dataset L2 based on the chi^2 error level criterion. However, the failure to calculate the covariance matrix indicates that the parameter estimates correlate excessively. Therefore, the FOMC model may be preferred for this dataset.

Laboratory Data L3

The following code defines example dataset L3 from the FOCUS kinetics report, p. 290.

FOCUS_2006_L3 = data.frame(
+  t = c(0, 3, 7, 14, 30, 60, 91, 120),
+  parent = c(97.8, 60, 51, 43, 35, 22, 15, 12))
+FOCUS_2006_L3_mkin <- mkin_wide_to_long(FOCUS_2006_L3)

Use mmkin to fit multiple models

As of mkin version 0.9-39 (June 2015), we can fit several models to one or more datasets in one call to the function mmkin. The datasets have to be passed in a list, in this case a named list holding only the L3 dataset prepared above.

# Only use one core here, not to offend the CRAN checks
+mm.L3 <- mmkin(c("SFO", "FOMC", "DFOP"), cores = 1,
+               list("FOCUS L3" = FOCUS_2006_L3_mkin), quiet = TRUE)
+plot(mm.L3)

The χ² error level of 21% as well as the plot suggest that the SFO model does not fit very well. The FOMC model performs better, with an error level at which the χ² test passes of 7%. Fitting the four parameter DFOP model further reduces the χ² error level considerably.

Accessing elements of mmkin objects

The objects returned by mmkin are arranged like a matrix, with models as a row index and datasets as a column index.

We can extract the summary and plot for e.g. the DFOP fit, using square brackets for indexing which will result in the use of the summary and plot functions working on mkinfit objects.

summary(mm.L3[["DFOP", 1]])

## mkin version:    0.9.44.9000 
+## R version:       3.3.1 
+## Date of fit:     Wed Sep 28 08:12:51 2016 
+## Date of summary: Wed Sep 28 08:12:51 2016 
+## 
+## Equations:
+## d_parent = - ((k1 * g * exp(-k1 * time) + k2 * (1 - g) * exp(-k2 *
+##            time)) / (g * exp(-k1 * time) + (1 - g) * exp(-k2 *
+##            time))) * parent
+## 
+## Model predictions using solution type analytical 
+## 
+## Fitted with method Port using 137 model solutions performed in 0.32 s
+## 
+## Weighting: none
+## 
+## Starting values for parameters to be optimised:
+##          value   type
+## parent_0 97.80  state
+## k1        0.10 deparm
+## k2        0.01 deparm
+## g         0.50 deparm
+## 
+## Starting values for the transformed parameters actually optimised:
+##              value lower upper
+## parent_0 97.800000  -Inf   Inf
+## log_k1   -2.302585  -Inf   Inf
+## log_k2   -4.605170  -Inf   Inf
+## g_ilr     0.000000  -Inf   Inf
+## 
+## Fixed parameter values:
+## None
+## 
+## Optimised, transformed parameters with symmetric confidence intervals:
+##          Estimate Std. Error   Lower     Upper
+## parent_0  97.7500    1.43800 93.7500 101.70000
+## log_k1    -0.6612    0.13340 -1.0310  -0.29100
+## log_k2    -4.2860    0.05902 -4.4500  -4.12200
+## g_ilr     -0.1229    0.05121 -0.2651   0.01925
+## 
+## Parameter correlation:
+##          parent_0  log_k1   log_k2   g_ilr
+## parent_0  1.00000  0.1640  0.01315  0.4253
+## log_k1    0.16400  1.0000  0.46478 -0.5526
+## log_k2    0.01315  0.4648  1.00000 -0.6631
+## g_ilr     0.42526 -0.5526 -0.66310  1.0000
+## 
+## Residual standard error: 1.439 on 4 degrees of freedom
+## 
+## Backtransformed parameters:
+## Confidence intervals for internally transformed parameters are asymmetric.
+## t-test (unrealistically) based on the assumption of normal distribution
+## for estimators of untransformed parameters.
+##          Estimate t value    Pr(>t)    Lower     Upper
+## parent_0 97.75000  67.970 1.404e-07 93.75000 101.70000
+## k1        0.51620   7.499 8.460e-04  0.35650   0.74750
+## k2        0.01376  16.940 3.557e-05  0.01168   0.01621
+## g         0.45660  25.410 7.121e-06  0.40730   0.50680
+## 
+## Chi2 error levels in percent:
+##          err.min n.optim df
+## All data   2.225       4  4
+## parent     2.225       4  4
+## 
+## Estimated disappearance times:
+##         DT50 DT90 DT50_k1 DT50_k2
+## parent 7.464  123   1.343   50.37
+## 
+## Data:
+##  time variable observed predicted residual
+##     0   parent     97.8     97.75  0.05396
+##     3   parent     60.0     60.45 -0.44933
+##     7   parent     51.0     49.44  1.56338
+##    14   parent     43.0     43.84 -0.83632
+##    30   parent     35.0     35.15 -0.14707
+##    60   parent     22.0     23.26 -1.25919
+##    91   parent     15.0     15.18 -0.18181
+##   120   parent     12.0     10.19  1.81395

plot(mm.L3[["DFOP", 1]], show_errmin = TRUE)

Here, a look to the model plot, the confidence intervals of the parameters and the correlation matrix suggest that the parameter estimates are reliable, and the DFOP model can be used as the best-fit model based on the χ² error level criterion for laboratory data L3.

This is also an example where the standard t-test for the parameter g_ilr is misleading, as it tests for a significant difference from zero. In this case, zero appears to be the correct value for this parameter, and the confidence interval for the backtransformed parameter g is quite narrow.

Laboratory Data L4

The following code defines example dataset L4 from the FOCUS kinetics report, p. 293:

FOCUS_2006_L4 = data.frame(
+  t = c(0, 3, 7, 14, 30, 60, 91, 120),
+  parent = c(96.6, 96.3, 94.3, 88.8, 74.9, 59.9, 53.5, 49.0))
+FOCUS_2006_L4_mkin <- mkin_wide_to_long(FOCUS_2006_L4)

Fits of the SFO and FOMC models, plots and summaries are produced below:

# Only use one core here, not to offend the CRAN checks
+mm.L4 <- mmkin(c("SFO", "FOMC"), cores = 1,
+               list("FOCUS L4" = FOCUS_2006_L4_mkin), 
+               quiet = TRUE)
+plot(mm.L4)

The χ² error level of 3.3% as well as the plot suggest that the SFO model fits very well. The error level at which the χ² test passes is slightly lower for the FOMC model. However, the difference appears negligible.

summary(mm.L4[["SFO", 1]], data = FALSE)

## mkin version:    0.9.44.9000 
+## R version:       3.3.1 
+## Date of fit:     Wed Sep 28 08:12:52 2016 
+## Date of summary: Wed Sep 28 08:12:52 2016 
+## 
+## Equations:
+## d_parent = - k_parent_sink * parent
+## 
+## Model predictions using solution type analytical 
+## 
+## Fitted with method Port using 46 model solutions performed in 0.104 s
+## 
+## Weighting: none
+## 
+## Starting values for parameters to be optimised:
+##               value   type
+## parent_0       96.6  state
+## k_parent_sink   0.1 deparm
+## 
+## Starting values for the transformed parameters actually optimised:
+##                       value lower upper
+## parent_0          96.600000  -Inf   Inf
+## log_k_parent_sink -2.302585  -Inf   Inf
+## 
+## Fixed parameter values:
+## None
+## 
+## Optimised, transformed parameters with symmetric confidence intervals:
+##                   Estimate Std. Error  Lower   Upper
+## parent_0             96.44    1.94900 91.670 101.200
+## log_k_parent_sink    -5.03    0.07999 -5.225  -4.834
+## 
+## Parameter correlation:
+##                   parent_0 log_k_parent_sink
+## parent_0            1.0000            0.5865
+## log_k_parent_sink   0.5865            1.0000
+## 
+## Residual standard error: 3.651 on 6 degrees of freedom
+## 
+## Backtransformed parameters:
+## Confidence intervals for internally transformed parameters are asymmetric.
+## t-test (unrealistically) based on the assumption of normal distribution
+## for estimators of untransformed parameters.
+##                Estimate t value    Pr(>t)     Lower     Upper
+## parent_0      96.440000   49.49 2.283e-09 91.670000 1.012e+02
+## k_parent_sink  0.006541   12.50 8.008e-06  0.005378 7.955e-03
+## 
+## Chi2 error levels in percent:
+##          err.min n.optim df
+## All data   3.287       2  6
+## parent     3.287       2  6
+## 
+## Resulting formation fractions:
+##             ff
+## parent_sink  1
+## 
+## Estimated disappearance times:
+##        DT50 DT90
+## parent  106  352

summary(mm.L4[["FOMC", 1]], data = FALSE)

## mkin version:    0.9.44.9000 
+## R version:       3.3.1 
+## Date of fit:     Wed Sep 28 08:12:52 2016 
+## Date of summary: Wed Sep 28 08:12:52 2016 
+## 
+## Equations:
+## d_parent = - (alpha/beta) * 1/((time/beta) + 1) * parent
+## 
+## Model predictions using solution type analytical 
+## 
+## Fitted with method Port using 66 model solutions performed in 0.148 s
+## 
+## Weighting: none
+## 
+## Starting values for parameters to be optimised:
+##          value   type
+## parent_0  96.6  state
+## alpha      1.0 deparm
+## beta      10.0 deparm
+## 
+## Starting values for the transformed parameters actually optimised:
+##               value lower upper
+## parent_0  96.600000  -Inf   Inf
+## log_alpha  0.000000  -Inf   Inf
+## log_beta   2.302585  -Inf   Inf
+## 
+## Fixed parameter values:
+## None
+## 
+## Optimised, transformed parameters with symmetric confidence intervals:
+##           Estimate Std. Error  Lower    Upper
+## parent_0   99.1400     1.6800 94.820 103.5000
+## log_alpha  -0.3506     0.3725 -1.308   0.6068
+## log_beta    4.1740     0.5635  2.726   5.6230
+## 
+## Parameter correlation:
+##           parent_0 log_alpha log_beta
+## parent_0    1.0000   -0.5365  -0.6083
+## log_alpha  -0.5365    1.0000   0.9913
+## log_beta   -0.6083    0.9913   1.0000
+## 
+## Residual standard error: 2.315 on 5 degrees of freedom
+## 
+## Backtransformed parameters:
+## Confidence intervals for internally transformed parameters are asymmetric.
+## t-test (unrealistically) based on the assumption of normal distribution
+## for estimators of untransformed parameters.
+##          Estimate t value    Pr(>t)   Lower   Upper
+## parent_0  99.1400  59.020 1.322e-08 94.8200 103.500
+## alpha      0.7042   2.685 2.178e-02  0.2703   1.835
+## beta      64.9800   1.775 6.807e-02 15.2600 276.600
+## 
+## Chi2 error levels in percent:
+##          err.min n.optim df
+## All data   2.029       3  5
+## parent     2.029       3  5
+## 
+## Estimated disappearance times:
+##         DT50 DT90 DT50back
+## parent 108.9 1644    494.9

References

Ranke, Johannes. Prüfung und Validierung von Modellierungssoftware als Alternative zu ModelMaker 4.0. Umweltbundesamt Projektnummer 27452.

+ + + +

+ +

+ + + + + + diff --git a/vignettes/FOCUS_Z.pdf b/vignettes/FOCUS_Z.pdf new file mode 100644 index 00000000..bc37c873 Binary files /dev/null and b/vignettes/FOCUS_Z.pdf differ diff --git a/vignettes/compiled_models.html b/vignettes/compiled_models.html new file mode 100644 index 00000000..7a9537c7 --- /dev/null +++ b/vignettes/compiled_models.html @@ -0,0 +1,332 @@ + + + + + + + + + + + + + + + +Performance benefit by using compiled model definitions in mkin + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+ + + + + + + + + + + + + + + +

+ +

+ + + + + + + +

Benchmark for a model that can also be solved with Eigenvalues

This evaluation is taken from the example section of mkinfit. When using an mkin version equal to or greater than 0.9-36 and a C compiler (gcc) is available, you will see a message that the model is being compiled from autogenerated C code when defining a model using mkinmod. The mkinmod() function checks for presence of the gcc compiler using

Sys.which("gcc")

##            gcc 
+## "/usr/bin/gcc"

First, we build a simple degradation model for a parent compound with one metabolite.

library("mkin")

## Loading required package: minpack.lm

## Loading required package: rootSolve

## Loading required package: inline

## Loading required package: methods

## Loading required package: parallel

SFO_SFO <- mkinmod(
+  parent = mkinsub("SFO", "m1"),
+  m1 = mkinsub("SFO"))

## Successfully compiled differential equation model from auto-generated C code.

We can compare the performance of the Eigenvalue based solution against the compiled version and the R implementation of the differential equations using the microbenchmark package.

library("microbenchmark")
+library("ggplot2")
+mb.1 <- microbenchmark(
+  "deSolve, not compiled" = mkinfit(SFO_SFO, FOCUS_2006_D, 
+                                    solution_type = "deSolve", 
+                                    use_compiled = FALSE, quiet = TRUE),
+  "Eigenvalue based" = mkinfit(SFO_SFO, FOCUS_2006_D, 
+                               solution_type = "eigen", quiet = TRUE),
+  "deSolve, compiled" = mkinfit(SFO_SFO, FOCUS_2006_D, 
+                                solution_type = "deSolve", quiet = TRUE),
+  times = 3, control = list(warmup = 0))

## Warning in microbenchmark(`deSolve, not compiled` = mkinfit(SFO_SFO,
+## FOCUS_2006_D, : Could not measure overhead. Your clock might lack
+## precision.

smb.1 <- summary(mb.1)
+print(mb.1)

## Unit: milliseconds
+##                   expr       min        lq      mean    median        uq
+##  deSolve, not compiled 5191.7067 5234.0766 5254.1638 5276.4465 5285.3923
+##       Eigenvalue based  837.7118  839.5009  848.7273  841.2900  854.2351
+##      deSolve, compiled  725.9412  738.4609  776.5436  750.9807  801.8448
+##        max neval cld
+##  5294.3381     3   b
+##   867.1802     3  a 
+##   852.7088     3  a

autoplot(mb.1)

We see that using the compiled model is by a factor of 7 faster than using the R version with the default ode solver, and it is even faster than the Eigenvalue based solution implemented in R which does not need iterative solution of the ODEs:

rownames(smb.1) <- smb.1$expr
+smb.1["median"]/smb.1["deSolve, compiled", "median"]

##                         median
+## deSolve, not compiled 7.026075
+## Eigenvalue based      1.120255
+## deSolve, compiled     1.000000

Benchmark for a model that can not be solved with Eigenvalues

This evaluation is also taken from the example section of mkinfit.

FOMC_SFO <- mkinmod(
+  parent = mkinsub("FOMC", "m1"),
+  m1 = mkinsub( "SFO"))

## Successfully compiled differential equation model from auto-generated C code.

mb.2 <- microbenchmark(
+  "deSolve, not compiled" = mkinfit(FOMC_SFO, FOCUS_2006_D, 
+                                    use_compiled = FALSE, quiet = TRUE),
+  "deSolve, compiled" = mkinfit(FOMC_SFO, FOCUS_2006_D, quiet = TRUE),
+  times = 3, control = list(warmup = 0))

## Warning in microbenchmark(`deSolve, not compiled` = mkinfit(FOMC_SFO,
+## FOCUS_2006_D, : Could not measure overhead. Your clock might lack
+## precision.

smb.2 <- summary(mb.2)
+print(mb.2)

## Unit: seconds
+##                   expr       min        lq      mean   median        uq
+##  deSolve, not compiled 10.977718 11.003379 11.044286 11.02904 11.077570
+##      deSolve, compiled  1.289028  1.296324  1.332644  1.30362  1.354451
+##        max neval cld
+##  11.126100     3   b
+##   1.405282     3  a

smb.2["median"]/smb.2["deSolve, compiled", "median"]

##   median
+## 1     NA
+## 2     NA

autoplot(mb.2)

Here we get a performance benefit of a factor of 8.5 using the version of the differential equation model compiled from C code!

This vignette was built with mkin 0.9.44.9000 on

## R version 3.3.1 (2016-06-21)
+## Platform: x86_64-pc-linux-gnu (64-bit)
+## Running under: Debian GNU/Linux 8 (jessie)

## CPU model: Intel(R) Core(TM) i7-4710MQ CPU @ 2.50GHz

+ + + +

+ +

+ + + + + + diff --git a/vignettes/mkin.html b/vignettes/mkin.html new file mode 100644 index 00000000..857acb30 --- /dev/null +++ b/vignettes/mkin.html @@ -0,0 +1,355 @@ + + + + + + + + + + + + + + + +mkin - Kinetic evaluation of chemical degradation data + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

+ + + + + + + + + + + + + + + + + +

+ +

+ + + + + + + +

Wissenschaftlicher Berater, Kronacher Str. 8, 79639 Grenzach-Wyhlen, Germany
Privatdozent at the University of Bremen

Abstract

In the regulatory evaluation of chemical substances like plant protection products (pesticides), biocides and other chemicals, degradation data play an important role. For the evaluation of pesticide degradation experiments, detailed guidance has been developed, based on nonlinear optimisation. The R add-on package mkin (Ranke 2016) implements fitting some of the models recommended in this guidance from within R and calculates some statistical measures for data series within one or more compartments, for parent and metabolites.

require(mkin)

## Loading required package: mkin

## Loading required package: minpack.lm

## Loading required package: rootSolve

## Loading required package: inline

## Loading required package: methods

## Loading required package: parallel

m_SFO_SFO_SFO <- mkinmod(parent = mkinsub("SFO", "M1"),
+                         M1 = mkinsub("SFO", "M2"),
+                         M2 = mkinsub("SFO"), 
+                         use_of_ff = "max", quiet = TRUE)
+
+sampling_times = c(0, 1, 3, 7, 14, 28, 60, 90, 120)
+
+d_SFO_SFO_SFO <- mkinpredict(m_SFO_SFO_SFO,
+  c(k_parent = 0.03, 
+    f_parent_to_M1 = 0.5, k_M1 = log(2)/100, 
+    f_M1_to_M2 = 0.9, k_M2 = log(2)/50),
+  c(parent = 100, M1 = 0, M2 = 0),
+  sampling_times)
+
+d_SFO_SFO_SFO_err <- add_err(d_SFO_SFO_SFO, function(x) 3, n = 1, seed = 123456789 )
+
+f_SFO_SFO_SFO <- mkinfit(m_SFO_SFO_SFO, d_SFO_SFO_SFO_err[[1]], quiet = TRUE)
+
+plot_sep(f_SFO_SFO_SFO, lpos = c("topright", "bottomright", "bottomright"))

Background

Many approaches are possible regarding the evaluation of chemical degradation data.

The now deprecated kinfit package (Ranke 2015) in R (R Development Core Team 2016) implements the approach recommended in the kinetics report provided by the FOrum for Co-ordination of pesticide fate models and their USe (FOCUS Work Group on Degradation Kinetics 2006; 2014) for simple data series for one parent compound in one compartment.

The mkin package (Ranke 2016) extends this approach to data series with transformation products, commonly termed metabolites, and to more than one compartment. It is also possible to include back reactions, so equilibrium reactions and equilibrium partitioning can be specified, although this oftentimes leads to an overparameterisation of the model.

When the first mkin code was published in 2010, the most commonly used tools for fitting more complex kinetic degradation models to experimental data were KinGUI (Schäfer et al. 2007), a MATLAB based tool with a graphical user interface that was specifically tailored to the task and included some output as proposed by the FOCUS Kinetics Workgroup, and ModelMaker, a general purpose compartment based tool providing infrastructure for fitting dynamic simulation models based on differential equations to data.

The code was first uploaded to the BerliOS platform. When this was taken down, the version control history was imported into the R-Forge site, where the code is still mirrored today (see e.g. the initial commit on 11 May 2010).

At that time, the R package FME (Flexible Modelling Environment) (Soetaert and Petzoldt 2010) was already available, and provided a good basis for developing a package specifically tailored to the task. The remaining challenge was to make it as easy as possible for the users (including the author of this vignette) to specify the system of differential equations and to include the output requested by the FOCUS guidance, such as the relative standard deviation that has to be assumed for the residuals, such that the $\chi^2$ goodness-of-fit test as defined by the FOCUS kinetics workgroup would pass using an significance level $\alpha$ of 0.05.

Also, mkin introduced using analytical solutions for parent only kinetics for improved optimization speed. Later, Eigenvalue based solutions were introduced to mkin for the case of linear differential equations (i.e. where the FOMC or DFOP models were not used for the parent compound), greatly improving the optimization speed for these cases.

The possibility to specify back-reactions and a biphasic model (SFORB) for metabolites were present in mkin from the very beginning.

Derived software tools

Soon after the publication of mkin, two derived tools were published, namely KinGUII (available from Bayer Crop Science) and CAKE (commissioned to Tessella by Syngenta), which added a graphical user interface (GUI), and added fitting by iteratively reweighted least squares (IRLS) and characterisation of likely parameter distributions by Markov Chain Monte Carlo (MCMC) sampling.

CAKE focuses on a smooth use experience, sacrificing some flexibility in the model definition, originally allowing only two primary metabolites in parallel. The current version 3.2 of CAKE release in March 2016 uses a basic scheme for up to six metabolites in a flexible arrangement, but does not support back-reactions (non-instantaneous equilibria) or biphasic kinetics for metabolites.

KinGUI offers an even more flexible widget for specifying complex kinetic models. Back-reactions (non-instanteneous equilibria) were supported early on, but until 2014, only simple first-order models could be specified for transformation products. Starting with KinGUII version 2.1, biphasic modelling of metabolites was also available in KinGUII.

A further graphical user interface (GUI) that has recently been brought to a decent degree of maturity is the browser based GUI named gmkin. Please see its documentation page and manual for further information.

Recent developments

Currently (June 2016), the main features available in mkin which are not present in KinGUII or CAKE, are the speed increase by using compiled code when a compiler is present, parallel model fitting on multicore machines using the mmkin function, and the estimation of parameter confidence intervals based on transformed parameters. These are explained in more detail below.

Internal parameter transformations

For rate constants, the log transformation is used, as proposed by Bates and Watts (1988, 77, 149). Approximate intervals are constructed for the transformed rate constants (compare Bates and Watts 1988, 135), i.e. for their logarithms. Confidence intervals for the rate constants are then obtained using the appropriate backtransformation using the exponential function.

In the first version of mkin allowing for specifying models using formation fractions, a home-made reparameterisation was used in order to ensure that the sum of formation fractions would not exceed unity.

This method is still used in the current version of KinGUII (v2.1 from April 2014), with a modification that allows for fixing the pathway to sink to zero. CAKE uses penalties in the objective function in order to enforce this constraint.

In 2012, an alternative reparameterisation of the formation fractions was proposed together with René Lehmann (Ranke and Lehmann 2012), based on isometric logratio transformation (ILR). The aim was to improve the validity of the linear approximation of the objective function during the parameter estimation procedure as well as in the subsequent calculation of parameter confidence intervals.

Confidence intervals based on transformed parameters

In the first attempt at providing improved parameter confidence intervals introduced to mkin in 2013, confidence intervals obtained from FME on the transformed parameters were simply all backtransformed one by one to yield asymetric confidence intervals for the backtransformed parameters.

However, while there is a 1:1 relation between the rate constants in the model and the transformed parameters fitted in the model, the parameters obtained by the isometric logratio transformation are calculated from the set of formation fractions that quantify the paths to each of the compounds formed from a specific parent compound, and no such 1:1 relation exists.

Therefore, parameter confidence intervals for formation fractions obtained with this method only appear valid for the case of a single transformation product, where only one formation fraction is to be estimated, directly corresponding to one component of the ilr transformed parameter.

The confidence intervals obtained by backtransformation for the cases where a 1:1 relation between transformed and original parameter exist are considered by the author of this vignette to be more accurate than those obtained using a re-estimation of the Hessian matrix after backtransformation, as implemented in the FME package.

Parameter t-test based on untransformed parameters

The standard output of many nonlinear regression software packages includes the results from a test for significant difference from zero for all parameters. Such a test is also recommended to check the validity of rate constants in the FOCUS guidance (FOCUS Work Group on Degradation Kinetics 2014, 96ff).

It has been argued that the precondition for this test, i.e. normal distribution of the estimator for the parameters, is not fulfilled in the case of nonlinear regression (Ranke and Lehmann 2015). However, this test is commonly used by industry, consultants and national authorities in order to decide on the reliability of parameter estimates, based on the FOCUS guidance mentioned above. Therefore, the results of this one-sided t-test are included in the summary output from mkin.

As it is not reasonable to test for significant difference of the transformed parameters (e.g. $log(k)$) from zero, the t-test is calculated based on the model definition before parameter transformation, i.e. in a similar way as in packages that do not apply such an internal parameter transformation. A note is included in the mkin output, pointing to the fact that the t-test is based on the unjustified assumption of normal distribution of the parameter estimators.

References

+ +

Bates, D., and D. Watts. 1988. Nonlinear Regression and Its Applications. Wiley-Interscience.

FOCUS Work Group on Degradation Kinetics. 2006. Guidance Document on Estimating Persistence and Degradation Kinetics from Environmental Fate Studies on Pesticides in EU Registration. Report of the FOCUS Work Group on Degradation Kinetics. http://focus.jrc.ec.europa.eu/dk.

———. 2014. Generic Guidance for Estimating Persistence and Degradation Kinetics from Environmental Fate Studies on Pesticides in EU Registration. 1.1. http://focus.jrc.ec.europa.eu/dk.

R Development Core Team. 2016. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. http://www.R-project.org.

Ranke, J. 2015. `kinfit`: Routines for Fitting Simple Kinetic Models to Chemical Degradation Data. http://CRAN.R-project.org/package=kinfit.

———. 2016. `mkin`: Kinetic Evaluation of Chemical Degradation Data. http://CRAN.R-project.org/package=mkin.

Ranke, J., and R. Lehmann. 2012. “Parameter Reliability in Kinetic Evaluation of Environmental Metabolism Data - Assessment and the Influence of Model Specification.” In SETAC World 20-24 May. Berlin.

———. 2015. “To T-Test or Not to T-Test, That Is the Question.” In XV Symposium on Pesticide Chemistry 2-4 September 2015. Piacenza. http://chem.uft.uni-bremen.de/ranke/posters/piacenza_2015.pdf.

Schäfer, D., B. Mikolasch, P. Rainbird, and B. Harvey. 2007. “KinGUI: a New Kinetic Software Tool for Evaluations According to FOCUS Degradation Kinetics.” In Proceedings of the XIII Symposium Pesticide Chemistry, edited by Del Re A. A. M., Capri E., Fragoulis G., and Trevisan M., 916–23. Piacenza.

Soetaert, Karline, and Thomas Petzoldt. 2010. “Inverse Modelling, Sensitivity and Monte Carlo Analysis in R Using Package FME.” Journal of Statistical Software 33 (3): 1–28. http://www.jstatsoft.org/v33/i03/.

+ + + +

+ +

+ + + + + + + + -- cgit v1.2.1

Example evaluation of FOCUS Example Dataset D

Johannes Ranke

2016-09-28

Example evaluation of FOCUS Laboratory Data L1 to L3

Johannes Ranke

2016-09-28

Laboratory Data L1

Laboratory Data L2

SFO fit for L2

FOMC fit for L2

DFOP fit for L2

Laboratory Data L3

Use mmkin to fit multiple models

Accessing elements of mmkin objects

Laboratory Data L4

References

Performance benefit by using compiled model definitions in mkin

Johannes Ranke

2016-09-28

Benchmark for a model that can also be solved with Eigenvalues

Benchmark for a model that can not be solved with Eigenvalues

mkin - Kinetic evaluation of chemical degradation data

Johannes Ranke

2016-09-28

Abstract

Background

Derived software tools

Recent developments

Internal parameter transformations

Confidence intervals based on transformed parameters

Parameter t-test based on untransformed parameters

References