From e5d1df9a9b1f0951d7dfbaf24eee4294470b73e2 Mon Sep 17 00:00:00 2001 From: Johannes Ranke Date: Thu, 17 Nov 2022 14:54:20 +0100 Subject: Complete update of online docs for v1.2.0 --- docs/articles/web_only/dimethenamid_2018.html | 158 +++++++++++++------------- 1 file changed, 76 insertions(+), 82 deletions(-) (limited to 'docs/articles/web_only/dimethenamid_2018.html') diff --git a/docs/articles/web_only/dimethenamid_2018.html b/docs/articles/web_only/dimethenamid_2018.html index b020a7b0..8c37edd6 100644 --- a/docs/articles/web_only/dimethenamid_2018.html +++ b/docs/articles/web_only/dimethenamid_2018.html @@ -33,7 +33,7 @@ mkin - 1.1.1 + 1.2.0 @@ -62,11 +62,14 @@ Example evaluations of dimethenamid data from 2018 with nonlinear mixed-effects models
  • - Example evaluation of FOCUS Example Dataset Z + Short demo of the multistart method
  • Performance benefit by using compiled model definitions in mkin
  • +
  • + Example evaluation of FOCUS Example Dataset Z +
  • Calculation of time weighted average concentrations with mkin
  • @@ -74,7 +77,10 @@ Example evaluation of NAFTA SOP Attachment examples
  • - Some benchmark timings + Benchmark timings for mkin +
  • +
  • + Benchmark timings for saem.mmkin
  • @@ -105,7 +111,7 @@

    Example evaluations of the dimethenamid data from 2018

    Johannes Ranke

    -

    Last change 1 July 2022, built on 08 Jul 2022

    +

    Last change 1 July 2022, built on 17 Nov 2022

    Source: vignettes/web_only/dimethenamid_2018.rmd @@ -155,20 +161,20 @@ error_model = "tc", quiet = TRUE)

    The plot of the individual SFO fits shown below suggests that at least in some datasets the degradation slows down towards later time points, and that the scatter of the residuals error is smaller for smaller values (panel to the right):

    -plot(mixed(f_parent_mkin_const["SFO", ]))
    +plot(mixed(f_parent_mkin_const["SFO", ]))

    Using biexponential decline (DFOP) results in a slightly more random scatter of the residuals:

    -plot(mixed(f_parent_mkin_const["DFOP", ]))
    +plot(mixed(f_parent_mkin_const["DFOP", ]))

    The population curve (bold line) in the above plot results from taking the mean of the individual transformed parameters, i.e. of log k1 and log k2, as well as of the logit of the g parameter of the DFOP model). Here, this procedure does not result in parameters that represent the degradation well, because in some datasets the fitted value for k2 is extremely close to zero, leading to a log k2 value that dominates the average. This is alleviated if only rate constants that pass the t-test for significant difference from zero (on the untransformed scale) are considered in the averaging:

    -plot(mixed(f_parent_mkin_const["DFOP", ]), test_log_parms = TRUE)
    +plot(mixed(f_parent_mkin_const["DFOP", ]), test_log_parms = TRUE)

    While this is visually much more satisfactory, such an average procedure could introduce a bias, as not all results from the individual fits enter the population curve with the same weight. This is where nonlinear mixed-effects models can help out by treating all datasets with equally by fitting a parameter distribution model together with the degradation model and the error model (see below).

    The remaining trend of the residuals to be higher for higher predicted residues is reduced by using the two-component error model:

    -plot(mixed(f_parent_mkin_tc["DFOP", ]), test_log_parms = TRUE)
    +plot(mixed(f_parent_mkin_tc["DFOP", ]), test_log_parms = TRUE)

    However, note that in the case of using this error model, the fits to the Flaach and BBA 2.3 datasets appear to be ill-defined, indicated by the fact that they did not converge:

    @@ -178,7 +184,7 @@ Status of individual fits:
     
           dataset
     model  Calke Borstel Flaach BBA 2.2 BBA 2.3 Elliot
    -  DFOP OK    OK      OK     OK      C       OK    
    +  DFOP OK    OK      C      OK      C       OK    
     
     OK: No warnings
     C: Optimisation did not converge:
    @@ -222,7 +228,7 @@ f_parent_nlme_dfop_tc       3 10 671.91 702.34 -325.96 2 vs 3  134.69  <.0001
     

    While the SFO variants converge fast, the additional parameters introduced by this lead to convergence warnings for the DFOP model. The model comparison clearly show that adding correlations between random effects does not improve the fits.

    The selected model (DFOP with two-component error) fitted to the data assuming no correlations between random effects is shown below.

    -plot(f_parent_nlme_dfop_tc)
    +plot(f_parent_nlme_dfop_tc)

    @@ -231,41 +237,32 @@ f_parent_nlme_dfop_tc 3 10 671.91 702.34 -325.96 2 vs 3 134.69 <.0001

    The saemix package provided the first Open Source implementation of the Stochastic Approximation to the Expectation Maximisation (SAEM) algorithm. SAEM fits of degradation models can be conveniently performed using an interface to the saemix package available in current development versions of the mkin package.

    The corresponding SAEM fits of the four combinations of degradation and error models are fitted below. As there is no convergence criterion implemented in the saemix package, the convergence plots need to be manually checked for every fit. We define control settings that work well for all the parent data fits shown in this vignette.

    -library(saemix)
    -
    Loading required package: npde
    -
    Package saemix, version 3.0
    -  please direct bugs, questions and feedback to emmanuelle.comets@inserm.fr
    -
    
    -Attaching package: 'saemix'
    -
    The following objects are masked from 'package:npde':
    -
    -    kurtosis, skewness
    -
    -saemix_control <- saemixControl(nbiter.saemix = c(800, 300), nb.chains = 15,
    +library(saemix)
    +saemix_control <- saemixControl(nbiter.saemix = c(800, 300), nb.chains = 15,
         print = FALSE, save = FALSE, save.graphs = FALSE, displayProgress = FALSE)
     saemix_control_moreiter <- saemixControl(nbiter.saemix = c(1600, 300), nb.chains = 15,
         print = FALSE, save = FALSE, save.graphs = FALSE, displayProgress = FALSE)
     saemix_control_10k <- saemixControl(nbiter.saemix = c(10000, 300), nb.chains = 15,
         print = FALSE, save = FALSE, save.graphs = FALSE, displayProgress = FALSE)

    The convergence plot for the SFO model using constant variance is shown below.

    -
    +
     f_parent_saemix_sfo_const <- mkin::saem(f_parent_mkin_const["SFO", ], quiet = TRUE,
       control = saemix_control, transformations = "saemix")
     plot(f_parent_saemix_sfo_const$so, plot.type = "convergence")

    Obviously the selected number of iterations is sufficient to reach convergence. This can also be said for the SFO fit using the two-component error model.

    -
    +
     f_parent_saemix_sfo_tc <- mkin::saem(f_parent_mkin_tc["SFO", ], quiet = TRUE,
       control = saemix_control, transformations = "saemix")
     plot(f_parent_saemix_sfo_tc$so, plot.type = "convergence")

    When fitting the DFOP model with constant variance (see below), parameter convergence is not as unambiguous.

    -
    +
     f_parent_saemix_dfop_const <- mkin::saem(f_parent_mkin_const["DFOP", ], quiet = TRUE,
       control = saemix_control, transformations = "saemix")
     plot(f_parent_saemix_dfop_const$so, plot.type = "convergence")

    -
    +
     print(f_parent_saemix_dfop_const)
    Kinetic nonlinear mixed-effects model fit by SAEM
     Structural model:
    @@ -286,23 +283,21 @@ DMTA_0    97.99583 96.50079 99.4909
     k1         0.06377  0.03432  0.0932
     k2         0.00848  0.00444  0.0125
     g          0.95701  0.91313  1.0009
    -a.1        1.82141  1.60516  2.0377
    -SD.DMTA_0  1.64787  0.45729  2.8384
    +a.1        1.82141  1.65974  1.9831
    +SD.DMTA_0  1.64787  0.45779  2.8379
     SD.k1      0.57439  0.24731  0.9015
    -SD.k2      0.03296 -2.50524  2.5712
    -SD.g       1.10266  0.32354  1.8818
    +SD.k2 0.03296 -2.50143 2.5673 +SD.g 1.10266 0.32371 1.8816

    While the other parameters converge to credible values, the variance of k2 (omega2.k2) converges to a very small value. The printout of the saem.mmkin model shows that the estimated standard deviation of k2 across the population of soils (SD.k2) is ill-defined, indicating overparameterisation of this model.

    When the DFOP model is fitted with the two-component error model, we also observe that the estimated variance of k2 becomes very small, while being ill-defined, as illustrated by the excessive confidence interval of SD.k2.

    -
    +
     f_parent_saemix_dfop_tc <- mkin::saem(f_parent_mkin_tc["DFOP", ], quiet = TRUE,
       control = saemix_control, transformations = "saemix")
     f_parent_saemix_dfop_tc_moreiter <- mkin::saem(f_parent_mkin_tc["DFOP", ], quiet = TRUE,
    -  control = saemix_control_moreiter, transformations = "saemix")
    -
    Likelihood cannot be computed by Importance Sampling.
    -
    -plot(f_parent_saemix_dfop_tc$so, plot.type = "convergence")
    + control = saemix_control_moreiter, transformations = "saemix") +plot(f_parent_saemix_dfop_tc$so, plot.type = "convergence")

    -
    +
     print(f_parent_saemix_dfop_tc)
    Kinetic nonlinear mixed-effects model fit by SAEM
     Structural model:
    @@ -318,21 +313,21 @@ Likelihood computed by importance sampling
       666 664   -323
     
     Fitted parameters:
    -          estimate     lower    upper
    -DMTA_0    9.82e+01  96.27937 100.1783
    -k1        6.41e-02   0.03333   0.0948
    -k2        8.56e-03   0.00608   0.0110
    -g         9.55e-01   0.91440   0.9947
    -a.1       1.07e+00   0.86542   1.2647
    -b.1       2.96e-02   0.02258   0.0367
    -SD.DMTA_0 2.04e+00   0.40629   3.6678
    -SD.k1     5.98e-01   0.25796   0.9373
    -SD.k2     5.28e-04 -58.93251  58.9336
    -SD.g      1.04e+00   0.36509   1.7083
    + estimate lower upper +DMTA_0 98.27617 96.3088 100.2436 +k1 0.06437 0.0337 0.0950 +k2 0.00880 0.0063 0.0113 +g 0.95249 0.9100 0.9949 +a.1 1.06161 0.8625 1.2607 +b.1 0.02967 0.0226 0.0367 +SD.DMTA_0 2.06075 0.4187 3.7028 +SD.k1 0.59357 0.2561 0.9310 +SD.k2 0.00292 -10.2960 10.3019 +SD.g 1.05725 0.3808 1.7337

    Doubling the number of iterations in the first phase of the algorithm leads to a slightly lower likelihood, and therefore to slightly higher AIC and BIC values. With even more iterations, the algorithm stops with an error message. This is related to the variance of k2 approximating zero and has been submitted as a bug to the saemix package, as the algorithm does not converge in this case.

    An alternative way to fit DFOP in combination with the two-component error model is to use the model formulation with transformed parameters as used per default in mkin. When using this option, convergence is slower, but eventually the algorithm stops as well with the same error message.

    The four combinations (SFO/const, SFO/tc, DFOP/const and DFOP/tc) and the version with increased iterations can be compared using the model comparison function of the saemix package:

    -
    +
     AIC_parent_saemix <- saemix::compare.saemix(
       f_parent_saemix_sfo_const$so,
       f_parent_saemix_sfo_tc$so,
    @@ -340,7 +335,7 @@ SD.g      1.04e+00   0.36509   1.7083
    f_parent_saemix_dfop_tc$so, f_parent_saemix_dfop_tc_moreiter$so)
    Likelihoods calculated by importance sampling
    -
    +
     rownames(AIC_parent_saemix) <- c(
       "SFO const", "SFO tc", "DFOP const", "DFOP tc", "DFOP tc more iterations")
     print(AIC_parent_saemix)
    @@ -348,10 +343,10 @@ SD.g 1.04e+00 0.36509 1.7083
    SFO const 796.38 795.34 SFO tc 798.38 797.13 DFOP const 705.75 703.88 -DFOP tc 665.72 663.63 -DFOP tc more iterations NaN NaN
    +DFOP tc 665.65 663.57 +DFOP tc more iterations 665.88 663.80

    In order to check the influence of the likelihood calculation algorithms implemented in saemix, the likelihood from Gaussian quadrature is added to the best fit, and the AIC values obtained from the three methods are compared.

    -
    +
     f_parent_saemix_dfop_tc$so <-
       saemix::llgq.saemix(f_parent_saemix_dfop_tc$so)
     AIC_parent_saemix_methods <- c(
    @@ -361,11 +356,11 @@ DFOP tc more iterations    NaN    NaN
    ) print(AIC_parent_saemix_methods)
        is     gq    lin 
    -665.72 665.88 665.15 
    +665.65 665.68 665.11

    The AIC values based on importance sampling and Gaussian quadrature are very similar. Using linearisation is known to be less accurate, but still gives a similar value.

    In order to illustrate that the comparison of the three method depends on the degree of convergence obtained in the fit, the same comparison is shown below for the fit using the defaults for the number of iterations and the number of MCMC chains.

    When using OpenBlas for linear algebra, there is a large difference in the values obtained with Gaussian quadrature, so the larger number of iterations makes a lot of difference. When using the LAPACK version coming with Debian Bullseye, the AIC based on Gaussian quadrature is almost the same as the one obtained with the other methods, also when using defaults for the fit.

    -
    +
     f_parent_saemix_dfop_tc_defaults <- mkin::saem(f_parent_mkin_tc["DFOP", ])
     f_parent_saemix_dfop_tc_defaults$so <-
       saemix::llgq.saemix(f_parent_saemix_dfop_tc_defaults$so)
    @@ -376,14 +371,14 @@ DFOP tc more iterations    NaN    NaN
    ) print(AIC_parent_saemix_methods_defaults)
        is     gq    lin 
    -668.91 663.61 667.40 
    +669.77 669.36 670.95

    Comparison

    The following table gives the AIC values obtained with both backend packages using the same control parameters (800 iterations burn-in, 300 iterations second phase, 15 chains).

    -
    +
     AIC_all <- data.frame(
       check.names = FALSE,
       "Degradation model" = c("SFO", "SFO", "DFOP", "DFOP"),
    @@ -408,7 +403,7 @@ DFOP tc more iterations    NaN    NaN
    SFO const 796.60 -794.17 +796.60 796.38 @@ -422,15 +417,15 @@ DFOP tc more iterations NaN NaN
    DFOP const NA -704.95 +671.98 705.75 DFOP tc 671.91 -665.15 -665.72 +665.11 +665.65 @@ -445,15 +440,15 @@ DFOP tc more iterations NaN NaN

    Session Info

    -
    +
    -
    R version 4.2.1 (2022-06-23)
    +
    R version 4.2.2 (2022-10-31)
     Platform: x86_64-pc-linux-gnu (64-bit)
     Running under: Debian GNU/Linux 11 (bullseye)
     
     Matrix products: default
    -BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
    -LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
    +BLAS:   /usr/lib/x86_64-linux-gnu/openblas-serial/libblas.so.3
    +LAPACK: /usr/lib/x86_64-linux-gnu/openblas-serial/libopenblas-r0.3.13.so
     
     locale:
      [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C              
    @@ -464,28 +459,27 @@ locale:
     [11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C       
     
     attached base packages:
    -[1] parallel  stats     graphics  grDevices utils     datasets  methods  
    -[8] base     
    +[1] stats     graphics  grDevices utils     datasets  methods   base     
     
     other attached packages:
    -[1] nlme_3.1-158 mkin_1.1.1   knitr_1.39  
    +[1] nlme_3.1-160 mkin_1.2.0   knitr_1.40  
     
     loaded via a namespace (and not attached):
    - [1] deSolve_1.32      zoo_1.8-10        tidyselect_1.1.2  xfun_0.31        
    - [5] bslib_0.3.1       purrr_0.3.4       lattice_0.20-45   colorspace_2.0-3 
    - [9] vctrs_0.4.1       generics_0.1.3    htmltools_0.5.2   yaml_2.3.5       
    -[13] utf8_1.2.2        rlang_1.0.3       pkgdown_2.0.5     saemix_3.0       
    -[17] jquerylib_0.1.4   pillar_1.7.0      glue_1.6.2        lifecycle_1.0.1  
    -[21] stringr_1.4.0     munsell_0.5.0     gtable_0.3.0      ragg_1.2.2       
    -[25] memoise_2.0.1     evaluate_0.15     npde_3.2          fastmap_1.1.0    
    -[29] lmtest_0.9-40     fansi_1.0.3       highr_0.9         scales_1.2.0     
    -[33] cachem_1.0.6      desc_1.4.1        jsonlite_1.8.0    systemfonts_1.0.4
    -[37] fs_1.5.2          textshaping_0.3.6 gridExtra_2.3     ggplot2_3.3.6    
    -[41] digest_0.6.29     stringi_1.7.6     dplyr_1.0.9       grid_4.2.1       
    -[45] rprojroot_2.0.3   cli_3.3.0         tools_4.2.1       magrittr_2.0.3   
    -[49] sass_0.4.1        tibble_3.1.7      crayon_1.5.1      pkgconfig_2.0.3  
    -[53] ellipsis_0.3.2    rmarkdown_2.14    R6_2.5.1          mclust_5.4.10    
    -[57] compiler_4.2.1   
    + [1] deSolve_1.34 zoo_1.8-11 tidyselect_1.2.0 xfun_0.33 + [5] bslib_0.4.0 purrr_0.3.5 lattice_0.20-45 colorspace_2.0-3 + [9] vctrs_0.5.0 generics_0.1.3 htmltools_0.5.3 yaml_2.3.6 +[13] utf8_1.2.2 rlang_1.0.6 pkgdown_2.0.6 saemix_3.2 +[17] jquerylib_0.1.4 pillar_1.8.1 glue_1.6.2 DBI_1.1.3 +[21] lifecycle_1.0.3 stringr_1.4.1 munsell_0.5.0 gtable_0.3.1 +[25] ragg_1.2.2 memoise_2.0.1 evaluate_0.18 npde_3.2 +[29] fastmap_1.1.0 lmtest_0.9-40 parallel_4.2.2 fansi_1.0.3 +[33] highr_0.9 scales_1.2.1 cachem_1.0.6 desc_1.4.2 +[37] jsonlite_1.8.3 systemfonts_1.0.4 fs_1.5.2 textshaping_0.3.6 +[41] gridExtra_2.3 ggplot2_3.4.0 digest_0.6.30 stringi_1.7.8 +[45] dplyr_1.0.10 grid_4.2.2 rprojroot_2.0.3 cli_3.4.1 +[49] tools_4.2.2 magrittr_2.0.3 sass_0.4.2 tibble_3.1.8 +[53] pkgconfig_2.0.3 assertthat_0.2.1 rmarkdown_2.16 R6_2.5.1 +[57] mclust_6.0.0 compiler_4.2.2

    References @@ -522,7 +516,7 @@ loaded via a namespace (and not attached):

    -

    Site built with pkgdown 2.0.5.

    +

    Site built with pkgdown 2.0.6.

    -- cgit v1.2.1