From 08e600c0eea6153b433659c08ea49aead5ffd932 Mon Sep 17 00:00:00 2001
From: Johannes Ranke <jranke@uni-bremen.de>
Date: Wed, 26 Oct 2022 10:36:19 +0200
Subject: Update multistart example code and vignette

---
 vignettes/web_only/multistart.rmd | 47 +++++++++++++++++++++++++++++----------
 1 file changed, 35 insertions(+), 12 deletions(-)

(limited to 'vignettes/web_only/multistart.rmd')

diff --git a/vignettes/web_only/multistart.rmd b/vignettes/web_only/multistart.rmd
index 8f349973..94f4ef98 100644
--- a/vignettes/web_only/multistart.rmd
+++ b/vignettes/web_only/multistart.rmd
@@ -1,7 +1,7 @@
 ---
 title: Short demo of the multistart method
 author: Johannes Ranke
-date: Last change 19 September 2022 (rebuilt `r Sys.Date()`)
+date: Last change 26 September 2022 (rebuilt `r Sys.Date()`)
 output:
   html_document
 vignette: >
@@ -10,9 +10,7 @@ vignette: >
   %\VignetteEncoding{UTF-8}
 ---
 
-This is a vignette, because the multistart method does not seem to work in
-pkgdown example code and I wanted to show the plots in the online docs.
-The dimethenamid data from 2018 from seven soils is used as example data.
+The dimethenamid data from 2018 from seven soils is used as example data in this vignette.
 
 ```{r}
 library(mkin)
@@ -33,27 +31,52 @@ random effects for all degradation parameters.
 ```{r}
 f_mmkin <- mmkin("DFOP", dmta_ds, error_model = "tc", cores = 7, quiet = TRUE)
 f_saem_full <- saem(f_mmkin)
+illparms(f_saem_full)
+```
+We see that not all variability parameters are identifiable. The `illparms`
+function tells us that the confidence interval for the standard deviation
+of 'log_k2' includes zero. We check this assessment using multiple runs
+with different starting values.
+
+```{r}
 f_saem_full_multi <- multistart(f_saem_full, n = 16, cores = 16)
 parhist(f_saem_full_multi)
 ```
 
-We see that not all variability parameters are identifiable, most problematic
-is the variance of k2. So we reduce the parameter distribution model by
-removing the intersoil variability for this parameter.
+This confirms that the variance of k2 is the most problematic parameter, so we
+reduce the parameter distribution model by removing the intersoil variability
+for k2.
 
 ```{r}
-f_saem_reduced <- update(f_saem_full, covariance.model = diag(c(1, 1, 0, 1)))
+f_saem_reduced <- update(f_saem_full, no_random_effect = "log_k2")
+illparms(f_saem_reduced)
 f_saem_reduced_multi <- multistart(f_saem_reduced, n = 16, cores = 16)
 parhist(f_saem_reduced_multi, lpos = "topright")
 ```
 
+The results confirm that all remaining parameters can be determined with sufficient
+certainty.
+
 We can also analyse the log-likelihoods obtained in the multiple runs:
 
 ```{r}
 llhist(f_saem_reduced_multi)
 ```
 
-The one run with the lower likelihood is probably responsible for the outliers
-in the boxplots above, and caused by some weird starting value combination. In
-any case, the converged values from the original fit (red circles) appear
-perfectly acceptable, supporting the choice of starting values made by mkin.
+The parameter histograms can be further improved by excluding the result with
+the low likelihood.
+
+```{r}
+parhist(f_saem_reduced_multi, lpos = "topright", llmin = -326, ylim = c(0.5, 2))
+```
+
+We can use the `anova` method to compare the models, including a likelihood ratio
+test if the models are nested.
+
+```{r}
+anova(f_saem_full, best(f_saem_reduced_multi), test = TRUE)
+```
+
+While AIC and BIC are lower for the reduced model, the likelihood ratio test
+does not indicate a significant difference between the fits.
+
-- 
cgit v1.2.1