% $Id: examples.Rnw 66 2010-09-03 08:50:26Z jranke $ %%\VignetteIndexEntry{Examples for kinetic evaluations using mkin} %%VignetteDepends{FME} %%\usepackage{Sweave} \documentclass[12pt,a4paper]{article} \usepackage{a4wide} %%\usepackage[lists,heads]{endfloat} \input{header} \hypersetup{ pdftitle = {Examples for kinetic evaluations using mkin}, pdfsubject = {Manuscript}, pdfauthor = {Johannes Ranke}, colorlinks = {true}, linkcolor = {blue}, citecolor = {blue}, urlcolor = {red}, hyperindex = {true}, linktocpage = {true}, } \SweaveOpts{engine=R, eps=FALSE, keep.source = TRUE} <<setup, echo = FALSE, results = hide>>= options(prompt = "R> ") options(width = 70) options(SweaveHooks = list( cex = function() par(cex.lab = 1.3, cex.axis = 1.3))) @ \begin{document} \title{Examples for kinetic evaluations using mkin} \author{\textbf{Johannes Ranke} \\[0.5cm] %EndAName Eurofins Regulatory AG\\ Weidenweg 15, CH--4310 Rheinfelden, Switzerland\\[0.5cm] and\\[0.5cm] University of Bremen\\ } \maketitle %\begin{abstract} %\end{abstract} \thispagestyle{empty} \setcounter{page}{0} \clearpage \tableofcontents \textbf{Key words}: Kinetics, FOCUS, nonlinear optimisation \section{Kinetic evaluations for parent compounds} These examples are also evaluated in a parallel vignette of the \Rpackage{kinfit} package \citep{pkg:kinfit}. The datasets are from Appendix 3, of the FOCUS kinetics report \citep{FOCUS2006, FOCUSkinetics2011}. \subsection{Laboratory Data L1} The following code defines example dataset L1 from the FOCUS kinetics report, p. 284 <<FOCUS_2006_L1_data, echo=TRUE, eval=TRUE>>= library("mkin") FOCUS_2006_L1 = data.frame( t = rep(c(0, 1, 2, 3, 5, 7, 14, 21, 30), each = 2), parent = c(88.3, 91.4, 85.6, 84.5, 78.9, 77.6, 72.0, 71.9, 50.3, 59.4, 47.0, 45.1, 27.7, 27.3, 10.0, 10.4, 2.9, 4.0)) FOCUS_2006_L1_mkin <- mkin_wide_to_long(FOCUS_2006_L1) @ The next step is to set up the models used for the kinetic analysis. Note that the model definitions contain the names of the observed variables in the data. In this case, there is only one variable called \texttt{parent}. <<Simple_models, echo=TRUE>>= SFO <- mkinmod(parent = list(type = "SFO")) FOMC <- mkinmod(parent = list(type = "FOMC")) DFOP <- mkinmod(parent = list(type = "DFOP")) @ The three models cover the first assumption of simple first order (SFO), the case of declining rate constant over time (FOMC) and the case of two different phases of the kinetics (DFOP). For a more detailed discussion of the models, please see the FOCUS kinetics report. The following two lines fit the model and produce the summary report of the model fit. This covers the numerical analysis given in the FOCUS report. <<L1_SFO, echo=TRUE>>= m.L1.SFO <- mkinfit(SFO, FOCUS_2006_L1_mkin, quiet=TRUE) summary(m.L1.SFO) @ A plot of the fit is obtained with the plot function for mkinfit objects. <<L1_SFO_plot, fig=TRUE, echo=TRUE, height=4>>= plot(m.L1.SFO) @ The residual plot can be easily obtained by <<L1_SFO_residuals, fig=TRUE, echo=TRUE, height=4>>= mkinresplot(m.L1.SFO, ylab = "Observed", xlab = "Time") @ For comparison, the FOMC model is fitted as well, and the $\chi^2$ error level is checked. <<L1_FOMC, echo=TRUE>>= m.L1.FOMC <- mkinfit(FOMC, FOCUS_2006_L1_mkin, quiet=TRUE) summary(m.L1.FOMC) @ Due to the higher number of parameters, and the lower number of degrees of freedom of the fit, the $\chi^2$ error level is actually higher for the FOMC model (3.6\%) than for the SFO model (3.4\%). Additionally, the covariance matrix can not be obtained, indicating overparameterisation of the model. The $\chi^2$ error levels reported in Appendix 3 and Appendix 7 to the FOCUS kinetics report are rounded to integer percentages and partly deviate by one percentage point from the results calculated by \texttt{mkin}. The reason for this is not known. However, \texttt{mkin} gives the same $\chi^2$ error levels as the \Rpackage{kinfit} package. Furthermore, the calculation routines of the kinfit package have been extensively compared to the results obtained by the KinGUI software, as documented in the kinfit package vignette. KinGUI is a widely used standard package in this field. Therefore, the reason for the difference was not investigated further. \subsection{Laboratory Data L2} The following code defines example dataset L2 from the FOCUS kinetics report, p. 287 <<FOCUS_2006_L2_data, echo=TRUE, eval=TRUE>>= FOCUS_2006_L2 = data.frame( t = rep(c(0, 1, 3, 7, 14, 28), each = 2), parent = c(96.1, 91.8, 41.4, 38.7, 19.3, 22.3, 4.6, 4.6, 2.6, 1.2, 0.3, 0.6)) FOCUS_2006_L2_mkin <- mkin_wide_to_long(FOCUS_2006_L2) @ Again, the SFO model is fitted and a summary is obtained. <<L2_SFO, echo=TRUE>>= m.L2.SFO <- mkinfit(SFO, FOCUS_2006_L2_mkin, quiet=TRUE) summary(m.L2.SFO) @ The $\chi^2$ error level of 14\% suggests that the model does not fit very well. This is also obvious from the plots of the fit and the residuals. <<L2_SFO_plot, fig=TRUE, echo=TRUE, height=8>>= par(mfrow = c(2, 1)) plot(m.L2.SFO) mkinresplot(m.L2.SFO) @ In the FOCUS kinetics report, it is stated that there is no apparent systematic error observed from the residual plot up to the measured DT90 (approximately at day 5), and there is an underestimation beyond that point. We may add that it is difficult to judge the random nature of the residuals just from the three samplings at days 0, 1 and 3. Also, it is not clear \textit{a priori} why a consistent underestimation after the approximate DT90 should be irrelevant. However, this can be rationalised by the fact that the FOCUS fate models generally only implement SFO kinetics. For comparison, the FOMC model is fitted as well, and the $\chi^2$ error level is checked. <<L2_FOMC, echo=TRUE, fig=TRUE, height=8>>= m.L2.FOMC <- mkinfit(FOMC, FOCUS_2006_L2_mkin, quiet = TRUE) par(mfrow = c(2, 1)) plot(m.L2.FOMC) mkinresplot(m.L2.FOMC) summary(m.L2.FOMC, data = FALSE) @ The error level at which the $\chi^2$ test passes is much lower in this case. Therefore, the FOMC model provides a better description of the data, as less experimental error has to be assumed in order to explain the data. Fitting the four parameter DFOP model further reduces the $\chi^2$ error level. <<L2_DFOP, echo=TRUE, fig=TRUE, height=4>>= m.L2.DFOP <- mkinfit(DFOP, FOCUS_2006_L2_mkin, quiet = TRUE) plot(m.L2.DFOP) @ Here, the default starting parameters for the DFOP model obviously do not lead to a reasonable solution. Therefore the fit is repeated with different starting parameters. <<L2_DFOP_2, echo=TRUE, fig=TRUE, height=4>>= m.L2.DFOP <- mkinfit(DFOP, FOCUS_2006_L2_mkin, parms.ini = c(k1 = 1, k2 = 0.01, g = 0.8), quiet=TRUE) plot(m.L2.DFOP) summary(m.L2.DFOP, data = FALSE) @ Here, the DFOP model is clearly the best-fit model for dataset L2 based on the $\chi^2$ error level criterion. However, the failure to calculate the covariance matrix indicates that the parameter estimates correlate excessively. Therefore, the FOMC model may be preferred for this dataset. \subsection{Laboratory Data L3} The following code defines example dataset L3 from the FOCUS kinetics report, p. 290. <<FOCUS_2006_L3_data, echo=TRUE, eval=TRUE>>= FOCUS_2006_L3 = data.frame( t = c(0, 3, 7, 14, 30, 60, 91, 120), parent = c(97.8, 60, 51, 43, 35, 22, 15, 12)) FOCUS_2006_L3_mkin <- mkin_wide_to_long(FOCUS_2006_L3) @ SFO model, summary and plot: <<L3_SFO, echo=TRUE, fig=TRUE, height=4>>= m.L3.SFO <- mkinfit(SFO, FOCUS_2006_L3_mkin, quiet = TRUE) plot(m.L3.SFO) summary(m.L3.SFO) @ The $\chi^2$ error level of 22\% as well as the plot suggest that the model does not fit very well. The FOMC model performs better: <<L3_FOMC, echo=TRUE, fig=TRUE, height=4>>= m.L3.FOMC <- mkinfit(FOMC, FOCUS_2006_L3_mkin, quiet = TRUE) plot(m.L3.FOMC) summary(m.L3.FOMC, data = FALSE) @ The error level at which the $\chi^2$ test passes is 7\% in this case. Fitting the four parameter DFOP model further reduces the $\chi^2$ error level considerably: <<L3_DFOP, echo=TRUE, fig=TRUE, height=4>>= m.L3.DFOP <- mkinfit(DFOP, FOCUS_2006_L3_mkin, quiet = TRUE) plot(m.L3.DFOP) summary(m.L3.DFOP, data = FALSE) @ Here, a look to the model plot, the confidence intervals of the parameters and the correlation matrix suggest that the parameter estimates are reliable, and the DFOP model can be used as the best-fit model based on the $\chi^2$ error level criterion for laboratory data L3. \subsection{Laboratory Data L4} The following code defines example dataset L4 from the FOCUS kinetics report, p. 293 <<FOCUS_2006_L4_data, echo=TRUE, eval=TRUE>>= FOCUS_2006_L4 = data.frame( t = c(0, 3, 7, 14, 30, 60, 91, 120), parent = c(96.6, 96.3, 94.3, 88.8, 74.9, 59.9, 53.5, 49.0)) FOCUS_2006_L4_mkin <- mkin_wide_to_long(FOCUS_2006_L4) @ SFO model, summary and plot: <<L4_SFO, echo=TRUE, fig=TRUE, height=4>>= m.L4.SFO <- mkinfit(SFO, FOCUS_2006_L4_mkin, quiet = TRUE) plot(m.L4.SFO) summary(m.L4.SFO, data = FALSE) @ The $\chi^2$ error level of 3.3\% as well as the plot suggest that the model fits very well. The FOMC model for comparison <<L4_FOMC, echo=TRUE, fig=TRUE, height=4>>= m.L4.FOMC <- mkinfit(FOMC, FOCUS_2006_L4_mkin, quiet = TRUE) plot(m.L4.FOMC) summary(m.L4.FOMC, data = FALSE) @ The error level at which the $\chi^2$ test passes is slightly lower for the FOMC model. However, the difference appears negligible. \section{Kinetic evaluations for parent and metabolites} \subsection{Laboratory Data for example compound Z} The following code defines the example dataset from Appendix 7 to the FOCUS kinetics report, p.350 <<FOCUS_2006_Z_data, echo=TRUE, eval=TRUE>>= LOD = 0.5 FOCUS_2006_Z = data.frame( t = c(0, 0.04, 0.125, 0.29, 0.54, 1, 2, 3, 4, 7, 10, 14, 21, 42, 61, 96, 124), Z0 = c(100, 81.7, 70.4, 51.1, 41.2, 6.6, 4.6, 3.9, 4.6, 4.3, 6.8, 2.9, 3.5, 5.3, 4.4, 1.2, 0.7), Z1 = c(0, 18.3, 29.6, 46.3, 55.1, 65.7, 39.1, 36, 15.3, 5.6, 1.1, 1.6, 0.6, 0.5 * LOD, NA, NA, NA), Z2 = c(0, NA, 0.5 * LOD, 2.6, 3.8, 15.3, 37.2, 31.7, 35.6, 14.5, 0.8, 2.1, 1.9, 0.5 * LOD, NA, NA, NA), Z3 = c(0, NA, NA, NA, NA, 0.5 * LOD, 9.2, 13.1, 22.3, 28.4, 32.5, 25.2, 17.2, 4.8, 4.5, 2.8, 4.4)) FOCUS_2006_Z_mkin <- mkin_wide_to_long(FOCUS_2006_Z) @ The next step is to set up the models used for the kinetic analysis. As the simultaneous fit of parent and the first metabolite is usually straightforward, Step 1 (SFO for parent only) is skipped here. We start with the model 2a, with formation and decline of metabolite Z1 and the pathway from parent directly to sink included (default in mkin). <<FOCUS_2006_Z_fits_1, echo=TRUE, fig=TRUE, height=4>>= Z.2a <- mkinmod(Z0 = list(type = "SFO", to = "Z1"), Z1 = list(type = "SFO")) m.Z.2a <- mkinfit(Z.2a, FOCUS_2006_Z_mkin, quiet = TRUE) plot(m.Z.2a) summary(m.Z.2a, data = FALSE) @ As obvious from the summary, the kinetic rate constant from parent compound Z to sink is negligible. Accordingly, the exact magnitude of the fitted parameter \texttt{log k\_Z\_sink} is ill-defined and the covariance matrix is not returned. This suggests, in agreement with the analysis in the FOCUS kinetics report, to simplify the model by removing the pathway to sink. A similar result can be obtained when formation fractions are used in the model formulation: <<FOCUS_2006_Z_fits_2, echo=TRUE, fig=TRUE, height=4>>= Z.2a.ff <- mkinmod(Z0 = list(type = "SFO", to = "Z1"), Z1 = list(type = "SFO"), use_of_ff = "max") m.Z.2a.ff <- mkinfit(Z.2a.ff, FOCUS_2006_Z_mkin, quiet = TRUE) plot(m.Z.2a.ff) summary(m.Z.2a.ff, data = FALSE) @ Here, the ilr transformed formation fraction fitted in the model takes a very large value, and the backtransformed formation fraction from parent Z to Z1 is practically unity. Again, the covariance matrix is not returned as the model is overparameterised. The simplified model is obtained by setting the list component \texttt{sink} to \texttt{FALSE}. This model definition is not supported when formation fractions are used. <<FOCUS_2006_Z_fits_3, echo=TRUE, fig=TRUE, height=4>>= Z.3 <- mkinmod(Z0 = list(type = "SFO", to = "Z1", sink = FALSE), Z1 = list(type = "SFO")) m.Z.3 <- mkinfit(Z.3, FOCUS_2006_Z_mkin, parms.ini = c(k_Z0_Z1 = 0.5), quiet = TRUE) #m.Z.3 <- mkinfit(Z.3, FOCUS_2006_Z_mkin, solution_type = "deSolve") plot(m.Z.3) summary(m.Z.3, data = FALSE) @ The first attempt to fit the model failed, as the default solution type chosen by mkinfit is based on eigenvalues, and the system defined by the starting parameters is identified as being singular to the solver. This is caused by the fact that the rate constants for both state variables are the same using the default starting paramters. Setting a different starting value for one of the parameters overcomes this. Alternatively, the \Rpackage{deSolve} based model solution can be chosen, at the cost of a bit more computing time. <<FOCUS_2006_Z_fits_4, echo=TRUE, fig=TRUE, height=4>>= Z.4a <- mkinmod(Z0 = list(type = "SFO", to = "Z1", sink = FALSE), Z1 = list(type = "SFO", to = "Z2"), Z2 = list(type = "SFO")) m.Z.4a <- mkinfit(Z.4a, FOCUS_2006_Z_mkin, parms.ini = c(k_Z0_Z1 = 0.5), quiet = TRUE) plot(m.Z.4a) summary(m.Z.4a, data = FALSE) @ As suggested in the FOCUS report, the pathway to sink was removed for metabolite Z1 as well in the next step. While this step appears questionable on the basis of the above results, it is followed here for the purpose of comparison. Also, in the FOCUS report, it is assumed that there is additional empirical evidence that Z1 quickly and exclusively hydrolyses to Z2. Again, in order to avoid a singular system when using default starting parameters, the starting parameter for the pathway without sink term has to be adapted. <<FOCUS_2006_Z_fits_5, echo=TRUE, fig=TRUE, height=4>>= Z.5 <- mkinmod(Z0 = list(type = "SFO", to = "Z1", sink = FALSE), Z1 = list(type = "SFO", to = "Z2", sink = FALSE), Z2 = list(type = "SFO")) m.Z.5 <- mkinfit(Z.5, FOCUS_2006_Z_mkin, parms.ini = c(k_Z0_Z1 = 0.5, k_Z1_Z2 = 0.2), quiet = TRUE) plot(m.Z.5) summary(m.Z.5, data = FALSE) @ Finally, metabolite Z3 is added to the model. <<FOCUS_2006_Z_fits_6, echo=TRUE, fig=TRUE, height=4>>= Z.FOCUS <- mkinmod(Z0 = list(type = "SFO", to = "Z1", sink = FALSE), Z1 = list(type = "SFO", to = "Z2", sink = FALSE), Z2 = list(type = "SFO", to = "Z3"), Z3 = list(type = "SFO")) m.Z.FOCUS <- mkinfit(Z.FOCUS, FOCUS_2006_Z_mkin, parms.ini = c(k_Z0_Z1 = 0.5, k_Z1_Z2 = 0.2, k_Z2_Z3 = 0.3), quiet = TRUE) plot(m.Z.FOCUS) summary(m.Z.FOCUS, data = FALSE) @ This is the fit corresponding to the final result chosen in Appendix 7 of the FOCUS report. The residual plots can be obtained by <<FOCUS_2006_Z_residuals_6, echo=TRUE, fig=TRUE>>= par(mfrow = c(2, 2)) mkinresplot(m.Z.FOCUS, "Z0", lpos = "bottomright") mkinresplot(m.Z.FOCUS, "Z1", lpos = "bottomright") mkinresplot(m.Z.FOCUS, "Z2", lpos = "bottomright") mkinresplot(m.Z.FOCUS, "Z3", lpos = "bottomright") @ As the FOCUS report states, there is a certain tailing of the time course of metabolite Z3. Also, the time course of the parent compound is not fitted very well using the SFO model, as residues at a certain low level remain. Therefore, an additional model is offered here, using the single first-order reversible binding (SFORB) model for metabolite Z3. As expected, the $\chi^2$ error level is lower for metabolite Z3 using this model and the graphical fit for Z3 is improved. However, the covariance matrix is not returned. <<FOCUS_2006_Z_fits_7, echo=TRUE, fig=TRUE, height=4>>= Z.mkin.1 <- mkinmod(Z0 = list(type = "SFO", to = "Z1", sink = FALSE), Z1 = list(type = "SFO", to = "Z2", sink = FALSE), Z2 = list(type = "SFO", to = "Z3"), Z3 = list(type = "SFORB")) m.Z.mkin.1 <- mkinfit(Z.mkin.1, FOCUS_2006_Z_mkin, parms.ini = c(k_Z0_Z1 = 0.5, k_Z1_Z2 = 0.3), quiet = TRUE) plot(m.Z.mkin.1) summary(m.Z.mkin.1, data = FALSE) @ Therefore, a further stepwise model building is performed starting from the stage of parent and one metabolite, starting from the assumption that the model fit for the parent compound can be improved by using the SFORB model. <<FOCUS_2006_Z_fits_8, echo=TRUE, fig=TRUE, height=4>>= Z.mkin.2 <- mkinmod(Z0 = list(type = "SFORB", to = "Z1", sink = FALSE), Z1 = list(type = "SFO")) m.Z.mkin.2 <- mkinfit(Z.mkin.2, FOCUS_2006_Z_mkin, quiet = TRUE) plot(m.Z.mkin.2) summary(m.Z.mkin.2, data = FALSE) @ When metabolite Z2 is added, the additional sink for Z1 is turned off again, for the same reasons as in the original analysis. <<FOCUS_2006_Z_fits_9, echo=TRUE, fig=TRUE, height=4>>= Z.mkin.3 <- mkinmod(Z0 = list(type = "SFORB", to = "Z1", sink = FALSE), Z1 = list(type = "SFO", to = "Z2"), Z2 = list(type = "SFO")) m.Z.mkin.3 <- mkinfit(Z.mkin.3, FOCUS_2006_Z_mkin, quiet = TRUE) plot(m.Z.mkin.3) summary(m.Z.mkin.3, data = FALSE) @ This results in a much better representation of the behaviour of the parent compound Z0. Finally, Z3 is added as well. This model appears overparameterised (no covariance matrix returned) if the sink for Z1 is left in the model. <<FOCUS_2006_Z_fits_10, echo=TRUE, fig=TRUE, height=4>>= Z.mkin.4 <- mkinmod(Z0 = list(type = "SFORB", to = "Z1", sink = FALSE), Z1 = list(type = "SFO", to = "Z2", sink = FALSE), Z2 = list(type = "SFO", to = "Z3"), Z3 = list(type = "SFO")) m.Z.mkin.4 <- mkinfit(Z.mkin.4, FOCUS_2006_Z_mkin, parms.ini = c(k_Z1_Z2 = 0.05), quiet = TRUE) plot(m.Z.mkin.4) summary(m.Z.mkin.4, data = FALSE) @ The error level of the fit, but especially of metabolite Z3, can be improved if the SFORB model is chosen for this metabolite, as this model is capable of representing the tailing of the metabolite decline phase. Using the SFORB additionally for Z1 or Z2 did not further improve the result. <<FOCUS_2006_Z_fits_11, echo=TRUE, fig=TRUE, height=4>>= Z.mkin.5 <- mkinmod(Z0 = list(type = "SFORB", to = "Z1", sink = FALSE), Z1 = list(type = "SFO", to = "Z2", sink = FALSE), Z2 = list(type = "SFO", to = "Z3"), Z3 = list(type = "SFORB")) m.Z.mkin.5 <- mkinfit(Z.mkin.5, FOCUS_2006_Z_mkin, parms.ini = c(k_Z1_Z2 = 0.2), quiet = TRUE) plot(m.Z.mkin.5) summary(m.Z.mkin.5, data = FALSE) @ Looking at the confidence intervals of the SFORB model parameters of Z3, it is clear that nothing can be said about the degradation rate of Z3 towards the end of the experiment. However, this appears to be a feature of the data. <<FOCUS_2006_Z_residuals_11, fig=TRUE>>= par(mfrow = c(2, 2)) mkinresplot(m.Z.mkin.5, "Z0", lpos = "bottomright") mkinresplot(m.Z.mkin.5, "Z1", lpos = "bottomright") mkinresplot(m.Z.mkin.5, "Z2", lpos = "bottomright") mkinresplot(m.Z.mkin.5, "Z3", lpos = "bottomright") @ As expected, the residual plots are much more random than in the case of the all SFO model for which they were shown above. In conclusion, the model \texttt{Z.mkin.5} is proposed as the best-fit model for the dataset from Appendix 7 of the FOCUS report. \bibliographystyle{plainnat} \bibliography{references} \end{document} % vim: set foldmethod=syntax: