Diffstat (limited to 'vignettes/gmkin_manual.Rmd')
-rw-r--r-- | vignettes/gmkin_manual.Rmd | 387 |
1 files changed, 387 insertions, 0 deletions
diff --git a/vignettes/gmkin_manual.Rmd b/vignettes/gmkin_manual.Rmd new file mode 100644 index 0000000..ef40e75 --- /dev/null +++ b/vignettes/gmkin_manual.Rmd @@ -0,0 +1,387 @@ +<!-- +%\VignetteEngine{knitr::knitr} +%\VignetteIndexEntry{Manual for gmkin} +--> + +```{r, include = FALSE} +library(knitr) +opts_chunk$set(tidy = FALSE, cache = TRUE) +``` +# Manual for gmkin + +## Introduction + +The R add-on package gmkin provides a browser-based graphical interface for +performing kinetic evaluations of degradation data using the mkin package. +While the use of gmkin should be largely self-explanatory, this manual may serve +as a functionality overview and reference. + +For system requirements and installation instructions, please refer to the +[gmkin homepage](http://kinfit.r-forge.r-project.org/gmkin_static). + +## Starting gmkin + +As gmkin is an R package, you need to start R and load the gmkin package before you can run gmkin. +This can be achieved by entering the command + +```{r, eval = FALSE} +library(gmkin) +``` + +into the R console. This will also load the packages that gmkin depends on, +most notably gWidgetsWWW2 and mkin. Loading the package only has to be done +once after you have started R. + +Before you start gmkin, you should make sure that R is using the working +directory that you would like to keep your gmkin project file(s) in. If you use +the standard R application on Windows, you can change the working directory +from the File menu. + +Once you are sure that the working directory is what you want it to be, gmkin +can be started by entering the R command + +```{r, eval = FALSE} +gmkin() +``` + +This will cause the default browser to start up or, if it is already running, to +pop up and open a new tab for displaying the gmkin user interface. 
+ +In the R console, you should see some messages, telling you whether the local R help +server, which also serves the gmkin application, has been started, which port it is +using and that it is starting an app called gmkin. + +Finally, it should give a message like + +```{r, eval = FALSE} +Model cost at call 1: 2388.077 +``` + +which means that the first kinetic evaluation has been configured for fitting. + +In the browser, you should see something like the screenshot below. + +![gmkin start](img/gmkin_start.png) + +The status bar at the bottom of the gmkin window shows, among other things, the +working directory that gmkin uses. + +Note that the project file management area described below can be minimized by clicking on +the arrows on the right hand side of its title bar. This may be helpful if the vertical +size of your browser window is restricted. + +## Project file management + +At startup, gmkin loads a project called "FOCUS\_2006\_gmkin" which is distributed +with the gmkin package. A gmkin project contains datasets, kinetic models for +fitting, and so-called fits, i.e. the results of fitting models to data. These +gmkin projects can be saved and restored using the project file management area in the +top left. + +![projects](img/projects.png) + +If you would like to save these items for reference or for the purpose of continuing +your work at a later time, you can modify the project name and press the button below it. +The full name of the project file created and the working directory will be displayed +in the gmkin status bar. + +For restoring a previously saved project file, use the "Browse" button to locate +it, and the "Upload" button to load its contents into gmkin. + +## Studies + +The "Studies" area directly below the "Project file management" area can be expanded by clicking +on the arrows on the right hand side of its title bar. Studies in gmkin are +simply a numbered list of sources for the datasets in a project. 
You can edit the titles +directly by clicking on them. If you would like to add a new data source, use the "Add" +button above the table containing the list. If the list contains more than one study, +you can also remove studies using the "Remove" button. + +![studies](img/studies.png) + +Note that the user is responsible for keeping the study list consistent with the numbers that are +used in the list of datasets described below. + +## Datasets and Models +The project loaded at the start of gmkin contains two datasets and four kinetic models. These +are listed to the left under the heading "Datasets and Models", together with a button for +setting up fits as shown below. + +![datasets and models](img/datasetsnmodels.png) + +For editing, adding or removing datasets or models, you need to click on an +entry in the respective list. + +For setting up a fit of a specific model to a specific dataset, the model and +the dataset should be selected by clicking on them. If they are compatible, clicking +the button "Configure fit for selected dataset and model" will set up the fit and +open the "Plotting and Fitting" tab to the right. + +## Dataset editor + +The dataset editor allows for editing datasets, entering new datasets, uploading +data from text files and deleting datasets. + +![dataset editor](img/dataseteditor.png) + +If you want to create (enter or load) a new dataset, it is wise to first edit +the list of data sources in the "Studies" area as described above. + +### Entering data directly + +For entering new data manually, click on "New dataset", enter a title and select +the study from which the dataset is taken. At this stage, you may already want +to press "Keep changes", so the dataset appears in the list of datasets. + +In order to generate a table suitable for entering the data, enter a comma separated +list of sampling times, optionally the time unit, and the number of replicate measurements +at each sampling time. 
Also, add a comma separated list of short names of the +relevant compounds in your dataset. A unit can be specified for the observed +values. An example of filling out the respective fields is shown below. + +![generate data grid](img/generatedatagrid.png) + +Once everything is filled out to your satisfaction, press the button "Generate empty grid +for kinetic data". In our example, this would result in the data grid shown below. You +can enter the observed data into the value column, as shown in the screenshot below. + +![data grid](img/datagrid.png) + +The column with title override serves to override data points from the original +datasets, without losing the information about which value was originally reported. + +If everything is OK, press "Keep changes" to save the dataset in the current +workspace. Note that you need to save the project file (see above) in order to +be able to use the dataset that you created in a future gmkin session. + +### Importing data from text files + +In case you want to work with a larger dataset that is already available as a computer +file, e.g. in a spreadsheet application, you can export these data as a tab separated +or comma separated text file and import it using the "Browse" and "Upload" buttons in the +dataset editor. + +As an example, we can create a text file from one of the datasets shipped with +the mkin package using the following R command: + +```{r, eval = FALSE} +write.table(schaefer07_complex_case, sep = ",", dec = ".", + row.names = FALSE, quote = FALSE, + file = "schaefer07.csv") +``` + +This produces a text file with comma separated values in the current working directory of R. + +Loading this text file into gmkin using the "Browse" and "Upload" buttons results in +an import configuration area like this, with the uploaded text file displayed to the left, +and the import options to the right. + +![upload area](img/uploadarea.png) + +In the import configuration area, the following options can be specified. 
In the field +"Comment lines", the number of lines in the beginning of the file that should be ignored +can be specified. + +The checkbox on the next line should be checked if the first line of the file contains +the column names, i.e. the names of the observed variables when the data are in wide format. + +As "Separator", whitespace, semicolon or comma can be chosen. If whitespace is selected, +files in which the values are separated by a fixed or varying number of whitespace characters +should be read in correctly. As the tabulator counts as a whitespace character, this is +also the option to choose for tabulator separated values. + +As the "Decimal" separator, comma "," or period "." can be selected. + +In the next line, it can be specified if the data are in wide or in long format. +If in wide format, the only option left to specify is the title of the column containing +the sampling times. If the data are in long format, the column headings specifying the +columns containing the observed variables (default is "name"), the sampling times +(default is "time"), the observed values (default is "value") and, if present in the data, +the relative errors (default is "err") can be adapted. The default settings appearing if +the long format is selected are shown below. + +![long](img/long.png) + +In our example we have data in the wide format, and after adapting the +"Separator" to a comma, we can press the button "Import using options specified +below", and the data should be imported. If successful, the data editor should +show the sampling times and the names of the observed variables, as well as the +imported data in a grid for further editing or specifying overrides. + +After editing the title of the dataset and selecting the correct study as +the source of the data, the dataset editor should look as shown below. + +![successful upload](img/successfulupload.png) + +If everything is OK, press "Keep changes" to save the dataset in the current +workspace. 
Again, you need to save the project file in order to be able to use +the dataset that you created in a future gmkin session. + +## Model editor + +The following screenshot shows the model editor for model number 4 in +the list of models that are in the initial workspace. + +![model editor](img/modeleditor.png) + +In the first line the name of the model can be edited. You can also specify "min" or +"max" for minimum or maximum use of formation fractions. Maximum use of formation +fractions means that the differential equations in the degradation model are formulated +using formation fractions. When you specify "min", then formation fractions are only used +for the parent compound when you use the FOMC, DFOP or the HS model for it. + +Pressing "Copy model" keeps the model name, so you should change it for the newly generated copy. +Pressing "Add observed variable" adds a line in the array of state variable specifications below. +The observed variables to be added are usually transformation products (commonly termed metabolites), +but can also be the parent compound in a different compartment (e.g. "parent\_sediment"). + +Only observed variable names that occur in previously defined datasets can be selected. For any observed +variable other than the first one, only the SFO or the SFORB model can be selected. For each +observed variable, a comma separated list of target variables can be specified. In addition, a pathway +to the sink compartment can be selected. If too many observed variables have been added, complete lines +can be removed from the model definition by pressing the button "Remove observed variable". + +If you consider the model definition to be correct, press "Keep changes" to make it possible to select +it for fitting in the listing of models to the left. 
+ +## Plotting and fitting + +If the dataset(s) to be used in a project have been created, and suitable kinetic models have been defined, +kinetic evaluations can be configured by selecting one dataset and one model in the lists to the left, +and then pressing the button "Configure fit for selected dataset and model" below these lists. + +This opens the "Plotting and fitting" tab area to the right, consisting of a graphical window +showing the data points in the selected dataset and the model, evaluated with the initial parameters +defined by calling `mkinfit` without defining starting parameters. The value of the objective function +to be minimized for these default parameters can be seen in the R console, e.g. as + +```{r, eval = FALSE} +Model cost at call 1: 15156.12 +``` + +for the example shown below, where the FOCUS example dataset D and the model SFO\_SFO were selected. + +![plotting and fitting](img/plottingnfitting.png) + +### Parameters + +In the right hand area, initially the tab with the parameter list is displayed. While +name and type of the parameters should not be edited, their initial values can be edited +by clicking on a row. Also, it can be specified if the parameters should be fixed +in the optimisation process. + +If the initial values for the parameters were changed, the resulting model solution +can be visually checked by pressing the button "Show initial". This will update the +plot of the model and the data using the specified initial parameter values. + +If a similar model with a partially overlapping model definition has already been fitted, +initial values for parameters with the same name in both models can also be retrieved +from previous fits. This facilitates stepwise fitting of more complex degradation pathways. + +After the model has been successfully fitted by pressing the "Run" button, the optimised +parameter values are added to the parameter table. 
+ +### Fit options + +The most important fit options of the `mkinfit` function can be set via the +"Fit options" tab shown below. If the "plot" checkbox is checked, an R graphics device +started via the R console shows the fitting progress, i.e. the change of the model +solution together with the data during the optimisation. + +![fit options](img/fitoptions.png) + +The "solution\_type" can be set to "auto", which means that the most efficient solution +method is chosen for the model, in the order of "analytical" (for parent only degradation +data), "eigen" (for differential equation models with only linear terms, i.e. without +FOMC, DFOP or HS submodels) or "deSolve", which can handle all model definitions generated +by the `mkin` package. + +The parameters "atol" and "rtol" are only effective if the solution type is "deSolve". They +control the precision of the iterative numerical solution of the differential equation model. + +The checkboxes "transform\_rates" and "transform\_fractions" control if the parameters are fitted +as defined in the model, or if they are internally transformed during the fitting process in +order to improve the estimation of standard errors and confidence intervals which are based +on a linear approximation at the optimum found by the fitting algorithm. + +The dropdown box "weight" specifies if and how the observed values should be weighted +in the fitting process. If "manual" is chosen, the values in the "err" column of the +dataset are used, which are set to unity by default. Setting these to higher values +gives lower weight and vice versa. If "none" is chosen, observed +values are not weighted. Please refer to the documentation of the `modFit` function from +the `FME` package for the meaning of options "std" and "mean". + +The options "reweight.method", "reweight.tol" and "reweight.max.iter" enable the use of +iteratively reweighted least squares fitting, if the reweighting method is set to "obs". 
Please +refer to the `mkinfit` [documentation](http://kinfit.r-forge.r-project.org/mkin_static/mkinfit.html) +for more details. + +The dropdown box "method.modFit" makes it possible to choose between the optimisation +algorithms "Marq" (the default Levenberg-Marquardt implementation from the R package +`minpack.lm`), "Port" (an alternative, also local optimisation algorithm) and +"SANN" (the simulated annealing method - robust but inefficient and without a +convergence criterion). + +Finally, the maximum number of iterations for the optimisation can be adapted using the +"maxit.modFit" field. + +### Fitting the model + +In many cases the starting parameters and the fit options do not need to be modified +and the model fitting process can simply be started by pressing the "Run" button. +In the R console, the progressive reduction in the model cost can be monitored and will +be displayed like this: + +```{r, eval = FALSE} +Model cost at call 1 : 15156.12 +Model cost at call 3 : 15156.12 +Model cost at call 7 : 14220.79 +Model cost at call 8 : 14220.79 +Model cost at call 11 : 14220.79 +Model cost at call 12 : 3349.268 +Model cost at call 15 : 3349.268 +Model cost at call 17 : 788.6367 +Model cost at call 18 : 788.6366 +Model cost at call 22 : 374.0575 +Model cost at call 23 : 374.0575 +Model cost at call 27 : 371.2135 +Model cost at call 28 : 371.2135 +Model cost at call 32 : 371.2134 +Model cost at call 36 : 371.2134 +Model cost at call 37 : 371.2134 +``` + +If plotting of the fitting progress was selected in the "Fit options" tab, a +new separate graphics window should either pop up, or a graphics window previously +started for this purpose will be reused. + +### Summary + +Once a fit has successfully been performed by pressing the "Run" button, the summary +as displayed below can be accessed via the "Summary" tab. 
+ +![summary](img/summary.png) + +The complete summary can be saved into a text file by specifying a suitable file name +and pressing the button "Save summary". + +### Plot options + +In the tab "Plot options", the observed variables for which the data and the model fit should be +plotted can be selected as shown below. + +![plot options](img/plotoptions.png) + +### Confidence interval plots + +Whenever a new fit has been configured or a run of a fit has been completed, the plotting +area is updated with the above-mentioned plot of the data and the current model solution. + +In addition, a confidence interval plot is shown below this conventional plot. In case +a fit has been run and confidence intervals were successfully calculated for the fit (i.e. +if the model was not overparameterised and no other problems occurred), the +confidence intervals are graphically displayed as bars as shown below. + +![confidence](img/confidence.png) + +<!-- vim: set foldmethod=syntax ts=2 sw=2 expandtab: --> |