Added a manual, small GUI improvements

author: Johannes Ranke <jranke@uni-bremen.de> 2014-10-10 22:41:34 +0200
committer: Johannes Ranke <jranke@uni-bremen.de> 2014-10-10 22:41:34 +0200
commit: 99f93922f42af6cadd5c7ff4b0f1a96c830f9fb7 (patch)
tree: 00e2a6fc4e0c301721bc8307097a2b76917c9a57 /vignettes/gmkin_manual.Rmd
parent: 813b2bba907bf2a62e1d5d58e9f362c09da7dfc1 (diff)
1 files changed, 387 insertions, 0 deletions
diff --git a/vignettes/gmkin_manual.Rmd b/vignettes/gmkin_manual.Rmd
new file mode 100644
index 0000000..ef40e75
--- /dev/null
+++ b/vignettes/gmkin_manual.Rmd
@@ -0,0 +1,387 @@
+<!--
+%\VignetteEngine{knitr::knitr}
+%\VignetteIndexEntry{Manual for gmkin}
+-->
+
+```{r, include = FALSE}
+library(knitr)
+opts_chunk$set(tidy = FALSE, cache = TRUE)
+```
+# Manual for gmkin
+
+## Introduction
+
+The R add-on package gmkin provides a browser based graphical interface for
+performing kinetic evaluations of degradation data using the mkin package.
+While the use of gmkin should be largely self-explanatory, this manual may serve
+as a functionality overview and reference.
+
+For system requirements and installation instructions, please refer to the 
+[gmkin homepage](http://kinfit.r-forge.r-project.org/gmkin_static)
+
+## Starting gmkin
+
+As gmkin is an R package, you need to start R and load the gmkin package before you can run gmkin. 
+This can be achieved by entering the command
+
+```{r, eval = FALSE}
+library(gmkin)
+```
+
+into the R console. This will also load the packages that gmkin depends on,
+most notably gWidgetsWWW2 and mkin.  Loading the package only has to be done
+once after you have started R. 
+
+Before you start gmkin, you should make sure that R is using the working
+directory that you would like to keep your gmkin project file(s) in. If you use
+the standard R application on windows, you can change the working directory
+from the File menu.
+
+Once you are sure that the working directory is what you want it to be, gmkin
+can be started by entering the R command
+
+```{r, eval = FALSE}
+gmkin()
+```
+
+This will cause the default browser to start up or, if it is already running, to
+pop up and open a new tab for displaying the gmkin user interface.
+
+In the R console, you should see some messages, telling you if the local R help
+server, which also serves the gmkin application, has been started, which port it is
+using and that it is starting an app called gmkin.
+
+Finally, it should give a message like
+
+```{r, eval = FALSE}
+Model cost at call 1: 2388.077
+```
+
+which means that the first kinetic evaluation has been configured for fitting.
+
+In the browser, you should see something like the screenshot below.
+
+![gmkin start](img/gmkin_start.png)
+
+The statusbar at the bottom of the gmkin window shows, among others, the
+working directory that gmkin uses.
+
+Note that the project file management area described below can be minimized by clicking on 
+the arrows on the right hand side of its title bar. This may be helpful if the vertical
+size of your browser window is restricted.
+
+## Project file management
+
+At startup, gmkin loads a project called "FOCUS\_2006\_gmkin" which is distributed
+with the gmkin package. A gmkin project contains datasets, kinetic models for
+fitting, and so-called fits, i.e. the results of fitting models to data. These
+gmkin projects can be saved and restored using the project file management area in the
+top left.
+
+![projects](img/projects.png)
+
+If you would like to save these items for reference or for the purpose of continuing 
+your work at a later time, you can modify the project name and press the button below it.
+The full name of the project file created and the working directory will be displayed 
+in the gmkin status bar.
+
+For restoring a previously saved project file, use the Browse button to locate
+it, and the "Upload" button to load its contents into gmkin.
+
+## Studies
+
+The "Studies" area directly below the "Project file management" area can be expanded by clicking
+on the arrows on the right hand side of its title bar. Studies in gmkin are
+simply a numbered list of sources for the datasets in a project. You can edit the titles 
+directly by clicking on them. If you would like to add a new data source, use the "Add"
+button above the table containing the list. If there are more than one studies in the list,
+you can also remove them using the "Remove" button.
+
+![studies](img/studies.png)
+
+Note that the user is responsible to keep the study list consistent with the numbers that are 
+used in the list of datasets described below.
+
+## Datasets and Models
+The project loaded at the start of gmkin contains two datasets and four kinetic models. These 
+are listed to the left under the heading "Datasets and Models", together with a button for 
+setting up fits as shown below.
+
+![datasets and models](img/datasetsnmodels.png)
+
+For editing, adding or removing datasets or models, you need to click on an
+entry in the respective list. 
+
+For setting up a fit of a specific model to a specific dataset, the model and
+the dataset should be selected by clicking on them. If they are compatible, clicking
+the button "Configure fit for selected dataset and model" will set up the fit and 
+open the "Plotting and Fitting" tab to the right.
+
+## Dataset editor
+
+The dataset editor allows for editing datasets, entering new datasets, uploading
+data from text files and deleting datasets.
+
+![dataset editor](img/dataseteditor.png)
+
+If you want to create (enter or load) a new dataset, it is wise to first edit
+the list of data sources in the "Studies" area as described above.
+
+### Entering data directly
+
+For entering new data manually, click on "New dataset", enter a title and select
+the study from which the dataset is taken. At this stage, you may already want
+to press "Keep changes", so the dataset appears in the list of datasets.
+
+In order to generate a table suitable for entering the data, enter a comma separated
+list of sampling times, optionally the time unit, and the number of replicate measurements
+at each sampling time. Also, add a comma separated list of short names of the
+relevant compounds in your dataset. A unit can be specified for the observed 
+values. An example of filling out the respective fields is shown below.
+
+![generate data grid](img/generatedatagrid.png)
+
+Once everyting is filled out to your satisfaction, press the button "Generate empty grid
+for kinetic data". In our example, this would result in the data grid shown below. You 
+can enter the observed data into the value column, as shown in the screenshot below.
+
+![data grid](img/datagrid.png)
+
+The column with title override serves to override data points from the original
+datasets, without loosing the information which value was originally reported.
+
+If everything is OK, press "Keep changes" to save the dataset in the current
+workspace. Note that you need to save the project file (see above) in order to
+be able to use the dataset that you created in a future gmkin session.
+
+### Entering data directly
+
+In case you want to work with a larger dataset that is already available as a computer 
+file e.g. in a spreadsheet application, you can export these data as a tab separated
+or comma separated text file and import it using the "Browse" and "Upload" buttons in the
+dataset editor.
+
+As an example, we can create a text file from one of the datasets shipped with
+the mkin package using the following R command:
+
+```{r, eval = FALSE}
+write.table(schaefer07_complex_case, sep = ",", dec = ".", 
+            row.names = FALSE, quote = FALSE, 
+            file = "schaefer07.csv")
+```
+
+This produces a text file with comma separated values in the current working directory of R. 
+
+Loading this text file into gmkin using the "Browse" and "Upload" buttons results in 
+an import configuration area like this, with the uploaded text file displayed to the left,
+and the import options to the right.
+
+![upload area](img/uploadarea.png)
+
+In the import configuration area, the following options can be specified. In the field
+"Comment lines", the number of lines in the beginning of the file that should be ignored
+can be specified.
+
+The checkbox on the next line should be checked if the first line of the file contains
+the column names, i.e. the names of the observed variables when the data are in wide format.
+
+As "Separator", whitespace, semicolon or comma can be chosen. If whitespace is selected,
+files in which the values are separated by a fixed or varying number of whitespace characters
+should be read in correctly. As the tabulator counts as a whitespace character, this is 
+also the option to choose for tabulator separated values.
+
+As the "Decimal" separator, comma "," or period "." can be selected.
+
+In the next line, it can be specified if the data are in wide or in long format.
+If in wide format, the only option left to specify is the title of the column containing
+the sampling times. If the data is in long format, the column headings specifying the 
+columns containing the observed variables (default is "name"), the sampling times
+(default is "time"), the observed values (default is "value") and, if present in the data,
+the relative errors (default is "err") can be adapted. The default settings appearing if
+the long format is selected are shown below.
+
+![long](img/long.png)
+
+In our example we have data in the wide format, and after adapting the
+"Separator" to a comma, we can press the button "Import using options specified
+below", and the data should be imported.  If successful, the data editor should
+show the sampling times and the names of the observed variables, as well as the
+imported data in a grid for further editing or specifying overrides.
+
+After editing the title of the dataset and selecting the correct study as 
+the source of the data, the dataset editor should look like shown below.
+
+![successful upload](img/successfulupload.png)
+
+If everything is OK, press "Keep changes" to save the dataset in the current
+workspace. Again, you need to save the project file in order to be able to use
+the dataset that you created in a future gmkin session.
+
+## Model editor
+
+The following screenshot shows the model editor for the model number 4 in 
+the list of models that are in the initial workspace.
+
+![model editor](img/modeleditor.png)
+
+In the first line the name of the model can be edited. You can also specify "min" or
+"max" for minimum or maximum use of formation fractions. Maximum use of formation
+fractions means that the differential equations in the degradation model are formulated
+using formation fractions. When you specify "min", then formation fractions are only used
+for the parent compound when you use the FOMC, DFOP or the HS model for it.
+
+Pressing "Copy model" keeps the model name, so you should change it for the newly generated copy.
+Pressing "Add observed variable" adds a line in the array of state variable specifications below.
+The observed variables to be added are usually transformation products (usually termed metabolites),
+but can also be the parent compound in a different compartment (e.g. "parent\_sediment").
+
+Only observed variable names that occur in previously defined datasets can be selected. For any observed 
+variable other than the first one, only the SFO or the SFORB model can be selected. For each 
+observed variables, a comma separated list of target variables can be specified. In addition, a pathway
+to the sink compartment can be selected. If too many observed variables have been added, complete lines
+can be removed from the model definition by pressing the button "Remove observed variable".
+
+If the model definition is supposedly correct, press "Keep changes" to make it possible to select
+it for fitting in the listing of models to the left.
+
+## Plotting and fitting
+
+If the dataset(s) to be used in a project are created, and suitable kinetic models have been defined,
+kinetic evaluations can be configured by selecting one dataset and one model in the lists to the left,
+and the pressing the button "Configure fit for selected dataset and model" below these lists.
+
+This opens the "Plotting and fitting" tab area to the right, consisting of a graphical window
+showing the data points in the selected dataset and the model, evaluated with the initial parameters
+defined by calling `mkinfit` without defining starting parameters. The value of the objective function
+to be minimized for these default parameters can be seen in the R console, e.g. as
+
+```{r, eval = FALSE}
+Model cost at call 1: 15156.12
+```
+
+for the example shown below, where the FOCUS example dataset D and the model SFO\_SFO were selected.
+
+![plotting and fitting](img/plottingnfitting.png)
+
+### Parameters
+
+In the right hand area, initially the tab with the parameter list is displayed. While 
+name and type of the parameters should not be edited, their initial values can be edited
+by clicking on a row. Also, it can be specified if the parameters should be fixed
+in the optimisation process. 
+
+If the initial values for the parameters were changed, the resulting model solution
+can be visually checked by pressing the button "Show initial". This will update the 
+plot of the model and the data using the specified initial parameter values.
+
+If a similar model with a partially overlapping model definition has already be fitted, 
+initial values for parameters with the same name in both models can also be retrieved
+from previous fits. This facilitates stepwise fitting of more complex degradation pathways.
+
+After the model has been successfully fitted by pressing the "Run" button, the optimised
+parameter values are added to the parameter table.
+
+### Fit options
+
+The most important fit options of the `mkinfit` function can be set via the 
+"Fit option" tab shown below. If the "plot" checkbox is checked, an R graphics device
+started via the R console shows the fitting progress, i.e. the change of the model
+solution together with the data during the optimisation.
+
+![fit options](img/fitoptions.png)
+
+The "solution\_type" can either be "auto", which means that the most effective solution
+method is chosen for the model, in the order of "analytical" (for parent only degradation
+data), "eigen" (for differential equation models with only linear terms, i.e. without 
+FOMC, DFOP or HS submodels) or "deSolve", which can handle all model definitions generated
+by the `mkin` package.
+
+The parameters "atol" and "rtol" are only effective if the solution type is "deSolve". They
+control the precision of the iterative numerical solution of the differential equation model.
+
+The checkboxes "transform\_rates" and "transform\_fractions" control if the parameters are fitted
+as defined in the model, or if they are internally transformed during the fitting process in 
+order to improve the estimation of standard errors and confidence intervals which are based 
+on a linear approximation at the optimum found by the fitting algorithm.
+
+The dropdown box "weight" specifies if and how the observed values should be weighted 
+in the fitting process. If "manual" is chosen, the values in the "err" column of the 
+dataset are used, which are set to unity by default. Setting these to higher values 
+gives lower weight and vice versa. If "none" is chosen, observed
+values are not weighted. Please refer to the documentation of the `modFit` function from
+the `FME` package for the meaning of options "std" and "mean".
+
+The options "reweight.method", "reweight.tol" and "reweight.max.iter" enable the use of
+iteratively reweighted least squares fitting, if the reweighting method is set to "obs". Please
+refer to the `mkinfit` [documentation](http://kinfit.r-forge.r-project.org/mkin_static/mkinfit.html) 
+for more details.
+
+The drop down box "method.modFit" makes it possible to choose between the optimisation
+algorithms "Marq" (the default Levenberg-Marquardt implementation from the R package 
+`minpack.lm`), "Port" (an alternative, also local optimisation algorithm) and
+"SANN" (the simulated annealing method - robust but inefficient and without a
+convergence criterion).
+
+Finally, the maximum number of iterations for the optimisation can be adapted using the
+"maxit.modFit" field.
+
+### Fitting the model
+
+In many cases the starting parameters and the fit options do not need to be modified
+and the model fitting process can simply be started by pressing the "Run" button.
+In the R console, the progressive reduction in the model cost can be monitored and will 
+be displayed like this:
+
+```{r, eval = FALSE}
+Model cost at call  1 :  15156.12 
+Model cost at call  3 :  15156.12 
+Model cost at call  7 :  14220.79 
+Model cost at call  8 :  14220.79 
+Model cost at call  11 :  14220.79 
+Model cost at call  12 :  3349.268 
+Model cost at call  15 :  3349.268 
+Model cost at call  17 :  788.6367 
+Model cost at call  18 :  788.6366 
+Model cost at call  22 :  374.0575 
+Model cost at call  23 :  374.0575 
+Model cost at call  27 :  371.2135 
+Model cost at call  28 :  371.2135 
+Model cost at call  32 :  371.2134 
+Model cost at call  36 :  371.2134 
+Model cost at call  37 :  371.2134 
+```
+
+If plotting of the fitting progress was selecte in the "Fit options" tab, a
+new separate graphics window should either pop up, or a graphics window previously
+started for this purpose will be reused.
+
+### Summary
+
+Once a fit has successfully been performed by pressing the "Run" button, the summary
+as displayed below can be accessed via the "Summary" tab.
+
+![summary](img/summary.png)
+
+The complete summary can be saved into a text file by specifying a suitable file name
+and pressing the button "Save summary".
+
+### Plot options
+
+In the tab "Plot options", the observed variables for which the data and the model fit should be 
+plotted can be selected as shown below.
+
+![plot options](img/plotoptions.png)
+
+### Confidence interval plots
+
+Whenever a new fit has been configured or a run of a fit has been completed, the plotting
+area is update with the abovementioned plot of the data and the current model solution.
+
+In addition, a confidence interval plot is shown below this conventional plot. In case
+a fit has been run and confidence intervals were successfully calculated for the fit (i.e. 
+if the model was not overparameterised and no other problems occurred), the 
+confidence intervals are graphically displayed as bars as shown below.
+
+![conficence](img/confidence.png)
+
+<!-- vim: set foldmethod=syntax ts=2 sw=2 expandtab: -->
author	Johannes Ranke <jranke@uni-bremen.de>	2014-10-10 22:41:34 +0200
committer	Johannes Ranke <jranke@uni-bremen.de>	2014-10-10 22:41:34 +0200
commit	99f93922f42af6cadd5c7ff4b0f1a96c830f9fb7 (patch)
tree	00e2a6fc4e0c301721bc8307097a2b76917c9a57 /vignettes/gmkin_manual.Rmd
parent	813b2bba907bf2a62e1d5d58e9f362c09da7dfc1 (diff)