Diffstat (limited to 'docs/index.md')
-rw-r--r-- docs/index.md 181
1 files changed, 181 insertions, 0 deletions
diff --git a/docs/index.md b/docs/index.md
new file mode 100644
index 0000000..6eb0823
--- /dev/null
+++ b/docs/index.md
@@ -0,0 +1,181 @@
+# chents
+
+[![Online
+documentation](https://img.shields.io/badge/docs-jrwb.de-blue.svg)](https://pkgdown.jrwb.de/chents/)
+[![R-Universe
+status](https://jranke.r-universe.dev/badges/chents)](https://jranke.r-universe.dev/chents)
+[![Code
+coverage](https://img.shields.io/badge/coverage-jrwb.de-blue.svg)](https://pkgdown.jrwb.de/chents/coverage/coverage.html)
+
+When working with data on chemical substances, we often need a reliable
+link between the data and the chemical identity of the substances. The R
+package **chents** provides a way to define an R object corresponding to
+a chemically defined substance (“chemical entity”) and to collect
+related information.
+
+When first defining a chemical entity, some chemical information is
+retrieved from the [PubChem](https://pubchem.ncbi.nlm.nih.gov/) website
+using the [webchem](https://docs.ropensci.org/webchem/) package.
+
+``` r
+library(chents)
+caffeine <- chent$new("Caffeine")
+#> Querying PubChem for name Caffeine ...
+#> Get chemical information from RDKit using PubChem SMILES
+#> CN1C=NC2=C1C(=O)N(C(=O)N2C)C
+```
+
+If Python and [RDKit](https://rdkit.org) (\> 2015.03) are installed and
+configured for use with the
+[reticulate](https://rstudio.github.io/reticulate/) package, some
+additional chemical information, including a 2D graph, is computed.
+
+The print method gives an overview of the information that was
+collected.
+
+``` r
+print(caffeine)
+#> <chent>
+#> Identifier $identifier Caffeine
+#> InChI Key $inchikey RYYVLZVUVIJVGH-UHFFFAOYSA-N
+#> SMILES string $smiles:
+#> PubChem
+#> "CN1C=NC2=C1C(=O)N(C(=O)N2C)C"
+#> Molecular weight $mw: 194.2
+#> PubChem synonyms (up to 10):
+#> [1] "caffeine" "58-08-2"
+#> [3] "Guaranine" "1,3,7-Trimethylxanthine"
+#> [5] "Methyltheobromine" "Theine"
+#> [7] "Thein" "Cafeina"
+#> [9] "Caffein" "Cafipel"
+```
+
+There is a very simple plotting method for the chemical structure.
+
+``` r
+plot(caffeine)
+```
+
+![](reference/figures/README-unnamed-chunk-4-1.png)
+
+If you have a so-called ISO common name of a pesticide active
+ingredient, you can use the ‘pai’ class derived from the ‘chent’ class,
+which queries the [BCPC
+compendium](http://www.bcpcpesticidecompendium.org/) first.
+
+``` r
+delta <- pai$new("Deltamethrin")
+#> Querying BCPC for Deltamethrin ...
+#> Querying PubChem for inchikey OWZREIFADZCYQD-NSHGMRRFSA-N ...
+#> Get chemical information from RDKit using PubChem SMILES
+#> CC1([C@H]([C@H]1C(=O)O[C@H](C#N)C2=CC(=CC=C2)OC3=CC=CC=C3)C=C(Br)Br)C
+plot(delta)
+```
+
+![](reference/figures/README-unnamed-chunk-5-1.png)
+
+Additional information can be read from a local .yaml file. This
+information can be leveraged e.g. by the
+[PEC_soil](https://pkgdown.jrwb.de/pfm/reference/PEC_soil.html) function
+of the ‘pfm’ package. However, this functionality is to be superseded by
+a dedicated package that defines data for the environmental risk
+assessment of chemicals, in particular of active ingredients of plant
+protection products.
+
+## Installation
+
+You can conveniently install chents from the repository kindly made
+available by the R-Universe project:
+
+``` r
+install.packages("chents",
+ repos = c("https://jranke.r-universe.dev", "https://cran.r-project.org"))
+```
+
+To make use of the chemoinformatics features, you need to install RDKit
+and its Python bindings. On a Debian-type Linux distribution, just use:
+
+``` sh
+sudo apt install python3-rdkit
+```
+
+If you use this package on Windows or macOS, I would be happy to include
+installation instructions here if you share them with me, e.g. via a
+Pull Request.
+
+## Configuration of the Python version to use
+
+On Debian-type Linux distributions, you can use the following line in
+your global or project-specific `.Rprofile` file to tell the
+`reticulate` package to use the system Python, which will find the
+RDKit installation in the system location.
+
+``` r
+Sys.setenv(RETICULATE_PYTHON="/usr/bin/python3")
+```
+
+## Using R6 classes
+
+Note that the `chent` objects defined by this package are instances of
+[R6](https://r6.r-lib.org/articles/Introduction.html) classes.
+Therefore, if you think you make a copy by assigning them to a new name,
+the objects will still be connected, because only the reference is
+copied. For example, you can create a molecule without retrieving data
+from PubChem:
+
+``` r
+but <- chent$new("Butane", smiles = "CCCC", pubchem = FALSE)
+#> Get chemical information from RDKit using user SMILES
+#> CCCC
+print(but)
+#> <chent>
+#> Identifier $identifier Butane
+#> InChI Key $inchikey NA
+#> SMILES string $smiles:
+#> user
+#> "CCCC"
+#> Molecular weight $mw: 58.1
+```
+
+If you then assign a new name and add PubChem information to the object
+with the new name, the information will also be added to the original
+`chent` object:
+
+``` r
+but_pubchem <- but
+but_pubchem$try_pubchem()
+#> Querying PubChem for name Butane ...
+print(but)
+#> <chent>
+#> Identifier $identifier Butane
+#> InChI Key $inchikey IJDNQMDRQITEOD-UHFFFAOYSA-N
+#> SMILES string $smiles:
+#> user PubChem
+#> "CCCC" "CCCC"
+#> Molecular weight $mw: 58.1
+#> PubChem synonyms (up to 10):
+#> [1] "BUTANE" "n-Butane" "106-97-8"
+#> [4] "Diethyl" "Methylethylmethane" "Butanen"
+#> [7] "Butani" "Butyl hydride" "HC 600"
+#> [10] "A 21 (lowing agent)"
+```
+
+You can create a derived, independent object using the `clone()` method
+that will not be affected by operations on the original object:
+
+``` r
+but_new <- chent$new("Butane", smiles = "CCCC", pubchem = FALSE)
+#> Get chemical information from RDKit using user SMILES
+#> CCCC
+but_clone <- but_new$clone()
+but_new$try_pubchem()
+#> Querying PubChem for name Butane ...
+but_clone
+#> <chent>
+#> Identifier $identifier Butane
+#> InChI Key $inchikey NA
+#> SMILES string $smiles:
+#> user
+#> "CCCC"
+#> Molecular weight $mw: 58.1
+```