# chents

[![Online documentation](https://img.shields.io/badge/docs-jrwb.de-blue.svg)](https://pkgdown.jrwb.de/chents/)
[![R-Universe status](https://jranke.r-universe.dev/badges/chents)](https://jranke.r-universe.dev/chents)
[![Code coverage](https://img.shields.io/badge/coverage-jrwb.de-blue.svg)](https://pkgdown.jrwb.de/chents/coverage/coverage.html)

When working with data on chemical substances, we often need a reliable link between the data and the chemical identity of the substances. The R package **chents** provides a way to define an R object corresponding to a chemically defined substance (“chemical entity”) and to collect related information.

When first defining a chemical entity, some chemical information is retrieved from the [PubChem](https://pubchem.ncbi.nlm.nih.gov/) website using the [webchem](https://docs.ropensci.org/webchem/) package.

``` r
library(chents)
caffeine <- chent$new("Caffeine")
#> Querying PubChem for name Caffeine ...
#> Get chemical information from RDKit using PubChem SMILES
#> CN1C=NC2=C1C(=O)N(C(=O)N2C)C
```

If Python and [RDKit](https://rdkit.org) (> 2015.03) are installed and configured for use with the [reticulate](https://rstudio.github.io/reticulate/) package, some additional chemical information, including a 2D graph, is computed.

The print method gives an overview of the information that was collected.

``` r
print(caffeine)
#>
#> Identifier $identifier Caffeine
#> InChI Key $inchikey RYYVLZVUVIJVGH-UHFFFAOYSA-N
#> SMILES string $smiles:
#>                        PubChem
#> "CN1C=NC2=C1C(=O)N(C(=O)N2C)C"
#> Molecular weight $mw: 194.2
#> PubChem synonyms (up to 10):
#>  [1] "caffeine"                "58-08-2"
#>  [3] "Guaranine"               "1,3,7-Trimethylxanthine"
#>  [5] "Methyltheobromine"       "Theine"
#>  [7] "Thein"                   "Cafeina"
#>  [9] "Caffein"                 "Cafipel"
```

There is a very simple plotting method for the chemical structure.
``` r
plot(caffeine)
```

![](reference/figures/README-unnamed-chunk-4-1.png)

If you have a so-called ISO common name of a pesticide active ingredient, you can use the ‘pai’ class derived from the ‘chent’ class, which queries the [BCPC compendium](http://www.bcpcpesticidecompendium.org/) first.

``` r
delta <- pai$new("Deltamethrin")
#> Querying BCPC for Deltamethrin ...
#> Querying PubChem for inchikey OWZREIFADZCYQD-NSHGMRRFSA-N ...
#> Get chemical information from RDKit using PubChem SMILES
#> CC1([C@H]([C@H]1C(=O)O[C@H](C#N)C2=CC(=CC=C2)OC3=CC=CC=C3)C=C(Br)Br)C
plot(delta)
```

![](reference/figures/README-unnamed-chunk-5-1.png)

Additional information can be read from a local .yaml file. This information can be used, e.g., by the [PEC_soil](https://pkgdown.jrwb.de/pfm/reference/PEC_soil.html) function of the ‘pfm’ package. However, this functionality is to be superseded by a dedicated package that defines data for the environmental risk assessment of chemicals, in particular of active ingredients of plant protection products.

## Installation

You can conveniently install chents from the repository kindly made available by the R-Universe project:

``` r
install.packages("chents",
  repos = c("https://jranke.r-universe.dev", "https://cran.r-project.org"))
```

To benefit from the cheminformatics functionality, you need to install RDKit and its Python bindings. On a Debian-type Linux distribution, just use

``` sh
sudo apt install python3-rdkit
```

If you use this package on Windows or macOS, I would be happy to include installation instructions here if you share them with me, e.g. via a Pull Request.

## Configuration of the Python version to use

On Debian-type Linux distributions, you can use the following line in your global or project-specific `.Rprofile` file to tell the `reticulate` package to use the system Python version that will find the RDKit installed in the system location.
``` r
Sys.setenv(RETICULATE_PYTHON="/usr/bin/python3")
```

## Using R6 classes

Note that the `chent` objects defined by this package are instances of [R6](https://r6.r-lib.org/articles/Introduction.html) classes. Therefore, assigning such an object to a new name does not create a copy: both names refer to the same object, because only the reference is copied. For example, you can create a molecule without retrieving data from PubChem:

``` r
but <- chent$new("Butane", smiles = "CCCC", pubchem = FALSE)
#> Get chemical information from RDKit using user SMILES
#> CCCC
print(but)
#>
#> Identifier $identifier Butane
#> InChI Key $inchikey NA
#> SMILES string $smiles:
#>   user
#> "CCCC"
#> Molecular weight $mw: 58.1
```

If you then assign a new name and add PubChem information to the object with the new name, the information will also be added to the original `chent` object:

``` r
but_pubchem <- but
but_pubchem$try_pubchem()
#> Querying PubChem for name Butane ...
print(but)
#>
#> Identifier $identifier Butane
#> InChI Key $inchikey IJDNQMDRQITEOD-UHFFFAOYSA-N
#> SMILES string $smiles:
#>    user PubChem
#>  "CCCC"  "CCCC"
#> Molecular weight $mw: 58.1
#> PubChem synonyms (up to 10):
#>  [1] "BUTANE"              "n-Butane"            "106-97-8"
#>  [4] "Diethyl"             "Methylethylmethane"  "Butanen"
#>  [7] "Butani"              "Butyl hydride"       "HC 600"
#> [10] "A 21 (lowing agent)"
```

You can create an independent copy using the `clone()` method. The copy is not affected by later operations on the original object:

``` r
but_new <- chent$new("Butane", smiles = "CCCC", pubchem = FALSE)
#> Get chemical information from RDKit using user SMILES
#> CCCC
but_clone <- but_new$clone()
but_new$try_pubchem()
#> Querying PubChem for name Butane ...
but_clone
#>
#> Identifier $identifier Butane
#> InChI Key $inchikey NA
#> SMILES string $smiles:
#>   user
#> "CCCC"
#> Molecular weight $mw: 58.1
```

# Package index

## R6 Class definitions and methods

- [`chent`](https://pkgdown.jrwb.de/chents/reference/chent.md) : An R6 class for chemical entities with associated data
- [`pai`](https://pkgdown.jrwb.de/chents/reference/pai.md) : An R6 class for pesticidal active ingredients and associated data
- [`ppp`](https://pkgdown.jrwb.de/chents/reference/ppp.md) : R6 class for a plant protection product with at least one active ingredient
- [`draw_svg.chent()`](https://pkgdown.jrwb.de/chents/reference/draw_svg.chent.md) : Draw SVG graph from a chent object using RDKit
- [`plot(<chent>)`](https://pkgdown.jrwb.de/chents/reference/plot.chent.md) : Plot method for chent objects
- [`print(<chent>)`](https://pkgdown.jrwb.de/chents/reference/print.chent.md) : Printing method for chent objects
- [`print(<pai>)`](https://pkgdown.jrwb.de/chents/reference/print.pai.md) : Printing method for pai objects (pesticidal active ingredients)
- [`print(<ppp>)`](https://pkgdown.jrwb.de/chents/reference/print.ppp.md) : Printing method for ppp objects (plant protection products)
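
As a follow-up to the section on configuring the Python version, the sketch below shows one possible way to check from R whether `reticulate` can find an RDKit installation. The helper name `rdkit_available()` is hypothetical and not part of chents. It relies only on `reticulate::py_module_available()` and assumes that a missing `reticulate` package, or any error while locating Python, should count as "not available".

``` r
# Hypothetical helper, not part of chents: returns TRUE if reticulate
# can find the Python module 'rdkit', and FALSE otherwise.
rdkit_available <- function() {
  # Without reticulate there is no way to reach Python from R
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    return(FALSE)
  }
  # Treat any error during Python discovery as "not available"
  tryCatch(
    isTRUE(reticulate::py_module_available("rdkit")),
    error = function(e) FALSE
  )
}

if (rdkit_available()) {
  message("RDKit found via reticulate")
} else {
  message("RDKit not found; check RETICULATE_PYTHON and the RDKit installation")
}
```

If the check fails, calling `reticulate::py_config()` shows which Python installation was actually picked up, which helps to tell a wrong interpreter apart from a missing RDKit package.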