```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-"
)
```

# chents

[![Online documentation](https://img.shields.io/badge/docs-jrwb.de-blue.svg)](https://pkgdown.jrwb.de/chents/)
[![R-Universe status](https://jranke.r-universe.dev/badges/chents)](https://jranke.r-universe.dev/chents)
[![Code coverage](https://img.shields.io/badge/coverage-jrwb.de-blue.svg)](https://pkgdown.jrwb.de/chents/coverage/coverage.html)

When working with data on chemical substances, we often need a reliable link
between the data and the chemical identity of the substances. The R package
**chents** provides a way to define and check the identity of chemically
defined substances ("chemical entities") and to collect related information.

When first defining a chemical entity, some chemical information is retrieved
from the [PubChem](https://pubchem.ncbi.nlm.nih.gov/) website using the
[webchem](https://docs.ropensci.org/webchem/) package.

```{r}
library(chents)
caffeine <- chent$new("Caffeine")
```

If Python and [RDKit](https://rdkit.org) (> 2015.03) are installed and
configured for use with the
[reticulate](https://rstudio.github.io/reticulate/) package, some additional
chemical information, including a 2D graph, is computed.

The print method gives an overview of the information that was collected.

```{r}
print(caffeine)
```

There is a very simple plotting method for the chemical structure.

```{r fig.height = 2}
plot(caffeine)
```

If you have a so-called ISO common name of a pesticide active ingredient, you
can use the 'pai' class derived from the 'chent' class, which queries the
[BCPC compendium](http://www.bcpcpesticidecompendium.org/) first.

```{r fig.height = 3.5}
delta <- pai$new("Deltamethrin")
plot(delta)
```

Additional information can be read from a local .yaml file. This information
can be used, for example, by the
[PEC_soil](https://pkgdown.jrwb.de/pfm/reference/PEC_soil.html) function of
the 'pfm' package.
However, this functionality is to be superseded by a dedicated package
defining data for the environmental risk assessment of chemicals, in
particular of active ingredients of plant protection products.

## Installation

You can conveniently install chents from the repository kindly made available
by the R-Universe project:

```{r, eval = FALSE}
install.packages("chents",
  repos = c("https://jranke.r-universe.dev", "https://cran.r-project.org"))
```

To benefit from the cheminformatics functionality, you need to install RDKit
and its Python bindings. On a Debian-type Linux distribution, just use

```{sh, eval = FALSE}
sudo apt install python3-rdkit
```

If you use this package on Windows or macOS, I would be happy to include
installation instructions here if you share them with me, e.g. via a Pull
Request.

## Configuration of the Python version to use

On Debian-type Linux distributions, you can use the following line in your
global or project-specific `.Rprofile` file to tell the `reticulate` package
to use the system Python version, which will find the RDKit installed in the
system location.

```{r, eval = FALSE}
Sys.setenv(RETICULATE_PYTHON = "/usr/bin/python3")
```

## Using R6 classes

Note that the `chent` objects defined by this package are instances of
[R6](https://r6.r-lib.org/articles/Introduction.html) classes. Therefore,
assigning such an object to a new name does not create a copy. Both names
refer to the same object, because only the reference is copied.
For example, you can create a molecule without retrieving data from PubChem:

```{r}
but <- chent$new("Butane", smiles = "CCCC", pubchem = FALSE)
print(but)
```

If you then assign a new name and add PubChem information to the object with
the new name, the information will also be added to the original `chent`
object:

```{r}
but_pubchem <- but
but_pubchem$try_pubchem()
print(but)
```

You can create a derived, independent object using the `clone()` method. The
clone will not be affected by operations on the original object:

```{r}
but_new <- chent$new("Butane", smiles = "CCCC", pubchem = FALSE)
but_clone <- but_new$clone()
but_new$try_pubchem()
but_clone
```
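
The reference behaviour shown above is a general property of R6 and does not
depend on chents or on a PubChem query. The following minimal sketch uses only
the R6 package. The `Substance` class and its `notes` field are hypothetical
and are not part of chents; they only stand in for a `chent` object to make
the difference between assigning and cloning visible offline.

```{r}
library(R6)

# Hypothetical stand-in class for illustration only; not part of chents
Substance <- R6Class("Substance",
  public = list(
    name = NULL,
    notes = character(0),
    initialize = function(name) {
      self$name <- name
    },
    # Append a note to the object, modifying it in place
    add_note = function(note) {
      self$notes <- c(self$notes, note)
      invisible(self)
    }
  )
)

original <- Substance$new("Butane")
alias <- original          # copies only the reference
copy <- original$clone()   # creates an independent object

alias$add_note("added via alias")

identical(original, alias) # both names point to the same object
length(original$notes)     # the note added via 'alias' is visible here
length(copy$notes)         # the clone was not modified
```

If an R6 object holds other R6 objects in its fields, `clone(deep = TRUE)` is
needed to copy those as well.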