<!-- README.md is generated from README.rmd. Please edit that file -->

```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-"
)
```

# chents

[![Online documentation](https://img.shields.io/badge/docs-jrwb.de-blue.svg)](https://pkgdown.jrwb.de/chents/)
[![R-Universe status](https://jranke.r-universe.dev/badges/chents)](https://jranke.r-universe.dev/chents)
[![Code coverage](https://img.shields.io/badge/coverage-jrwb.de-blue.svg)](https://pkgdown.jrwb.de/chents/coverage/coverage.html)

When working with data on chemical substances, we often need a reliable link between
the data and the chemical identity of the substances. The R package **chents**
provides a way to define an R object corresponding to a chemically defined substance
("chemical entity") and to collect related information.


When first defining a chemical entity, some chemical information
is retrieved from the [PubChem](https://pubchem.ncbi.nlm.nih.gov/) website using
the [webchem](https://docs.ropensci.org/webchem/) package.

```{r}
library(chents)
caffeine <- chent$new("Caffeine")
```

If Python and [RDKit](https://rdkit.org) (> 2015.03) are installed and
configured for use with the
[reticulate](https://rstudio.github.io/reticulate/) package, some
additional chemical information, including a 2D graph, is computed.

The print method gives an overview of the information that was collected.

```{r}
print(caffeine)
```

There is a very simple plotting method for the chemical structure.

```{r fig.height = 2}
plot(caffeine)
```

If you have a so-called ISO common name of a pesticide active ingredient, you
can use the 'pai' class derived from the 'chent' class, which queries
the [BCPC compendium](http://www.bcpcpesticidecompendium.org/) first.

```{r fig.height = 3.5}
delta <- pai$new("Deltamethrin")
plot(delta)
```

Additional information can be read from a local .yaml file. This information
can be used, for example, by the
[PEC_soil](https://pkgdown.jrwb.de/pfm/reference/PEC_soil.html) function of the
'pfm' package. However, this functionality is to be superseded by a dedicated
package defining data for the environmental risk assessment of chemicals,
in particular of active ingredients of plant protection products.


## Installation

You can conveniently install chents from the repository kindly made available by the
R-Universe project:

```{r, eval = FALSE}
install.packages("chents",
  repos = c("https://jranke.r-universe.dev", "https://cran.r-project.org"))
```

To benefit from the chemoinformatics features, you need to install RDKit and its
Python bindings. On a Debian-type Linux distribution, just use:

```{sh, eval = FALSE}
sudo apt install python3-rdkit
```

If you use this package on Windows or macOS, I would be happy to include
installation instructions here if you share them with me, e.g. via a pull
request.

## Configuration of the Python version to use

On Debian-type Linux distributions, you can use the following line in your
global or project-specific `.Rprofile` file to tell the `reticulate` package to
use the system Python, which will find the RDKit installed in the system
location.

```{r, eval = FALSE}
Sys.setenv(RETICULATE_PYTHON="/usr/bin/python3")
```
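To check that the setting has taken effect, the following sketch may help. It
assumes that the `reticulate` package is installed and that RDKit is importable
under the Python module name `rdkit`; the result depends on your system, so the
chunk is not evaluated here.

```{r, eval = FALSE}
library(reticulate)

# Show which Python installation reticulate is bound to
py_config()

# TRUE if the RDKit Python module can be imported (module name "rdkit" assumed)
py_module_available("rdkit")
```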

## Using R6 classes

Note that the `chent` objects defined by this package are instances of [R6](https://r6.r-lib.org/articles/Introduction.html)
classes. Assigning such an object to a new name therefore does not make a copy:
only the reference is copied, so both names refer to the same object. For
example, you can create a molecule without retrieving data from PubChem:

```{r}
but <- chent$new("Butane", smiles = "CCCC", pubchem = FALSE)
print(but)
```

If you then assign a new name and add PubChem information to the object with
the new name, the information will also be added to the original `chent`
object:


```{r}
but_pubchem <- but
but_pubchem$try_pubchem()
print(but)
```

You can create a derived, independent object using the `clone()` method
that will not be affected by operations on the original object:

```{r}
but_new <- chent$new("Butane", smiles = "CCCC", pubchem = FALSE)
but_clone <- but_new$clone()
but_new$try_pubchem()
but_clone
```
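The same reference behaviour can be reproduced without any network access using
a small R6 class. The following is a minimal sketch: the `Counter` class and its
`add()` method are hypothetical and exist only for illustration; they are not
part of chents.

```{r}
library(R6)

# Hypothetical toy class for illustration only (not part of chents)
Counter <- R6Class("Counter",
  public = list(
    n = 0,
    add = function() {
      self$n <- self$n + 1
      invisible(self)
    }
  )
)

a <- Counter$new()
b <- a                 # copies only the reference, not the object
b$add()
a$n                    # changed as well, because a and b are the same object

a_clone <- a$clone()   # independent copy
a_clone$add()
a$n                    # not affected by operations on the clone
a_clone$n
```

Note that `clone()` makes a shallow copy by default. If a field itself holds an
R6 object, the copy still shares that field with the original unless you call
`clone(deep = TRUE)`.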

