The OELibraryGen was designed to give programmers a high degree
of control when applying chemical transformations. It was also
designed for efficiency. Potentially costly preprocessing is performed
a single time before transformations can be carried out. The relative
setup cost of an OELibraryGen instance may be high, and the
memory use large, because preprocessed reactants are stored in memory.
Subsequent generation of products, however, is very efficient because
setup costs are paid in advance. The OELibraryGen class serves
a dual purpose of managing sets of preprocessed starting materials,
and storing a list of chemical transform operations defined by a
reaction molecule.
Chemical transform operations are carried out on starting materials.
Starting materials provide most of the virtual matter that goes into
making virtual product molecules. The OELibraryGen class
provides an interface to associate starting materials with reactant
patterns using the OELibraryGen::SetStartingMaterial and
OELibraryGen::AddStartingMaterial methods. These methods
associate starting materials with reactant patterns using the index
(reactant number) of the pattern. Reactant patterns are numbered
starting at zero. Reactant zero consists of the atom with the lowest
atom index and all atoms that are members of the same connected
component. The next reactant pattern begins with the lowest atom index
that is not a member of a previously numbered component. In a SMIRKS
pattern the first reactant (reactant number zero) is the leftmost
reactant. Disconnected reactant patterns may be grouped into a single
component using component level grouping in SMIRKS, denoted by
parentheses.
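The numbering rule above can be sketched as a connected-component labelling over the atoms of a reaction pattern. The helper below is hypothetical and is not part of OEChem: it assumes the pattern is given simply as an atom count plus a list of bonds between atom indices, and it does not model component level grouping with parentheses.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper (not an OEChem function): returns, for each atom
// index, the reactant number of the connected component it belongs to.
// Components are numbered in order of their lowest atom index.
std::vector<int> NumberReactants(int natoms,
                                 const std::vector<std::pair<int, int>>& bonds)
{
    std::vector<std::vector<int>> adjacency(natoms);
    for (const auto& bond : bonds) {
        assert(bond.first >= 0 && bond.first < natoms);
        assert(bond.second >= 0 && bond.second < natoms);
        adjacency[bond.first].push_back(bond.second);
        adjacency[bond.second].push_back(bond.first);
    }

    std::vector<int> reactant(natoms, -1);  // -1 means "not yet numbered"
    int next = 0;
    for (int start = 0; start < natoms; ++start) {
        if (reactant[start] != -1)
            continue;  // already part of an earlier component
        // Depth-first walk over everything connected to 'start'.
        std::vector<int> stack{start};
        reactant[start] = next;
        while (!stack.empty()) {
            int atom = stack.back();
            stack.pop_back();
            for (int nbr : adjacency[atom]) {
                if (reactant[nbr] == -1) {
                    reactant[nbr] = next;
                    stack.push_back(nbr);
                }
            }
        }
        ++next;
    }
    return reactant;
}
```

For a reactant pattern such as `[O:1]=[C:2][Cl:3].[N:4][H:5]`, with atoms indexed 0 to 4 from left to right and bonds 0-1, 1-2 and 3-4, this sketch assigns the first three atoms to reactant 0 and the last two to reactant 1.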
Once a reaction has been defined, and starting materials have been
associated with each of the reactant patterns, chemical
transformations can be applied to combinations of starting materials.
To achieve chemically reasonable output, attention should be given to
the mode of valence (or hydrogen count) correction that matches the
reaction. The OELibraryGen class has three possible modes of
valence correction: explicit hydrogen, implicit hydrogen, and
automatic. The default mode for valence correction and SMIRKS
interpretation is to emulate the Daylight Reaction Toolkit. Hydrogen
counts are adjusted using explicit hydrogens in SMIRKS patterns.
Reactions are carried out using explicit hydrogens, and valence
correction occurs when explicit hydrogens are added or deleted as
defined by a reaction. The following example demonstrates strict
SMIRKS and explicit hydrogen handling.
#include "openeye.h"
#include "oechem.h"
#include <iostream>
#include <string>

using namespace std;
using namespace OEChem;
using namespace OESystem;

int main()
{
    // Strict SMIRKS: the amine hydrogen [H:5] is matched explicitly and,
    // because it does not appear in the products, is deleted.
    OELibraryGen libgen("[O:1]=[C:2][Cl:3].[N:4][H:5]>>[O:1]=[C:2][N:4]");

    // Reactant 0: the acid chloride.
    OEGraphMol mol;
    OEParseSmiles(mol, "CC(=O)Cl");
    libgen.SetStartingMaterial(mol, 0);

    // Reactant 1: the amine.
    mol.Clear();
    OEParseSmiles(mol, "NCC");
    libgen.SetStartingMaterial(mol, 1);

    // Print a canonical SMILES string for every generated product.
    OEIter<OEMolBase> product;
    for (product = libgen.GetProducts(); product; ++product)
    {
        std::string smi;
        OECreateCanSmiString(smi, product);
        cout << "smiles = " << smi << endl;
    }

    return 0;
}
In the amide bond forming reaction, a hydrogen atom attached to the nitrogen in the amine pattern is explicitly deleted when forming the product. When executed, the example generates two products in total, one for each of the two equivalent protons attached to the amine nitrogen. If a unique set of products is desired, the canonical SMILES string of each product may be stored and used to check whether a newly generated product has already been seen.
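One way to keep only unique products is sketched below. It is a minimal illustration rather than an OEChem facility: the hypothetical helper `UniqueProducts` assumes the canonical SMILES strings have already been collected, for example from `OECreateCanSmiString` inside the product loop, and it works on plain strings so that it can be compiled without the toolkit.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not an OEChem function): returns the input strings
// with duplicates removed, keeping the first occurrence of each string and
// preserving the original order.
std::vector<std::string> UniqueProducts(
    const std::vector<std::string>& canonicalSmiles)
{
    std::set<std::string> seen;
    std::vector<std::string> unique;
    for (const std::string& smi : canonicalSmiles) {
        // insert() reports whether the string was new to the set.
        if (seen.insert(smi).second)
            unique.push_back(smi);
    }
    assert(unique.size() == seen.size());
    return unique;
}
```

Because the comparison is made on strings, this only works if every product is written with the same canonical SMILES settings.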
The following example demonstrates how the same basic reaction given in the previous example can be carried out using the implicit hydrogen correction mode. Notice that no explicit hydrogens appear in the reaction. Instead, the SMARTS implicit hydrogen count operator appears on the right-hand side of the reaction and is used to assign the implicit hydrogen count of the product nitrogen.
#include "openeye.h"
#include "oechem.h"
#include <iostream>
#include <string>

using namespace std;
using namespace OEChem;
using namespace OESystem;

int main()
{
    // No explicit hydrogens in the pattern: the 'h1' primitive sets the
    // implicit hydrogen count of the product nitrogen.
    OELibraryGen libgen("[O:1]=[C:2][Cl:3].[N:4]>>[O:1]=[C:2][Nh1:4]");
    // Must be set before any starting materials are assigned.
    libgen.SetExplicitHydrogens(false);

    // Reactant 0: the acid chloride.
    OEGraphMol mol;
    OEParseSmiles(mol, "CC(=O)Cl");
    libgen.SetStartingMaterial(mol, 0);

    // Reactant 1: the amine.
    mol.Clear();
    OEParseSmiles(mol, "NCC");
    libgen.SetStartingMaterial(mol, 1);

    // Print a canonical SMILES string for every generated product.
    OEIter<OEMolBase> product;
    for (product = libgen.GetProducts(); product; ++product)
    {
        std::string smi;
        OECreateCanSmiString(smi, product);
        cout << "smiles = " << smi << endl;
    }

    return 0;
}
The reaction is written to work with implicit hydrogens (using the
lowercase 'h' primitive), and the OELibraryGen instance is set
to work in implicit hydrogen mode using the
OELibraryGen::SetExplicitHydrogens method.
The final example demonstrates automatic valence correction. In implicit hydrogen mode (set using the OELibraryGen::SetExplicitHydrogens method), automatic valence correction attempts to add or subtract implicit hydrogens in order to retain the valence state observed in the starting materials. Before chemical transformations commence, the valence state of each reacting atom is recorded. After the transform operations are complete, the implicit hydrogen count is adjusted to match the beginning state of the reacting atoms. Changes in formal charge are taken into account during the valence correction.
#include "openeye.h"
#include "oechem.h"
#include <iostream>
#include <string>

using namespace std;
using namespace OEChem;
using namespace OESystem;

int main()
{
    // Neither explicit hydrogens nor hydrogen counts appear in the
    // pattern: valence correction adjusts the product hydrogen counts.
    OELibraryGen libgen("[O:1]=[C:2][Cl:3].[N:4]>>[O:1]=[C:2][N:4]");
    // Must be set before any starting materials are assigned.
    libgen.SetExplicitHydrogens(false);
    libgen.SetValenceCorrection(true);

    // Reactant 0: the acid chloride.
    OEGraphMol mol;
    OEParseSmiles(mol, "CC(=O)Cl");
    libgen.SetStartingMaterial(mol, 0);

    // Reactant 1: the amine.
    mol.Clear();
    OEParseSmiles(mol, "NCC");
    libgen.SetStartingMaterial(mol, 1);

    // Print a canonical SMILES string for every generated product.
    OEIter<OEMolBase> product;
    for (product = libgen.GetProducts(); product; ++product)
    {
        std::string smi;
        OECreateCanSmiString(smi, product);
        cout << "smiles = " << smi << endl;
    }

    return 0;
}
In general, automatic valence correction is a convenience that allows straightforward reactions to be written in a simplified manner and reduces the burden of valence state bookkeeping. Reactions that alter the preferred valence state of an atom, oxidation for example, may not be automatically correctable.
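The bookkeeping that automatic valence correction saves can be illustrated with a toy model. The sketch below is an assumption-laden simplification and not the OELibraryGen implementation: the `AtomState` struct and both functions are hypothetical, valence is taken to be the sum of bond orders plus the implicit hydrogen count, and formal charge changes (which the real correction takes into account) are ignored.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical record of a reacting atom before the transformation.
struct AtomState {
    int bondOrderSum;  // sum of bond orders to non-hydrogen neighbours
    int implicitH;     // implicit hydrogen count
};

// Toy definition of valence: bonds to heavy atoms plus implicit hydrogens.
int Valence(const AtomState& atom)
{
    return atom.bondOrderSum + atom.implicitH;
}

// Implicit hydrogen count that restores the recorded valence once the
// bonds of the product atom are known. Never returns a negative count.
int CorrectedImplicitH(const AtomState& before, int bondOrderSumAfter)
{
    assert(bondOrderSumAfter >= 0);
    return std::max(0, Valence(before) - bondOrderSumAfter);
}
```

In the amide example, the amine nitrogen starts with one bond to carbon and two hydrogens (valence 3). After the reaction it has two bonds to carbon, so this toy model assigns it one hydrogen, the same count that the implicit hydrogen example wrote out by hand as `h1`.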
OELibraryGen objects are normally initialized with a SMIRKS pattern. A
boolean argument is used to specify whether the SMIRKS string should
be interpreted using strict SMIRKS semantics. Here strict means in
full compliance with the SMIRKS language defined by its originator,
Daylight CIS, Inc. If the default value of true is used, mapped atoms
in the products must correspond to mapped atoms in the reactants. A
mapped product atom without a corresponding mapped reactant atom makes
the SMIRKS invalid, and the OELibraryGen instance will fail to
initialize. Strict SMIRKS also requires unmapped reactant atoms to be
destroyed in the reaction. Passing a boolean value of false as the
second argument relaxes both of these strict SMIRKS restrictions.
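The first strict SMIRKS restriction can be illustrated with a small stand-alone check on the reaction string. This is a rough sketch and not how OEChem validates SMIRKS: both functions are hypothetical, only map indices of the form `:n` inside square brackets are recognised, and recursive SMARTS or other nested bracket expressions are not handled.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <string>

// Hypothetical helper: collects the atom map indices (digits following
// ':' inside a bracket atom) found in one side of a reaction string.
std::set<int> MapIndices(const std::string& part)
{
    std::set<int> maps;
    bool inBracket = false;
    for (std::size_t i = 0; i < part.size(); ++i) {
        if (part[i] == '[') {
            inBracket = true;
        } else if (part[i] == ']') {
            inBracket = false;
        } else if (part[i] == ':' && inBracket) {
            int value = 0;
            bool hasDigits = false;
            std::size_t j = i + 1;
            while (j < part.size() &&
                   std::isdigit(static_cast<unsigned char>(part[j]))) {
                value = value * 10 + (part[j] - '0');
                hasDigits = true;
                ++j;
            }
            if (hasDigits)
                maps.insert(value);
            i = j - 1;  // resume scanning after the digits
        }
    }
    return maps;
}

// Returns true if every map index on the product side also appears on
// the reactant side. Returns false for strings without two '>' marks.
bool ProductMapsHaveReactantPartners(const std::string& smirks)
{
    std::size_t first = smirks.find('>');
    std::size_t last = smirks.rfind('>');
    if (first == std::string::npos || first == last)
        return false;
    assert(first < last);
    std::set<int> reactants = MapIndices(smirks.substr(0, first));
    std::set<int> products = MapIndices(smirks.substr(last + 1));
    for (int index : products) {
        if (reactants.count(index) == 0)
            return false;
    }
    return true;
}
```

The amide reaction used in the earlier examples passes this check, while a string such as `[C:1]>>[C:1][O:2]` does not, because map index 2 appears only on the product side.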
The AddStartingMaterial and SetStartingMaterial methods
are used to initialize the starting materials corresponding to a
reaction component (reactant). An iterator over molecules or a single
molecule may be passed as the first argument to the methods.
Subsequent calls to the AddStartingMaterial method append to
the list of starting materials set in prior calls. The second
argument specifies the reactant by number, starting with zero, to
which the starting materials correspond. These numbers correspond
with the left to right lexical ordering of reactants in the SMIRKS.
The final argument is used to control the pattern matching of the reactant
pattern to the starting material. If the value passed is true, only
matches that contain a unique set of atoms relative to previously
identified matches are used. If the value is false, every possible
match, including those related by symmetry, will be used. Reactant
patterns are matched uniquely by default.
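The difference between the two matching modes can be sketched as a filter over a list of matches. The function below is hypothetical and is not the OEChem matching code: it assumes each match is given as the list of starting material atom indices hit by the pattern atoms, in pattern order, and it treats two matches as duplicates when they cover the same set of atoms.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Hypothetical helper (not an OEChem function): with unique == false all
// matches are returned; with unique == true a match is kept only if its
// set of atoms differs from that of every earlier match.
std::vector<std::vector<int>> FilterMatches(
    const std::vector<std::vector<int>>& matches, bool unique)
{
    if (!unique)
        return matches;

    std::set<std::vector<int>> seenAtomSets;
    std::vector<std::vector<int>> kept;
    for (const std::vector<int>& match : matches) {
        // Sorting removes the dependence on pattern atom order, so that
        // symmetry-related matches compare equal.
        std::vector<int> atoms(match);
        std::sort(atoms.begin(), atoms.end());
        if (seenAtomSets.insert(atoms).second)
            kept.push_back(match);
    }
    assert(kept.size() <= matches.size());
    return kept;
}
```

Under this model, the matches {0, 1} and {1, 0} of a symmetric two-atom pattern collapse to one, whereas the matches of `[N:4][H:5]` onto the two amine hydrogens cover different atoms and are both kept, consistent with the two products generated by the explicit hydrogen example.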
The SetExplicitHydrogens method sets the hydrogen handling mode for the
OELibraryGen instance. OELibraryGen instances are
constructed by default with the explicit hydrogen mode set to true.
Reactions may be executed using either implicit or explicit hydrogens
represented in the starting materials for a reaction. If the value is
true, the OELibraryGen instance will add explicit hydrogens to
reactant molecules when they are initialized using either of the
SetStartingMaterial methods. If the value is false, then both of the
SetStartingMaterial methods will suppress any explicit hydrogens in the
reactant molecules, and simply retain the implicit hydrogen counts for
the remaining non-hydrogen atoms. The hydrogen handling mode must be
assigned prior to calling SetStartingMaterial. Calling
SetExplicitHydrogens after SetStartingMaterial will have no
effect. Note that the explicit hydrogen setting in effect modifies
the semantics of SMIRKS. If the programmer wishes to implement strict
SMIRKS in full according to the Daylight standard, explicit hydrogen
mode should be turned on.
The SetValenceCorrection method controls the valence correction
mode setting of an OELibraryGen instance. OELibraryGen
instances are constructed by default with the valence correction mode
set to false. Valence correction mode can be turned on by passing a
boolean true value to an OELibraryGen instance using this
method. When valence correction mode is enabled, the
OELibraryGen instance will attempt to adjust the
hydrogen count on atoms in the product molecule that are involved in
the reaction to match the original valence state of the reactant. For
product atoms that do not undergo a nuclear reaction (atomic number is
retained), the hydrogen count is either increased or decreased to match
the initial valence state of the corresponding reactant atom. Formal
charge is taken into account during the hydrogen count adjustment.
Note that valence correction in effect modifies the semantics of
SMIRKS. Thus, if the programmer wishes to implement strict SMIRKS
in full according to the Daylight standard, valence correction
should be turned off.