18.4 OEExprOpts Namespace

Pattern matching in OEChem is always done using query molecules or query graphs. Non-query molecules, i.e. those derived directly from OEMolBase or OEMCMolBase, must be converted into a query molecule. This conversion is controlled by the values in the OEExprOpts namespace. Expression options can be specified either in the constructor for an OEQMol or through the convenience constructors of the pattern matching classes (OESubSearch, OEMCSSearch, and OECliqueSearch), which take expression options as arguments.

Figure 18.9 shows an example where maximum common substructure search is performed using the OEExprOpts_DefaultAtoms and OEExprOpts_DefaultBonds options.

Figure 18.9: Example of maximum common substructure search with DefaultAtoms and DefaultBonds

The OEExprOpts_DefaultAtoms option means that two atoms are considered equivalent, i.e. they can be mapped to each other, if they have the same atomic number, aromaticity, and formal charge. The OEExprOpts_DefaultBonds option means that two bonds can be mapped to each other if they have the same bond order and aromaticity.
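The two default criteria can be pictured with a small library-independent sketch. This is a toy model, not OEChem code: the ToyAtom and ToyBond classes and both comparison functions are hypothetical names introduced only for illustration, and they simply restate the properties listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyAtom:
    """Hypothetical stand-in for an atom; not an OEChem class."""
    atomic_num: int
    aromatic: bool = False
    formal_charge: int = 0


@dataclass(frozen=True)
class ToyBond:
    """Hypothetical stand-in for a bond; not an OEChem class."""
    order: int
    aromatic: bool = False


def default_atoms_equivalent(a, b):
    # The three DefaultAtoms criteria named in the text:
    # atomic number, aromaticity, and formal charge.
    return (a.atomic_num == b.atomic_num
            and a.aromatic == b.aromatic
            and a.formal_charge == b.formal_charge)


def default_bonds_equivalent(a, b):
    # The two DefaultBonds criteria named in the text:
    # bond order and aromaticity.
    return a.order == b.order and a.aromatic == b.aromatic


aromatic_c = ToyAtom(6, aromatic=True)
aliphatic_c = ToyAtom(6, aromatic=False)
aromatic_n = ToyAtom(7, aromatic=True)

# Same element but different aromaticity: not equivalent.
c_vs_c = default_atoms_equivalent(aromatic_c, aliphatic_c)
# Same aromaticity but different element: not equivalent.
c_vs_n = default_atoms_equivalent(aromatic_c, aromatic_n)
# A single bond and a double bond differ in bond order.
single_vs_double = default_bonds_equivalent(ToyBond(1), ToyBond(2))
```

Under this model an aromatic carbon cannot be mapped to an aliphatic carbon or to an aromatic nitrogen, which is why the default options give a fairly strict match.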

 1 #!/usr/bin/env python
 2
 3 from openeye.oechem import *
 4 import os,sys
 5
 6 pattern = OEGraphMol()
 7 target  = OEGraphMol()
 8 OEParseSmiles(pattern, "c1(cc(nc2c1C(CCC2)Cl)CCl)O")
 9 OEParseSmiles(target,  "c1(c2c(nc(n1)CF)COC=C2)N")
10
11 atomexpr = OEExprOpts_DefaultAtoms           #
12 bondexpr = OEExprOpts_DefaultBonds           #
13
14 patternQ = OEQMol(pattern)
15 # generate query with atom and bond expression options
16 patternQ.BuildExpressions(atomexpr,bondexpr) #
17 mcss = OEMCSSearch(patternQ)
18
19 unique = True
20 count  = 1
21 # loop over matches
22 for match in mcss.Match(target,unique):
23     sys.stdout.write("\nMatch %d :" % count)
24     sys.stdout.write("\nNumber of matched atoms: %d " % match.NumAtoms())
25     sys.stdout.write("\nNumber of matched bonds: %d " % match.NumBonds())
26     # create match subgraph
27     m = OEGraphMol()
28     OESubsetMol(m,match,True)
29     smi = OECreateCanSmiString(m)
30     sys.stdout.write("\nmatch smiles = %s \n" % smi)
31     count += 1

Listing 18.5: MCSS with atom and bond expressions

The best way to understand how the various atom and bond expressions influence pattern matching is to change the atom expression (line 11) and the bond expression (line 12) in Listing 18.5 and compare the resulting matches.

After the query molecule has been constructed from the pattern molecule, the OEQMolBase.BuildExpressions method (line 16) defines the level of atom and bond matching between the pattern molecule and any target molecule.

By modifying the atom and bond expression options, a wide range of pattern matching behavior can be obtained. Figure 18.10 - Figure 18.14 show several examples where maximum common substructure searches are performed for the same query and target molecules, but with various atom and bond expression options.

In the first example, in Figure 18.10, the OEExprOpts_ExactAtoms expression option is used to discriminate more strictly between atoms, i.e. atoms can only be mapped to each other if they have the same degree, number of hydrogens, chirality, mass, and ring membership, in addition to meeting the requirements of the OEExprOpts_DefaultAtoms option.

Figure 18.10: ExactAtoms and DefaultBonds

Figure 18.11 - Figure 18.14 show examples where the discrimination capability of OEExprOpts_DefaultAtoms is decreased by adding various modifiers. For example, with the OEExprOpts_EqAromatic modifier, atoms in any aromatic ring system are considered equivalent. As a result, the pyridine and pyrimidine rings can be mapped to each other in Figure 18.11. Similarly, OEExprOpts_EqHalogen (Figure 18.12) and OEExprOpts_EqONS (Figure 18.13) define equivalency between halogen atoms and between oxygen, nitrogen, and sulfur atoms, respectively. With OEExprOpts_EqCAliphaticONS (Figure 18.14), an aliphatic query carbon atom is considered equivalent to any oxygen, nitrogen, or sulfur atom.
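The way modifiers are combined with the bitwise OR operator, and how each one loosens the atom comparison, can be sketched with a toy model. Nothing here is OEChem code: ToyOpts, ToyAtom, and atoms_equivalent are hypothetical names, the flag values are arbitrary rather than the real OEExprOpts constants, and the sketch assumes that a modifier relaxes only the element comparison while aromaticity and formal charge must still agree. The halogen set is limited to F, Cl, Br, and I as a further assumption.

```python
from dataclasses import dataclass
from enum import IntFlag


class ToyOpts(IntFlag):
    """Hypothetical flags; values are NOT the real OEExprOpts constants."""
    DEFAULT_ATOMS = 1
    EQ_AROMATIC = 2
    EQ_HALOGEN = 4
    EQ_ONS = 8
    EQ_C_ALIPHATIC_ONS = 16


HALOGENS = {9, 17, 35, 53}  # F, Cl, Br, I (assumed set)
ONS = {7, 8, 16}            # N, O, S


@dataclass(frozen=True)
class ToyAtom:
    """Hypothetical stand-in for an atom; not an OEChem class."""
    atomic_num: int
    aromatic: bool = False
    formal_charge: int = 0


def elements_match(query, target, opts):
    q, t = query.atomic_num, target.atomic_num
    if q == t:
        return True
    # Each modifier adds one more way for differing elements to match.
    if opts & ToyOpts.EQ_AROMATIC and query.aromatic and target.aromatic:
        return True
    if opts & ToyOpts.EQ_HALOGEN and q in HALOGENS and t in HALOGENS:
        return True
    if opts & ToyOpts.EQ_ONS and q in ONS and t in ONS:
        return True
    # Directional: only an aliphatic *query* carbon matches O, N, or S.
    if (opts & ToyOpts.EQ_C_ALIPHATIC_ONS and q == 6
            and not query.aromatic and t in ONS):
        return True
    return False


def atoms_equivalent(query, target, opts):
    # Assumption: aromaticity and formal charge are still compared strictly.
    return (elements_match(query, target, opts)
            and query.aromatic == target.aromatic
            and query.formal_charge == target.formal_charge)


chlorine, fluorine = ToyAtom(17), ToyAtom(9)
strict = ToyOpts.DEFAULT_ATOMS
relaxed = ToyOpts.DEFAULT_ATOMS | ToyOpts.EQ_HALOGEN

strict_result = atoms_equivalent(chlorine, fluorine, strict)
relaxed_result = atoms_equivalent(chlorine, fluorine, relaxed)
```

In this model chlorine and fluorine are rejected under the strict options and accepted once the halogen modifier is ORed in, which mirrors how the chlorine atoms of the pattern and the fluorine atom of the target in Listing 18.5 could become candidates for mapping.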

Figure 18.11: DefaultAtoms|EqAromatic and DefaultBonds

Figure 18.12: DefaultAtoms|EqHalogen and DefaultBonds

Figure 18.13: DefaultAtoms|EqONS and DefaultBonds

Figure 18.14: DefaultAtoms|EqCAliphaticONS and DefaultBonds

Similar modifiers exist for altering bond equivalency. Figure 18.15 shows an example where single and double bonds are considered identical because the OEExprOpts_EqSingleDouble modifier is used.
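The effect of a bond modifier can be sketched in the same library-independent way. ToyBondOpts, ToyBond, and bonds_equivalent are hypothetical names, the flag values are arbitrary, and the sketch assumes that the modifier relaxes only the bond order comparison while bond aromaticity is still compared strictly.

```python
from dataclasses import dataclass
from enum import IntFlag


class ToyBondOpts(IntFlag):
    """Hypothetical flags; values are NOT the real OEExprOpts constants."""
    DEFAULT_BONDS = 1
    EQ_SINGLE_DOUBLE = 2


@dataclass(frozen=True)
class ToyBond:
    """Hypothetical stand-in for a bond; not an OEChem class."""
    order: int
    aromatic: bool = False


def bonds_equivalent(query, target, opts):
    same_order = query.order == target.order
    if opts & ToyBondOpts.EQ_SINGLE_DOUBLE:
        # A single/double pair is accepted as if the orders were equal.
        same_order = same_order or {query.order, target.order} == {1, 2}
    # Assumption: aromaticity is still compared strictly.
    return same_order and query.aromatic == target.aromatic


single, double, triple = ToyBond(1), ToyBond(2), ToyBond(3)
relaxed = ToyBondOpts.DEFAULT_BONDS | ToyBondOpts.EQ_SINGLE_DOUBLE

default_result = bonds_equivalent(single, double, ToyBondOpts.DEFAULT_BONDS)
relaxed_result = bonds_equivalent(single, double, relaxed)
triple_result = bonds_equivalent(single, triple, relaxed)
```

Only the single/double pair becomes interchangeable in this model; a triple bond still matches nothing but another triple bond.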

Figure 18.15: DefaultAtoms and DefaultBonds|EqSingleDouble

The last example in Figure 18.16 represents a very unrestrained search, where both the atom and bond expression options have weak discrimination power.

Figure 18.16: DefaultAtoms|EqAromatic|EqCAliphaticONS|EqHalogen|EqONS and DefaultBonds|EqSingleDouble

Even though only maximum common substructure search examples are presented here, atom and bond expression options can be used in the same way with substructure searches or clique detection. For a full description of the expression options and their usage, please refer to the OEExprOpts namespace section in the OEChem namespaces of the OEChem C++ API document.