Pattern matching in OEChem is always done using query molecules or
query graphs. Non-query molecules, i.e., those derived
directly from OEMolBase or OEMCMolBase, must be
converted into a query molecule. This conversion is
controlled by the values in the OEExprOpts namespace.
Expression options can be specified either in the constructor for an
OEQMol or through the convenience constructors of the pattern
matching classes (OESubSearch, OEMCSSearch,
and OECliqueSearch), which take expression options as arguments.
Figure 18.9 shows an example in which a maximum common substructure search is performed using the OEExprOpts_DefaultAtoms and OEExprOpts_DefaultBonds options.
The OEExprOpts_DefaultAtoms option means that two atoms are considered equivalent, i.e., they can be mapped to each other, if they have the same atomic number, aromaticity, and formal charge. The OEExprOpts_DefaultBonds option means that two bonds can be mapped to each other if they have the same bond order and aromaticity.
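To make these two rules concrete, the following is a toy model in plain Python. It is only an illustration of the equivalence rules stated above, not how OEChem implements them; the `Atom` and `Bond` tuples and both function names are hypothetical stand-ins and are not OEChem types.

```python
from collections import namedtuple

# Toy stand-ins for atoms and bonds (hypothetical, not OEChem classes).
Atom = namedtuple("Atom", "atomic_num aromatic formal_charge")
Bond = namedtuple("Bond", "order aromatic")


def default_atoms_equivalent(a, b):
    """Model of DefaultAtoms: same atomic number, aromaticity, and formal charge."""
    return (
        a.atomic_num == b.atomic_num
        and a.aromatic == b.aromatic
        and a.formal_charge == b.formal_charge
    )


def default_bonds_equivalent(a, b):
    """Model of DefaultBonds: same bond order and aromaticity."""
    return a.order == b.order and a.aromatic == b.aromatic


aromatic_c = Atom(atomic_num=6, aromatic=True, formal_charge=0)
aliphatic_c = Atom(atomic_num=6, aromatic=False, formal_charge=0)
aromatic_n = Atom(atomic_num=7, aromatic=True, formal_charge=0)
```

In this model an aromatic carbon matches neither an aliphatic carbon (different aromaticity) nor an aromatic nitrogen (different atomic number).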
1 #!/usr/bin/env python
2
3 from openeye.oechem import *
4 import sys
5
6 pattern = OEGraphMol()
7 target = OEGraphMol()
8 OEParseSmiles(pattern, "c1(cc(nc2c1C(CCC2)Cl)CCl)O")
9 OEParseSmiles(target, "c1(c2c(nc(n1)CF)COC=C2)N")
10
11 atomexpr = OEExprOpts_DefaultAtoms
12 bondexpr = OEExprOpts_DefaultBonds
13
14 patternQ = OEQMol(pattern)
15 # generate query with atom and bond expression options
16 patternQ.BuildExpressions(atomexpr, bondexpr)
17 mcss = OEMCSSearch(patternQ)
18
19 unique = True
20 count = 1
21 # loop over matches
22 for match in mcss.Match(target,unique):
23     sys.stdout.write("\nMatch %d :" % count)
24     sys.stdout.write("\nNumber of matched atoms: %d " % match.NumAtoms())
25     sys.stdout.write("\nNumber of matched bonds: %d " % match.NumBonds())
26     # create match subgraph
27     m = OEGraphMol()
28     OESubsetMol(m,match,True)
29     smi = OECreateCanSmiString(m)
30     sys.stdout.write("\nmatch smiles = %s \n" % smi)
31     count += 1
The best way to understand how various atom and bond expressions influence
pattern matching is to change the atom expression (line 11) and the
bond expression (line 12) in the numbered listing above.
After the pattern molecule is constructed, the OEQMolBase.BuildExpressions call (line 16) defines the level of atom and bond matching between the pattern molecule and any target molecule.
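One way to experiment with different options is to wrap the search in a small helper. The sketch below is a minimal example under stated assumptions: the function name `mcs_smiles` is hypothetical, and it relies on the `OEMCSSearch` convenience constructor that takes the atom and bond expression options directly, with the search type left at its default. The guarded import only lets the module load when the toolkit is absent; calling the helper still requires a working, licensed OpenEye installation.

```python
try:
    from openeye import oechem
except ImportError:  # module still loads, but the helper below cannot be called
    oechem = None


def mcs_smiles(pattern_smiles, target_smiles, atomexpr, bondexpr, unique=True):
    """Return the canonical SMILES of every MCS match (hypothetical helper)."""
    pattern = oechem.OEGraphMol()
    target = oechem.OEGraphMol()
    if not oechem.OEParseSmiles(pattern, pattern_smiles):
        raise ValueError("cannot parse pattern SMILES: %s" % pattern_smiles)
    if not oechem.OEParseSmiles(target, target_smiles):
        raise ValueError("cannot parse target SMILES: %s" % target_smiles)

    # Convenience constructor: the query expressions are built from the
    # given options, so no explicit OEQMol is needed here.
    mcss = oechem.OEMCSSearch(pattern, atomexpr, bondexpr)

    results = []
    for match in mcss.Match(target, unique):
        # Extract the matched atoms and bonds as their own molecule.
        subgraph = oechem.OEGraphMol()
        oechem.OESubsetMol(subgraph, match, True)
        results.append(oechem.OECreateCanSmiString(subgraph))
    return results
```

Calling `mcs_smiles(p, t, oechem.OEExprOpts_DefaultAtoms, oechem.OEExprOpts_DefaultBonds)` with the two SMILES strings from the listing mirrors the options used there; passing other option values changes which atoms and bonds may be mapped.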
By modifying the atom and bond expression options, very diverse pattern matching can be performed. Figures 18.10 through 18.14 show several examples in which maximum common substructure searches are performed for the same query and target molecules but with different atom and bond expression options.
In the first example, in Figure 18.10, the OEExprOpts_ExactAtoms expression option is used to discriminate more strictly between atoms: atoms can be mapped to each other only if they have the same degree, number of hydrogens, chirality, mass, and ring membership, in addition to meeting the requirements of the OEExprOpts_DefaultAtoms option.
Figures 18.11 through 18.14 show examples in which the discrimination of OEExprOpts_DefaultAtoms is relaxed by adding various modifiers. For example, with the OEExprOpts_EqAromatic modifier, atoms in any aromatic ring system are considered equivalent. As a result, the pyridine and pyrimidine rings can be mapped to each other in Figure 18.11. Similarly, OEExprOpts_EqHalogen (Figure 18.12) and OEExprOpts_EqONS (Figure 18.13) define equivalence among halogen atoms and among oxygen, nitrogen, and sulfur atoms, respectively. With OEExprOpts_EqCAliphaticONS (Figure 18.14), an aliphatic query carbon atom is considered equivalent to any oxygen, nitrogen, or sulfur atom.
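In code, a modifier is added to a base option with a bitwise OR, since the expression options are integer flags. The sketch below collects the four relaxed atom options named above; `add_modifiers` and `relaxed_atom_options` are hypothetical helper names, and the bond options used in the figures are not stated here, so none are assumed. As before, the guarded import only lets the module load without the toolkit.

```python
try:
    from openeye import oechem
except ImportError:  # module still loads, but relaxed_atom_options() cannot be called
    oechem = None


def add_modifiers(base, *modifiers):
    """OR each modifier flag into the base expression option."""
    combined = base
    for flag in modifiers:
        combined |= flag
    return combined


def relaxed_atom_options():
    """Map each modifier name to DefaultAtoms plus that modifier (hypothetical helper)."""
    base = oechem.OEExprOpts_DefaultAtoms
    return {
        "EqAromatic": add_modifiers(base, oechem.OEExprOpts_EqAromatic),
        "EqHalogen": add_modifiers(base, oechem.OEExprOpts_EqHalogen),
        "EqONS": add_modifiers(base, oechem.OEExprOpts_EqONS),
        "EqCAliphaticONS": add_modifiers(base, oechem.OEExprOpts_EqCAliphaticONS),
    }
```

Each value can be passed wherever an atom expression option is expected, for example as the first argument of `BuildExpressions`.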
Similar modifiers exist for altering bond equivalence. Figure 18.15 shows an example in which single and double bonds are considered identical when the OEExprOpts_EqSingleDouble modifier is used.
The last example in Figure 18.16 represents a very unrestrained search, where both the atom and bond expression options have weak discrimination power.
Even though only maximum common substructure search examples are presented here, atom
and bond expression options can be used in the same way with substructure searches and
clique detection.
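As a brief illustration of the substructure case, the sketch below passes the same kind of options to `OESubSearch`. It is a minimal example, not a recommended workflow: the helper name `has_substructure` is hypothetical, and it assumes the `OESubSearch` convenience constructor accepts a molecule followed by the atom and bond expression options.

```python
try:
    from openeye import oechem
except ImportError:  # module still loads, but the helper below cannot be called
    oechem = None


def has_substructure(pattern_smiles, target_smiles, atomexpr, bondexpr):
    """Return True if the pattern occurs in the target under the given options."""
    pattern = oechem.OEGraphMol()
    target = oechem.OEGraphMol()
    if not oechem.OEParseSmiles(pattern, pattern_smiles):
        raise ValueError("cannot parse pattern SMILES: %s" % pattern_smiles)
    if not oechem.OEParseSmiles(target, target_smiles):
        raise ValueError("cannot parse target SMILES: %s" % target_smiles)

    # The expression options decide which atoms and bonds count as equivalent.
    subsearch = oechem.OESubSearch(pattern, atomexpr, bondexpr)
    return subsearch.SingleMatch(target)
```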
For a full description of the expression options and their usage, please
refer to the OEExprOpts namespace section under
OEChem namespaces in the OEChem C++ API document.