17.4 OEExprOpts Namespace

Pattern matching in OEChem is always done using query molecules or query graphs. Non-query molecules, i.e. those that are derived directly from OEMolBase or OEMCMolBase, must be converted into a query molecule. This conversion is controlled by the values in the OEExprOpts namespace. Expression options can be specified either in the constructor for an OEQMol or through the convenience constructors of the pattern matching classes (OESubSearch, OEMCSSearch, and OECliqueSearch), which take expression options as arguments.

Figure 17.9 shows an example in which a maximum common substructure search is performed using the OEExprOpts::DefaultAtoms and OEExprOpts::DefaultBonds options.

Figure 17.9: Example of maximum common substructure search with DefaultAtoms and DefaultBonds

The OEExprOpts::DefaultAtoms option means that two atoms are considered equivalent, i.e. they can be mapped to each other, if they have the same atomic number, aromaticity, and formal charge. The OEExprOpts::DefaultBonds option means that two bonds can be mapped to each other if they have the same bond order and aromaticity.
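
The two rules above can be restated as plain predicates. The following is a minimal sketch that uses only the standard library; ToyAtom, ToyBond, and both functions are hypothetical names invented for this illustration and are not part of OEChem, and the code does not claim to reflect how OEChem evaluates expressions internally.

```cpp
#include <cassert>

// Toy stand-ins for atom and bond properties. These are NOT OEChem types.
struct ToyAtom {
  unsigned int atomicNum;
  bool aromatic;
  int formalCharge;
};

struct ToyBond {
  unsigned int order;
  bool aromatic;
};

// DefaultAtoms rule as described in the text: same atomic number,
// same aromaticity, and same formal charge.
inline bool DefaultAtomsEquivalent(const ToyAtom &q, const ToyAtom &t)
{
  return q.atomicNum == t.atomicNum &&
         q.aromatic == t.aromatic &&
         q.formalCharge == t.formalCharge;
}

// DefaultBonds rule as described in the text: same bond order and
// same aromaticity.
inline bool DefaultBondsEquivalent(const ToyBond &q, const ToyBond &t)
{
  return q.order == t.order && q.aromatic == t.aromatic;
}

// Usage example: an aromatic carbon maps to another aromatic carbon,
// but not to an aliphatic carbon.
inline void ToyDefaultExample()
{
  const ToyAtom aromaticC{6, true, 0};
  const ToyAtom aliphaticC{6, false, 0};
  assert(DefaultAtomsEquivalent(aromaticC, aromaticC));
  assert(!DefaultAtomsEquivalent(aromaticC, aliphaticC));
}
```

Under this reading, a single differing property (for instance a formal charge of +1 versus 0) is enough to prevent two atoms from being mapped to each other.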

 1 #include "openeye.h"
 2 #include "oechem.h"
 3 #include "oeplatform.h"
 4
 5 using namespace OEPlatform;
 6 using namespace OEChem;
 7 using namespace OESystem;
 8
 9 int main()
10 {
11   OEGraphMol pattern,target;
12   OEParseSmiles(pattern, "c1(cc(nc2c1C(CCC2)Cl)CCl)O");
13   OEParseSmiles(target,  "c1(c2c(nc(n1)CF)COC=C2)N");
14
15   unsigned int atomexpr = OEExprOpts::DefaultAtoms;
16   unsigned int bondexpr = OEExprOpts::DefaultBonds;
17
18   OEQMol patternQ(pattern);
19   // generate query with atom and bond expression options
20   patternQ.BuildExpressions(atomexpr,bondexpr);
21   OEMCSSearch mcss(patternQ.QMol());
22
23   bool unique = true;
24   unsigned int count = 1;
25   // loop over matches
26   for (OEIter<OEMatchBase> match = mcss.Match(target,unique);match;++match)
27   {
28     oeout << "Match " << count << ':' << oeendl;
29     oeout << "Number of matched atoms: " << match->NumAtoms() << oeendl;
30     oeout << "Number of matched bonds: " << match->NumBonds() << oeendl;
31     // create match subgraph
32     OEGraphMol m;
33     OESubsetMol(m,match,true);
34     std::string smi;
35     OECreateSmiString(smi,m);
36     oeout << "match smiles = " << smi << oeendl;
37     ++count;
38   }
39   return 0;
40 }

Listing 17.5: MCSS with atom and bond expression

The best way to understand how various atom and bond expressions influence pattern matching is to change the atom expression (line 15) and the bond expression (line 16) in Listing 17.5 and compare the resulting matches.

After the pattern molecule is constructed, the OEQMolBase::BuildExpressions method (line 20) defines the level of atom and bond matching between the pattern molecule and any target molecule.

By modifying the atom and bond expression options, very diverse pattern matching can be performed. Figure 17.10 through Figure 17.14 show several examples in which maximum common substructure searches are performed for the same query and target molecules, but with different atom and bond expression options.

In the first example, in Figure 17.10, the OEExprOpts::ExactAtoms expression option is used to discriminate more strictly between atoms: atoms can only be mapped to each other if they have the same degree, number of hydrogens, chirality, mass, and ring membership, in addition to meeting the requirements of the OEExprOpts::DefaultAtoms option.

Figure 17.10: ExactAtoms and DefaultBonds

Figure 17.11 through Figure 17.14 show examples in which the discrimination capability of OEExprOpts::DefaultAtoms is decreased by adding various modifiers. For example, with the OEExprOpts::EqAromatic modifier, atoms in any aromatic ring system are considered equivalent. As a result, the pyridine and pyrimidine rings can be mapped to each other in Figure 17.11. Similarly, OEExprOpts::EqHalogen (Figure 17.12) and OEExprOpts::EqONS (Figure 17.13) define equivalency among halogen atoms and among oxygen, nitrogen, and sulfur atoms, respectively. With OEExprOpts::EqCAliphaticONS (Figure 17.14), an aliphatic query carbon atom is considered equivalent to any oxygen, nitrogen, or sulfur atom.
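
The way a modifier widens atom equivalence when it is combined with the default option by bitwise OR can be sketched with a toy model. Everything below is an assumption made for illustration: the ToyExprOpts flag names only echo OEExprOpts, their numeric values are invented, and the decision to keep comparing aromaticity and formal charge when a modifier is set is a guess rather than documented OEChem behavior. EqCAliphaticONS is left out to keep the sketch short.

```cpp
#include <cassert>

// Hypothetical bit flags. The names echo OEExprOpts, but the values and the
// matching logic below are assumptions, not the OEChem implementation.
namespace ToyExprOpts {
  const unsigned int DefaultAtoms = 0x1;  // baseline comparison, always applied here
  const unsigned int EqAromatic   = 0x2;
  const unsigned int EqHalogen    = 0x4;
  const unsigned int EqONS        = 0x8;
}

// Toy atom record. This is NOT an OEChem type.
struct ToyAtom {
  unsigned int atomicNum;
  bool aromatic;
  int formalCharge;
};

inline bool IsHalogen(unsigned int z)
{
  return z == 9 || z == 17 || z == 35 || z == 53;  // F, Cl, Br, I
}

inline bool IsONS(unsigned int z)
{
  return z == 7 || z == 8 || z == 16;  // N, O, S
}

// Element test: identical atomic numbers always pass; each modifier adds
// one more way for two different elements to be treated as equivalent.
inline bool ElementsEquivalent(const ToyAtom &q, const ToyAtom &t, unsigned int opts)
{
  if (q.atomicNum == t.atomicNum) return true;
  if ((opts & ToyExprOpts::EqAromatic) && q.aromatic && t.aromatic) return true;
  if ((opts & ToyExprOpts::EqHalogen) && IsHalogen(q.atomicNum) && IsHalogen(t.atomicNum)) return true;
  if ((opts & ToyExprOpts::EqONS) && IsONS(q.atomicNum) && IsONS(t.atomicNum)) return true;
  return false;
}

// Full toy test: relaxed element test plus aromaticity and formal charge.
inline bool AtomsEquivalent(const ToyAtom &q, const ToyAtom &t, unsigned int opts)
{
  return ElementsEquivalent(q, t, opts) &&
         q.aromatic == t.aromatic &&
         q.formalCharge == t.formalCharge;
}

// Usage example: chlorine maps to fluorine only when EqHalogen is added.
inline void ToyModifierExample()
{
  const ToyAtom chlorine{17, false, 0};
  const ToyAtom fluorine{9, false, 0};
  assert(!AtomsEquivalent(chlorine, fluorine, ToyExprOpts::DefaultAtoms));
  assert(AtomsEquivalent(chlorine, fluorine,
                         ToyExprOpts::DefaultAtoms | ToyExprOpts::EqHalogen));
}
```

In this toy model the chlorine of the pattern and the fluorine of the target become mappable only once the halogen flag is set, and an aromatic carbon becomes mappable to an aromatic nitrogen only once the aromatic flag is set, which parallels the effects described for Figure 17.12 and Figure 17.11.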

Figure 17.11: DefaultAtoms|EqAromatic and DefaultBonds

Figure 17.12: DefaultAtoms|EqHalogen and DefaultBonds

Figure 17.13: DefaultAtoms|EqONS and DefaultBonds

Figure 17.14: DefaultAtoms|EqCAliphaticONS and DefaultBonds

Similar modifiers exist for altering bond equivalency. Figure 17.15 shows an example in which single and double bonds are considered identical because the OEExprOpts::EqSingleDouble modifier is used.
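
The bond case can be sketched in the same spirit. The ToyBondOpts flags and the ToyBond type are hypothetical, the flag values are invented, and the assumption that aromaticity is still compared when the modifier is set is not taken from the OEChem documentation.

```cpp
#include <cassert>

// Hypothetical bit flags echoing the OEExprOpts bond options. The values
// are invented for this sketch.
namespace ToyBondOpts {
  const unsigned int DefaultBonds   = 0x1;  // baseline comparison, always applied here
  const unsigned int EqSingleDouble = 0x2;
}

// Toy bond record. This is NOT an OEChem type.
struct ToyBond {
  unsigned int order;
  bool aromatic;
};

inline bool IsSingleOrDouble(unsigned int order)
{
  return order == 1 || order == 2;
}

// Bond test: orders must agree, unless EqSingleDouble is set and both bonds
// are single or double. Aromaticity is compared in either case (assumption).
inline bool BondsEquivalent(const ToyBond &q, const ToyBond &t, unsigned int opts)
{
  bool orderOk = (q.order == t.order);
  if (!orderOk && (opts & ToyBondOpts::EqSingleDouble))
    orderOk = IsSingleOrDouble(q.order) && IsSingleOrDouble(t.order);
  return orderOk && q.aromatic == t.aromatic;
}

// Usage example: a single bond maps to a double bond only with the modifier.
inline void ToyBondExample()
{
  const ToyBond single{1, false};
  const ToyBond dbl{2, false};
  assert(!BondsEquivalent(single, dbl, ToyBondOpts::DefaultBonds));
  assert(BondsEquivalent(single, dbl,
                         ToyBondOpts::DefaultBonds | ToyBondOpts::EqSingleDouble));
}
```

A triple bond still fails to match a single bond in this sketch, because the modifier only merges the single and double bond orders.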

Figure 17.15: DefaultAtoms and DefaultBonds|EqSingleDouble

The last example, in Figure 17.16, represents a very unrestrictive search in which both the atom and the bond expression options have weak discriminating power.

Figure 17.16: DefaultAtoms|EqAromatic|EqCAliphaticONS|EqHalogen|EqONS and DefaultBonds|EqSingleDouble

Although only maximum common substructure search examples are presented here, atom and bond expression options can be used in the same way with substructure search and clique detection. For a full description of the expression options and their usage, please refer to the OEExprOpts namespace section in the OEChem namespaces of the OEChem C++ API document.