Pattern matching in OEChem is always done using query molecules or
query graphs. Non-query molecules, i.e. those that are derived
directly from OEMolBase or OEMCMolBase, must be
converted into a query molecule. This conversion is
controlled by the values in the OEExprOpts namespace.
Expression options can be specified either in the constructor for an
OEQMol or through the convenience constructors of the pattern
matching classes (OESubSearch, OEMCSSearch,
and OECliqueSearch), which take expression options as arguments.
Figure 17.9 shows an example where a maximum common substructure search is performed using the OEExprOpts::DefaultAtoms and OEExprOpts::DefaultBonds options.
The OEExprOpts::DefaultAtoms option means that two atoms are considered to be equivalent, i.e. they can be mapped to each other, if they have the same atomic number, aromaticity, and formal charge. The OEExprOpts::DefaultBonds option means that two bonds can be mapped to each other if they have the same bond order and aromaticity.
1  #include "openeye.h"
2  #include "oechem.h"
3  #include "oeplatform.h"
4
5  using namespace OEPlatform;
6  using namespace OEChem;
7  using namespace OESystem;
8
9  int main()
10 {
11   OEGraphMol pattern,target;
12   OEParseSmiles(pattern, "c1(cc(nc2c1C(CCC2)Cl)CCl)O");
13   OEParseSmiles(target, "c1(c2c(nc(n1)CF)COC=C2)N");
14
15   unsigned int atomexpr = OEExprOpts::DefaultAtoms;
16   unsigned int bondexpr = OEExprOpts::DefaultBonds;
17
18   OEQMol patternQ(pattern);
19   // generate query with atom and bond expression options
20   patternQ.BuildExpressions(atomexpr,bondexpr);
21   OEMCSSearch mcss(patternQ.QMol());
22
23   bool unique = true;
24   unsigned int count = 1;
25   // loop over matches
26   for (OEIter<OEMatchBase> match = mcss.Match(target,unique);match;++match)
27   {
28     oeout << "Match " << count << ':' << oeendl;
29     oeout << "Number of matched atoms: " << match->NumAtoms() << oeendl;
30     oeout << "Number of matched bonds: " << match->NumBonds() << oeendl;
31     // create match subgraph
32     OEGraphMol m;
33     OESubsetMol(m,match,true);
34     std::string smi;
35     OECreateSmiString(smi,m);
36     oeout << "match smiles = " << smi << oeendl;
37     ++count;
38   }
39   return 0;
40 }
The best way to understand how various atom and bond expressions influence the
pattern matching is to change the atom expression (line 15) and the
bond expression (line 16) in the listing above.
After the pattern molecule is constructed, the call to OEQMolBase::BuildExpressions (line 20) defines the level of atom and bond matching between the pattern molecule and any target molecule.
By modifying the atom and bond expression options, very diverse pattern matching can be performed. Figure 17.10 - Figure 17.14 show several examples where maximum common substructure searches are performed for the same query and target molecules, but with various atom and bond expression options.
In the first example in Figure 17.10, the OEExprOpts::ExactAtoms expression option is used to discriminate more strictly between atoms, i.e. atoms can only be mapped to each other if they have the same degree, number of hydrogens, chirality, mass, and ring membership in addition to meeting the requirements of the OEExprOpts::DefaultAtoms option.
Figure 17.11 - Figure 17.14 show examples where the discrimination capability of the OEExprOpts::DefaultAtoms option is decreased by adding various modifiers. For example, using the OEExprOpts::EqAromatic modifier, atoms in any aromatic ring system are considered equivalent. As a result, the pyridine and pyrimidine rings can be mapped to each other in Figure 17.11. Similarly, OEExprOpts::EqHalogen (Figure 17.12) and OEExprOpts::EqONS (Figure 17.13) define equivalency between halogen atoms and oxygen-nitrogen-sulfur atoms, respectively. Using OEExprOpts::EqCAliphaticONS (Figure 17.14), an aliphatic query carbon atom is considered equivalent to any oxygen, nitrogen, or sulfur atom.
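The effect of these modifiers on atom equivalence can be illustrated with a small stand-alone model. This is a sketch only: the `toy` namespace, the `Atom` struct, the flag values, and the `atomsMatch` function are hypothetical names introduced for illustration and are not part of OEChem. The sketch also assumes that a modifier relaxes only the atomic number test, while aromaticity and formal charge are still compared; OEChem's actual behavior may differ.

```cpp
#include <cstdint>

// Toy model only: these types and flag values are hypothetical and are NOT
// the OEChem implementation or its OEExprOpts constants.
namespace toy {

enum AtomOpt : std::uint32_t {
    EqAromatic      = 1u << 0,  // aromatic atoms match regardless of element
    EqHalogen       = 1u << 1,  // F, Cl, Br, I match each other
    EqONS           = 1u << 2,  // O, N, S match each other
    EqCAliphaticONS = 1u << 3   // aliphatic query C matches any O, N, or S
};

struct Atom {
    int  atomicNum;
    bool aromatic;
    int  formalCharge;
};

inline bool isHalogen(int z) { return z == 9 || z == 17 || z == 35 || z == 53; }
inline bool isONS(int z)     { return z == 7 || z == 8 || z == 16; }

// Element test: strict equality unless an enabled modifier relaxes it.
inline bool elementsMatch(const Atom& q, const Atom& t, std::uint32_t opts) {
    if (q.atomicNum == t.atomicNum) return true;
    if ((opts & EqAromatic) && q.aromatic && t.aromatic) return true;
    if ((opts & EqHalogen) && isHalogen(q.atomicNum) && isHalogen(t.atomicNum)) return true;
    if ((opts & EqONS) && isONS(q.atomicNum) && isONS(t.atomicNum)) return true;
    // Asymmetric: only an aliphatic carbon on the query side is relaxed.
    if ((opts & EqCAliphaticONS) && q.atomicNum == 6 && !q.aromatic && isONS(t.atomicNum)) return true;
    return false;
}

// Default-style comparison (element, aromaticity, formal charge), with the
// element test optionally relaxed by the modifiers in `opts`.
inline bool atomsMatch(const Atom& q, const Atom& t, std::uint32_t opts = 0) {
    return elementsMatch(q, t, opts)
        && q.aromatic == t.aromatic
        && q.formalCharge == t.formalCharge;
}

} // namespace toy
```

In this model, an aliphatic chlorine and an aliphatic fluorine do not match with `opts = 0`, but they do match once `toy::EqHalogen` is set. Several modifiers can be combined with a bitwise OR, for example `toy::EqHalogen | toy::EqONS`.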
Similar modifiers exist for altering bond equivalency. Figure 17.15 shows an example where single and double bonds are considered identical when the OEExprOpts::EqSingleDouble modifier is used.
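The same idea can be sketched for bonds. The `toybond` namespace, the `Bond` struct, the flag value, and `bondsMatch` below are hypothetical and only model the behavior described in the text (same bond order and aromaticity by default, with single and double bonds treated as identical when the modifier is set). They are not OEChem code.

```cpp
#include <cstdint>

// Toy model only: hypothetical names and values, NOT the OEChem implementation.
namespace toybond {

enum BondOpt : std::uint32_t {
    EqSingleDouble = 1u << 0   // single and double bonds match each other
};

struct Bond {
    int  order;     // 1 = single, 2 = double, 3 = triple
    bool aromatic;
};

// Default-style comparison (bond order and aromaticity), with the order test
// relaxed for single/double pairs when EqSingleDouble is set in `opts`.
inline bool bondsMatch(const Bond& q, const Bond& t, std::uint32_t opts = 0) {
    bool sameOrder = (q.order == t.order);
    if (!sameOrder && (opts & EqSingleDouble)) {
        const bool qSingleOrDouble = (q.order == 1 || q.order == 2);
        const bool tSingleOrDouble = (t.order == 1 || t.order == 2);
        sameOrder = qSingleOrDouble && tSingleOrDouble;
    }
    return sameOrder && q.aromatic == t.aromatic;
}

} // namespace toybond
```

With `opts = 0`, a single bond and a double bond are distinct in this model. With `toybond::EqSingleDouble` they match, while a triple bond still matches only another triple bond.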
The last example in Figure 17.16 represents a very unrestrained search, where both the atom and bond expression options have weak discrimination power.
Even though only maximum common substructure search examples are presented here, atom
and bond expression options can be used in the same way with substructure searches or
clique detection.
For a full description of expression options and their usage please
refer to the OEExprOpts namespace section in
the OEChem namespaces of the OEChem C++ API document.