Title:

Some New Canonicalization Tools for Chemoinformatics

Author:

Jeremy Yang
OpenEye

Abstract:

A fundamental task of chemistry is identifying distinct chemical entities.  
In chemoinformatics, species must be specified rigorously to
facilitate unambiguous expression of chemical data and knowledge.  
A theoretically equivalent task is determining whether
two molecules are equal.  However, the meaning of sameness or identity depends
upon the context or hierarchical chemical level of abstraction, for
example, whether stereochemistry or tautomerism is considered.  An
important subset of this problem can be addressed by graph theory,
which applies well to valence models for covalently bonded molecules.
Algorithms generating canonical (unique) identifiers for chemical
graphs exist and are available.  However, due to the multiple
contexts mentioned, a single algorithm is not sufficient to solve
all problems.  This study reviews some existing canonicalization
methodology and describes new methods implemented in the
OEChem chemoinformatics library and other OpenEye tools.
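
As a purely illustrative sketch of the graph view described above, the following Python fragment computes a canonical key for a very small molecular graph by brute force, taking the smallest relabeled form over every possible atom ordering. It is not the OEChem method or any published canonicalization algorithm; the function name `canonical_key`, the `use_bond_order` flag, and the atom/bond list format are hypothetical, and the factorial cost limits it to toy graphs. Dropping bond orders is used here only as a crude stand-in for a coarser level of abstraction, not as a real tautomer treatment.

```python
from itertools import permutations


def canonical_key(atoms, bonds, use_bond_order=True):
    """Brute-force canonical key for a small labeled graph (illustrative only).

    atoms: list of element symbols, indexed by atom number.
    bonds: list of (i, j, order) tuples referring to those indices.
    use_bond_order: if False, every bond is treated as order 1.
    """
    n = len(atoms)
    best = None
    for perm in permutations(range(n)):
        # perm[old_index] gives the new index of that atom.
        labels = [None] * n
        for old, new in enumerate(perm):
            labels[new] = atoms[old]
        edges = sorted(
            (min(perm[i], perm[j]), max(perm[i], perm[j]),
             order if use_bond_order else 1)
            for i, j, order in bonds
        )
        key = (tuple(labels), tuple(edges))
        # Keep the lexicographically smallest form seen so far.
        if best is None or key < best:
            best = key
    return best


# Heavy-atom graphs only (hydrogens omitted for brevity).
ethanol_a = (["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
ethanol_b = (["O", "C", "C"], [(0, 1, 1), (1, 2, 1)])   # same molecule, renumbered
dimethyl_ether = (["C", "O", "C"], [(0, 1, 1), (1, 2, 1)])

vinyl_alcohol = (["C", "C", "O"], [(0, 1, 2), (1, 2, 1)])  # C=C-O
acetaldehyde = (["C", "C", "O"], [(0, 1, 1), (1, 2, 2)])   # C-C=O

same_molecule = canonical_key(*ethanol_a) == canonical_key(*ethanol_b)
isomers_differ = canonical_key(*ethanol_a) != canonical_key(*dimethyl_ether)
tautomers_differ = canonical_key(*vinyl_alcohol) != canonical_key(*acetaldehyde)
tautomers_merge = (
    canonical_key(*vinyl_alcohol, use_bond_order=False)
    == canonical_key(*acetaldehyde, use_bond_order=False)
)
```

In this sketch the two numberings of ethanol share one key and dimethyl ether receives a different one, while the vinyl alcohol / acetaldehyde pair is distinct or identical depending on whether bond orders are included. That dependence on the chosen level of detail is the reason a single canonical identifier cannot serve every context.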

yang_canonical_poster_cup6.png