Title:
Searching US Patent Data for Chemical Structures
Author:
James W. Cooper
IBM
Abstract:
We have developed a set of text annotators that accurately recognize
organic chemical names in scientific text. We applied these
annotators to a year of US patents, discovering over half a million
chemical names. Using OGHAM, we converted these names to SMILES
strings and built a searchable structural database. It lets us find
patents containing specified chemical substructures regardless of
the IUPAC or trivial names used in the patents. We will describe
how the system can be used and our ongoing work to enhance it.
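
The idea of searching by structure rather than by name can be
illustrated with a small sketch. This is not the system described
above: the name-to-SMILES table is a hand-written stand-in for the
name conversion step, the patent identifiers (PATENT-A and so on) are
made-up placeholders, and the parser handles only a toy subset of
SMILES (single-letter atoms, branches, double and triple bonds; no
rings, aromatics, charges, or explicit hydrogens). All function and
variable names are hypothetical.

```python
from typing import Dict, List, Tuple

# A molecule is (atom symbols, bonds), with bonds keyed by (low index, high index).
Molecule = Tuple[List[str], Dict[Tuple[int, int], int]]


def parse_smiles(smiles: str) -> Molecule:
    """Parse a toy SMILES subset: atoms B C N O F P S I, '(' ')' branches, '=' and '#'."""
    atoms: List[str] = []
    bonds: Dict[Tuple[int, int], int] = {}
    stack: List[int] = []   # branch points to return to on ')'
    prev = None             # index of the atom the next atom bonds to
    order = 1               # order of the next bond (1 unless '=' or '#' was seen)
    for ch in smiles:
        if ch == "(":
            stack.append(prev)
        elif ch == ")":
            prev = stack.pop()
        elif ch == "=":
            order = 2
        elif ch == "#":
            order = 3
        elif ch in "BCNOFPSI":
            atoms.append(ch)
            idx = len(atoms) - 1
            if prev is not None:
                bonds[(prev, idx)] = order
            prev, order = idx, 1
        else:
            raise ValueError(f"unsupported SMILES character: {ch!r}")
    return atoms, bonds


def has_substructure(mol: Molecule, query: Molecule) -> bool:
    """Backtracking subgraph match on atom symbols and bond orders."""
    m_atoms, m_bonds = mol
    q_atoms, q_bonds = query

    def bond(bonds, i, j):
        return bonds.get((min(i, j), max(i, j)), 0)

    def extend(mapping: List[int]) -> bool:
        q = len(mapping)  # next query atom to place
        if q == len(q_atoms):
            return True
        for cand in range(len(m_atoms)):
            if cand in mapping or m_atoms[cand] != q_atoms[q]:
                continue
            # Every query bond to an already placed atom must exist
            # in the molecule with the same order.
            if all(bond(q_bonds, q, p) in (0, bond(m_bonds, cand, mapping[p]))
                   for p in range(q)):
                if extend(mapping + [cand]):
                    return True
        return False

    return extend([])


# Hand-written stand-in for the name-to-structure conversion step.
NAME_TO_SMILES = {
    "ethanol": "CCO",
    "ethyl alcohol": "CCO",
    "acetic acid": "CC(=O)O",
    "ethanoic acid": "CC(=O)O",
    "propane": "CCC",
}

# Placeholder patents and the chemical names found in each (illustrative only).
PATENT_NAMES = {
    "PATENT-A": ["ethanol"],
    "PATENT-B": ["ethyl alcohol", "propane"],
    "PATENT-C": ["ethanoic acid"],
    "PATENT-D": ["propane"],
}


def search_patents(query_smiles: str) -> List[str]:
    """Return placeholder patent IDs naming any compound that contains the query."""
    query = parse_smiles(query_smiles)
    return sorted(
        pid for pid, names in PATENT_NAMES.items()
        if any(has_substructure(parse_smiles(NAME_TO_SMILES[n]), query) for n in names)
    )
```

In this sketch, search_patents("CO") returns PATENT-A, PATENT-B and
PATENT-C, even though the three patents use different names
("ethanol", "ethyl alcohol", "ethanoic acid") for compounds that
contain a carbon-oxygen single bond. A production system would use
a full SMILES parser and an indexed substructure search rather than
this linear scan.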
Slides:
cooper_cup6.ppt