Title:

Chemoinformatics Applications for Chemists using OpenEye and Oracle

Author:

Paul Watson
Arena Pharmaceuticals

Abstract:

A fully integrated, Java-based chemoinformatics platform has been developed
that brings together numerous chemoinformatics tools for the bench chemist,
covering virtual screening, lead optimization and lead development. The
approach is database-centric: results from all the tools are stored in
Oracle.

The system includes a large database of available compounds from chemical
suppliers (ca. 5 million records) and the Arena screening library, along
with the associated physicochemical properties (ClogP, MW, etc.).
Additionally, it is possible to import or create ad hoc databases from the
usual data sources, such as SD files and SMILES files. These databases can
be filtered into compound subsets using combination queries built with
Boolean logic, e.g. a compound subset that contains substructure A and
substructure B, does not contain substructure C, and has MW between 300
and 500.
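
The combination query above can be sketched in Java by composing
predicates. This is only an illustration, not the platform's actual code:
the Compound and SubsetQuery classes are hypothetical, substructure matches
are assumed to have been computed already by a chemistry toolkit and stored
as labels, and the MW range is assumed to be inclusive at both ends.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical in-memory stand-in for one database row.
class Compound {
    final String id;
    final double mw;                  // molecular weight
    final Set<String> substructures;  // labels of substructures already matched

    Compound(String id, double mw, String... substructures) {
        this.id = id;
        this.mw = mw;
        this.substructures = new HashSet<>(Arrays.asList(substructures));
    }
}

// Hypothetical helper that builds and applies Boolean combination queries.
class SubsetQuery {
    // Leaf predicate: the compound contains the named substructure.
    static Predicate<Compound> has(String label) {
        return c -> c.substructures.contains(label);
    }

    // Leaf predicate: molecular weight within [min, max] (assumed inclusive).
    static Predicate<Compound> mwBetween(double min, double max) {
        return c -> c.mw >= min && c.mw <= max;
    }

    // The example from the text: A AND B AND NOT C AND 300 <= MW <= 500.
    static Predicate<Compound> exampleQuery() {
        return has("A")
                .and(has("B"))
                .and(has("C").negate())
                .and(mwBetween(300, 500));
    }

    // Apply a query and return the IDs of the matching compounds, in order.
    static List<String> select(List<Compound> db, Predicate<Compound> query) {
        List<String> ids = new ArrayList<>();
        for (Compound c : db) {
            if (query.test(c)) {
                ids.add(c.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Compound> db = Arrays.asList(
                new Compound("c1", 350.0, "A", "B"),
                new Compound("c2", 350.0, "A", "B", "C"),
                new Compound("c3", 650.0, "A", "B"));
        System.out.println(select(db, exampleQuery()));
    }
}
```

Because each condition is an independent predicate, arbitrary AND/OR/NOT
combinations can be assembled at run time. In a database-centric system
the same logic would more likely be translated into a query against the
stored results rather than evaluated in memory.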

Any database in the system can be used either in virtual screening
experiments with ROCS or in library enumeration with the OpenEye reaction
toolkit. ROCS searches run in parallel on a 32-processor Linux cluster,
with OpenPBS scheduling the jobs. The results of each ROCS search are
stored in Oracle and can be analyzed through the interface. Library
enumeration can be carried out using ad hoc reactions to produce new
databases, which are again stored in Oracle and can be used like any other
dataset in the system.
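
One plausible way to spread a ROCS search over the cluster is to split the
database into one chunk per processor and generate a PBS job script for
each chunk. The sketch below is an assumption about how this could be done,
not a description of the actual system: the RocsJobPlanner class, the file
names and the one-processor-per-job resource request are hypothetical, and
the ROCS options shown (-query, -dbase, -prefix) should be checked against
the installed version.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical planner: splits a database into chunks and writes PBS scripts.
class RocsJobPlanner {
    // Split recordCount records into at most `workers` contiguous chunks
    // whose sizes differ by at most one record.
    // Each entry is {startInclusive, endExclusive}.
    static List<int[]> partition(int recordCount, int workers) {
        if (workers <= 0) {
            throw new IllegalArgumentException("workers must be positive");
        }
        List<int[]> chunks = new ArrayList<>();
        int base = recordCount / workers;
        int extra = recordCount % workers;
        int start = 0;
        for (int i = 0; i < workers; i++) {
            int size = base + (i < extra ? 1 : 0);
            if (size == 0) {
                break; // fewer records than workers: no empty jobs
            }
            chunks.add(new int[] {start, start + size});
            start += size;
        }
        return chunks;
    }

    // Build the text of a PBS job script that runs ROCS on one chunk file.
    // Assumes the chunk has already been written to chunkFile.
    static String pbsScript(int jobIndex, String queryFile, String chunkFile) {
        String name = "rocs_" + jobIndex;
        return "#!/bin/sh\n"
                + "#PBS -N " + name + "\n"
                + "#PBS -l nodes=1:ppn=1\n"
                + "cd $PBS_O_WORKDIR\n"
                + "rocs -query " + queryFile
                + " -dbase " + chunkFile
                + " -prefix " + name + "\n";
    }

    public static void main(String[] args) {
        // Example: 1000 records over 32 processors.
        List<int[]> chunks = partition(1000, 32);
        for (int i = 0; i < chunks.size(); i++) {
            int[] c = chunks.get(i);
            System.out.println("chunk " + i + ": records " + c[0] + " to " + (c[1] - 1));
        }
        System.out.print(pbsScript(0, "query.sdf", "chunk_0.oeb.gz"));
    }
}
```

Each generated script would then be submitted with qsub, and the per-chunk
hit lists merged and loaded into Oracle once all the jobs have finished.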