Introduction

This is OEChem's Python Programming manual.

How to Read this Manual

This manual covers many of the important topics that can be addressed with the OEChem library. It is meant to be read from front to back at least once. Each topic is introduced assuming familiarity with the material presented earlier in the manual. Further, the complexity of the topics, as well as the complexity of the example code, grows as the text progresses. While the initial listings are effectively the "Hello World" of OEChem, later examples may require some time to comprehend fully. This manual is filled with example programs. We encourage you to run, test and modify the examples we present.

See also

Experienced OEChem programmers should use the Application Programming Interface (API) for the most thorough reference of OEChem functionality.

Getting Started

All Python-related code is found under openeye/python.

For Python to locate OEChem, the PYTHONPATH environment variable must be set. In your shell startup script (.bashrc, for example), add the following two lines. (The syntax may vary if you use a shell other than bash.)

PYTHONPATH=/usr/local/openeye/python
export PYTHONPATH

Alternatively, the same directory can be added to the module search path from within a script, using the following Python code.

import sys
sys.path.append("/usr/local/openeye/python")

If you unpacked the distribution in a different parent directory, use that location rather than /usr/local.
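As a minimal sketch of the in-script approach, the helper below appends a directory to the search path only if it is not already there, so running it repeatedly does not add duplicate entries. The name add_search_path and the constant OPENEYE_PYTHON_DIR are hypothetical and used only for illustration; the path is the example location used above and should be replaced with your actual install location.

```python
import sys

# Assumed install location (from the example above); adjust to wherever
# the distribution was actually unpacked.
OPENEYE_PYTHON_DIR = "/usr/local/openeye/python"


def add_search_path(path):
    """Append path to sys.path unless it is already present.

    Returns True if the path was added, False if it was already there.
    """
    if path in sys.path:
        return False
    sys.path.append(path)
    return True


# Must run before any "import openeye" statement in the script.
added = add_search_path(OPENEYE_PYTHON_DIR)
```

Setting PYTHONPATH in the shell is usually preferable, because it applies to every script without modifying any of them.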

At the beginning of most Python programs, functionality that resides in other files is imported with the import statement. Standard library modules are usually imported as follows:

import os, sys

Functionality from those modules can then be called by prefixing the function name with the module name, as in sys.stdout.write() or sys.exit(0).
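A short illustration of module-prefixed calls, using only the standard library, might look like the following. The variable name cwd is arbitrary.

```python
import os
import sys

# Each call is qualified with the name of the module that provides it.
cwd = os.getcwd()                                  # function from the os module
sys.stdout.write("Working directory: %s\n" % cwd)  # object from the sys module
```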

OEChem resides in the openeye module and is routinely imported as:

from openeye.oechem import *

This pulls all functions and objects in the oechem namespace directly into the current script's namespace. Since OEChem's objects and functions have unique names, there is little chance of a name clash with this particular import * call.

Once the package is imported, objects can be created and methods can be called without the addition of namespace prefixes, resulting in simpler code.
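To see why name clashes matter for wildcard imports in general, the sketch below uses two standard library modules that both define sqrt. It is an illustration only and does not involve OEChem. The imports are run in a separate dictionary, given the hypothetical name ns, so that the demonstration does not alter the script's own namespace.

```python
import cmath
import math

# An isolated namespace standing in for a script's global namespace.
ns = {}

# The first wildcard import binds the name sqrt to math.sqrt.
exec("from math import *", ns)
first = ns["sqrt"]

# The second wildcard import silently rebinds sqrt to cmath.sqrt.
exec("from cmath import *", ns)
second = ns["sqrt"]

# After both imports, the unqualified name refers to the cmath version,
# which returns a complex result for negative input.
root = second(-1)
```

Here the later import wins without any warning. If a clash is ever a concern, importing the module itself (from openeye import oechem) and using prefixed names avoids it, at the cost of slightly longer code.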

OEChem and Informatics

Chemical information processing is the science of representing molecules in computers. Hence the fundamental “object” or data structure within a chemical information system is that of the molecule, its atoms and its bonds.

A significant problem encountered in such systems is that different applications place differing requirements or constraints on how a molecule is represented. In protein biochemistry, molecules are divided into amino acid residues with specific atom naming and conformational information such as alpha helix or beta sheet. Inorganic chemistry requires isotopic and coordination information for atoms and modeling of complex chiralities. One possible solution is to prescribe a single data structure that encodes all of the potential information required of a molecule. However, such an approach suffers from the fact that ‘you cannot please all of the chemists, all of the time’. A further requirement in the field of chemical databases and substructure searching is that a molecule representation be as compact as possible, to allow as much information as possible to be held in memory and to maximize the performance of processing databases from disk. It is for this reason that the molecule, its atoms and its bonds are defined in as abstract a manner as possible in OEChem.

The following is a quick guide to the chapters covering these fundamental classes:

OEMolBase

OEAtomBase, OEBondBase