OpenEye Scientific Software

FAQ

Frequently Asked Questions

General/Misc Omega Fred Rocs Zap
OEShape Quacpac Filter Vida OEChem
OEWrappers - Python Lexichem PVM/parallelization Chemistry

Answers

General/Miscellaneous

  • How do I download OpenEye software?
  • Either

    1. Go to the download page linked to the OpenEye home page, or
    2. Anonymous FTP to ftp.eyesopen.com and cd pub

  • How do I install my license file oe_license.txt?
  • The license file should be identified by the environment variable $OE_LICENSE. The recommended location is /usr/local/openeye/etc/oe_license.txt, i.e., the etc subdirectory of $OE_DIR. A license file named oe_license.txt is also found if it is either (1) present in the current working directory or (2) present in the directory defined by environment variable $OE_DIR. However, these methods are less reliable and not generally recommended.
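
    To check which file is likely to be picked up, the locations described above can be listed with a short Python sketch. The helper names are hypothetical, and the relative order of the current-directory and $OE_DIR fallbacks is an assumption; the actual lookup is internal to the OpenEye software.

```python
import os

def license_candidates(environ=None, cwd=None):
    """Return candidate license paths, with $OE_LICENSE first.

    The order of the two fallbacks is an assumption made for illustration.
    """
    environ = os.environ if environ is None else environ
    cwd = os.getcwd() if cwd is None else cwd
    candidates = []
    if environ.get("OE_LICENSE"):
        candidates.append(environ["OE_LICENSE"])
    # Fallback 1: oe_license.txt in the current working directory.
    candidates.append(os.path.join(cwd, "oe_license.txt"))
    # Fallback 2: oe_license.txt in the directory named by $OE_DIR.
    if environ.get("OE_DIR"):
        candidates.append(os.path.join(environ["OE_DIR"], "oe_license.txt"))
    return candidates

def find_license(environ=None, cwd=None):
    """Return the first candidate that exists as a file, or None."""
    for path in license_candidates(environ, cwd):
        if os.path.isfile(path):
            return path
    return None

if __name__ == "__main__":
    print(find_license())
```

    If this prints None, no license file is present at any of the documented locations.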

  • What's the deal with CYGWIN?
  • CYGWIN is a Unix environment for Windows, available at http://www.cygwin.com/. Some OpenEye products are compiled for Windows using CYGWIN. However, CYGWIN installation should not be required at runtime. One difference between CYGWIN-built and native Windows-built software is how absolute file paths may be specified: Unix-style or Windows-style.

  • Why doesn't my license work? (#1)
  • If a Windows mail program was used to receive the license, one possibility is that carriage return characters (ascii 13) were inadvertently added to the file. Remove these using the dos2unix command or the following Perl command:

    perl -pi -e 's/\r\n*/\n/g;' oe_license.txt
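
    Where neither dos2unix nor Perl is available, a similar cleanup can be sketched in Python. The function names are hypothetical; the file is rewritten in place only if carriage returns were found.

```python
def strip_carriage_returns(data):
    """Normalize CRLF and bare CR line endings to LF in a bytes object."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

def fix_license_file(path="oe_license.txt"):
    """Rewrite the file without carriage returns; return True if it changed."""
    with open(path, "rb") as handle:
        original = handle.read()
    cleaned = strip_carriage_returns(original)
    if cleaned != original:
        with open(path, "wb") as handle:
            handle.write(cleaned)
    return cleaned != original
```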

  • Why doesn't my license work? (#2)
  • In general, OE licenses may be concatenated and remain valid. However, some exceptions to this rule exist. Old-format and new-format licenses cannot be combined. Within an old-format license, Zap licenses may interfere with other licenses following. And vice-versa: other product licenses may interfere with Zap licenses which follow. To be safe, in old-format license files, keep Zap licenses in a separate file from other product licenses.

  • Where/how should I install OpenEye software?
  • All software packages are installed into a directory named openeye, typically at /usr/local/openeye, and defined by environment variable $OE_DIR. Typical example:

    $ pwd
    /usr/local
    $ tar xzf $HOME/omega-1.8.1-redhat-3.0WS-g++3.3-i586.tar.gz
    $ tar xzf $HOME/szybki-1.0-redhat-3.0WS-g++3.4-i586.tar.gz

    Several subdirectories are used consistently among packages:

    /usr/local/openeye/                           $OE_DIR
    /usr/local/openeye/arch/                      platform dependent executables and libs
    /usr/local/openeye/bin/                       common place for executables, typically links to platform dependent files in arch
    /usr/local/openeye/data/                      data files used by apps and libs
    /usr/local/openeye/docs/                      documentation: html, pdf, etc.
    /usr/local/openeye/etc/                       license files, config files, etc.
    /usr/local/openeye/etc/oe_license.txt         new-format license (defined by $OE_LICENSE)
    /usr/local/openeye/toolkits/examples/         example C++ source code
    /usr/local/openeye/toolkits/include/          include files for libs
    /usr/local/openeye/toolkits/lib/              static and shared libs, typically links to platform dependent files in arch
    /usr/local/openeye/wrappers/python/           Python toolkit wrappers, e.g., PyOEChem package
    /usr/local/openeye/wrappers/python/examples/  example Python source code
    /usr/local/openeye/wrappers/java/             Java toolkit wrappers
    /usr/local/openeye/wrappers/java/examples/    example Java source code

  • What are the differences between OpenEye (OEChem) SMARTS and Daylight SMARTS?
  • There are two differences. First, in OEChem SMARTS the caret (^) indicates hybridization: ^3 means sp3, ^2 means sp2. This feature is used in the Omega torlib.txt file. Second, the semantics of the primitive [R] differ. In Daylight, [R<n>] means the number of SSSR rings an atom is in. In OEChem, it means the number of ring bonds (e.g., benzene atoms are all [R2]), while [R] means any non-zero number of ring bonds. This discrepancy is unfortunate but is based on the weakness of SSSR (as explained in the OEChem manual), which is not a rigorous concept, and the algorithms for which are arbitrary and not deterministic.
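
    The ring bond count can be illustrated without any toolkit. The sketch below treats a molecule as a plain graph and counts, for each atom, the bonds that are part of some ring (a bond is a ring bond if its two atoms stay connected when the bond is removed). This is only an illustration of the concept, not OEChem's implementation, and the function name is hypothetical.

```python
from collections import deque

def ring_bond_counts(num_atoms, bonds):
    """Count, per atom, the bonds that belong to at least one ring."""
    neighbors = {a: set() for a in range(num_atoms)}
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)

    def connected_without(a, b):
        # Breadth-first search from a to b, ignoring the direct a-b bond.
        seen, queue = {a}, deque([a])
        while queue:
            current = queue.popleft()
            for nxt in neighbors[current]:
                if {current, nxt} == {a, b} or nxt in seen:
                    continue
                if nxt == b:
                    return True
                seen.add(nxt)
                queue.append(nxt)
        return False

    counts = [0] * num_atoms
    for a, b in bonds:
        if connected_without(a, b):  # ring bond: removing it leaves a path
            counts[a] += 1
            counts[b] += 1
    return counts

benzene = [(i, (i + 1) % 6) for i in range(6)]
# Naphthalene: a ten-membered perimeter plus the fusion bond 0-5.
naphthalene = [(i, (i + 1) % 10) for i in range(10)] + [(0, 5)]

print(ring_bond_counts(6, benzene))       # [2, 2, 2, 2, 2, 2]
print(ring_bond_counts(10, naphthalene))  # [3, 2, 2, 2, 2, 3, 2, 2, 2, 2]
```

    In naphthalene the two fusion atoms have three ring bonds but sit in two SSSR rings, so a pattern written with [R2] selects different atoms under the two conventions.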

  • Windows Error: msvcp71.dll and msvcr71.dll are missing.
  • These DLLs comprise the needed runtime environment for applications built with MS Visual C++ 7.1, such as many OpenEye windows applications. They may be freely obtained and used, and are available for our customers' convenience at ftp://ftp.eyesopen.com/pub/misc/MSVC71runtime.zip.

  • Windows Application Error... "The application failed to initialize properly."

    This can mean that a needed DLL is present but not executable. Set the executable permission on the DLL.

  • How should I cite OpenEye software in a publication?

    Sample citation:

    OEChem, version 1.3.4, OpenEye Scientific Software, Inc., Santa Fe, NM, USA, www.eyesopen.com, 2005.

  • Windows Error: libmmd.dll is missing.
  • This DLL is needed as part of the runtime environment for applications built with the Intel C++ Compiler. The DLL may be freely obtained and used, and is available for our customers' convenience at ftp://ftp.eyesopen.com/pub/misc/INTELlibmmd.zip.

  • My MOL2 file is interpreted strangely? What is wrong?
  • Please see MOL2 files for Dummies, Roger Sayle, CUPV, Feb. 2004.

  • What elements are handled by MMFF94?
  • MMFF94 handles a specific list of atom types, where a type is defined by the element and its valence state. For typical organic chemistry environments the following elements are handled:

    C, N, O, F, S, P, Cl, Br, I, Si, H

    Also, the following ions:

    Fe+2, Fe+3, F-, Cl-, Br-, Li+, Na+, K+, Zn+2, Ca+2, Cu+1, Cu+2, Mg+2

    OpenEye applications which utilize MMFF94 such as Omega and Szybki will thus be similarly limited.

  • What is CentOS? How is it related to RedHat Enterprise Linux?
  • As stated at www.centos.org:

    CentOS is an Enterprise-class Linux Distribution derived from sources freely provided to the public by a prominent North American Enterprise Linux vendor. CentOS conforms fully with the upstream vendors redistribution policy and aims to be 100% binary compatible.

    So, OpenEye provides RedHat Enterprise Linux (RHEL) compatible versions in some cases by means of CentOS.

OMEGA

    NOTE: pertains to Omega v2.2 unless otherwise noted.

  • Does the order of rules matter in torlib.txt?
  • Yes. Earlier rules take precedence if a subsequent rule conflicts.

  • Why use .oeb (OEBinary) or .oeb.gz as output format?
  • The OEB format is particularly suited to use for Omega output and input by downstream applications.

    1. OEB and, better still, gzipped OEB are compact. This is particularly important when parallelizing using PVM.
    2. OEB explicitly stores multiconformer molecules as such. Alternatives such as SDF do not and thus are fundamentally less reliable at preserving these relationships.
    3. OEB, like SDF, allows storage of generic data such as energies and scores.
    4. Omega 2.0+ requires OEB output in PVM mode.
    5. As of Omega 2.2, -rotorOffsetCompress is true by default, which reduces the size of output OEB.

  • Which Omega?
  • If you enter "omega " and get something like:

    This is Omega, Version 3.14159--1.8 (Web2C 7.3.1)
    Copyright (c) 1994--1999 John Plaice and Yannis Haralambous

    ... you're running the wrong program! This Omega is a TeX variant for unicode! Try specifying the full path of the OpenEye Omega.

  • Can Omega enumerate all the stereoisomers of an input molecule?
  • Omega cannot by itself permute the stereo configurations of an input molecule. The recommended approach is to enumerate the desired stereoisomers prior to the Omega run. Omega will infer R/S from the 2D SD file if possible, and stick with that configuration. Included with Omega is an auxiliary program "flipper" which can enumerate stereoisomers for such a purpose. In addition, OEChem includes source code examples for enumerating stereoisomers which can be customized.

  • What elements can Omega handle?
  • C, N, O, F, S, P, Cl, Br, I, Si and H. Molecules containing other elements will be skipped.
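
    To find out ahead of time which molecules would be skipped, the element list above can be used as a simple pre-filter. This sketch assumes the element symbols of each molecule have already been extracted by some other means; the names are hypothetical.

```python
# Element list taken from the answer above.
OMEGA_ELEMENTS = frozenset(
    ["C", "N", "O", "F", "S", "P", "Cl", "Br", "I", "Si", "H"]
)

def unsupported_elements(element_symbols):
    """Return the sorted, de-duplicated symbols outside the supported list."""
    return sorted(set(element_symbols) - OMEGA_ELEMENTS)

def would_be_skipped(element_symbols):
    """True if the molecule contains any element outside the list."""
    return bool(unsupported_elements(element_symbols))

print(would_be_skipped(["C", "C", "O", "H"]))      # False
print(unsupported_elements(["C", "B", "Se", "B"]))  # ['B', 'Se']
```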

  • What does the caret (^) symbol mean in the SMARTS in torlib.txt?
  • The caret indicates the hybridization: ^3 means sp3, ^2 means sp2. This is one of only two differences between OpenEye SMARTS and Daylight SMARTS; see the OEChem manual for details.

  • What energies are calculated and reported by Omega?
  • The "-ewindow" value is applied relative to the lowest energy conformer. The energies written to output formats which can contain energies (.mol2 and .sdf) are standard MMFF94 energies, in kcal/mol.
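
    The effect of an energy window applied relative to the lowest energy conformer can be sketched as follows. The energies are made-up values in kcal/mol, and whether the comparison is inclusive at the boundary is an assumption.

```python
def within_energy_window(energies, ewindow):
    """Keep conformer energies no more than ewindow above the minimum."""
    if not energies:
        return []
    lowest = min(energies)
    return [e for e in energies if e - lowest <= ewindow]

# Hypothetical conformer energies, kcal/mol.
print(within_energy_window([12.0, 15.5, 30.0, 21.9], 10.0))
# [12.0, 15.5, 21.9]
```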

  • What force field is used by Omega?
  • The options -buildff and -searchff are both set by default to mmff94s_NoEstat which is designed to improve reproduction of aqueous solution phase conformations. Other choices: mmff, mmff_NoEstat, mmff_Trunc, mmff94s, mmff94s_NoEstat, mmff94s_Trunc.

  • Stochastic versus deterministic: when can Omega results vary for the same inputs?
  • First and foremost, note that inconsistent results between separate Omega executions do not imply that results are incorrect relative to the design goals or stipulated parameters, rmsd, energy, etc. There are no known cases where the inconsistencies discussed reflect actual errors. However, for several reasons it is desirable to have numerically identical results for constant program version, input data and parameters, independent of input order, format, platform, time, phase of the moon, etc.

    Although number of conformers per molecule is a conspicuous result, note that it can be a deceptive measure. Given the combinatorial nature of conformer enumeration, that number should probably be regarded on a logarithmic scale. For example, 440 vs. 220 can result from just one more torsion angle value at one rotor.
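
    The arithmetic behind the 440 vs. 220 example is a simple product over rotors. The torsion value counts below are hypothetical, and the product is only the number of raw combinations before any pruning by rmsd or energy.

```python
from functools import reduce
import operator

def conformer_combinations(values_per_rotor):
    """Number of raw torsion combinations: the product over all rotors."""
    return reduce(operator.mul, values_per_rotor, 1)

# Hypothetical molecule with four rotors.
print(conformer_combinations([4, 5, 11, 1]))  # 220
# One more torsion value at the last rotor doubles the count.
print(conformer_combinations([4, 5, 11, 2]))  # 440
```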

    The distance geometry 2D->3D method used in Omega involves a stochastic step which can result in somewhat different results for the same input molecule. Of course "stochastic", on a digital computer, usually and in this case implies pseudo-random numbers based on a seed. It is a design goal that results not vary or vary minimally due to the stochastic step.

    In Omega 2.1+, the input molecule atom order is canonicalized. This eliminates another cause of variation in results.

    With Omega the use of a stochastic step is limited since the program relies on a fragment database which can be large and comprehensive. For any user dataset, it is possible to augment the supplied fragment database until it covers every fragment in that dataset, so that the stochastic distance geometry algorithm never needs to be invoked during the Omega run itself (only by makefraglib).

    One other possible source of inconsistent results is floating point precision variations among platforms. These inconsistencies are minimized in Omega by use of platform-independent pseudo-random number generators and other methods, but they can exist.

  • How fast is Omega?
  • Using Omega 2.1 on a computer with two dual-core 2.6 GHz CPUs, 780287 molecules from the Chembridge public dataset were processed in 3 days, 7 hrs, 54 min and 31 sec. That's 0.369 sec/mol overall, and 1.47 sec/mol on each of the four cores.
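
    The two throughput figures follow directly from the elapsed time and the core count (two dual-core CPUs, hence four cores), as this short calculation shows:

```python
# Elapsed wall-clock time: 3 days, 7 hrs, 54 min, 31 sec.
elapsed_seconds = 3 * 86400 + 7 * 3600 + 54 * 60 + 31
molecules = 780287
cores = 4  # two dual-core CPUs

wall_seconds_per_mol = elapsed_seconds / float(molecules)
core_seconds_per_mol = wall_seconds_per_mol * cores

print(elapsed_seconds)                 # 287671
print(round(wall_seconds_per_mol, 3))  # 0.369
print(round(core_seconds_per_mol, 2))  # 1.47
```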

Zap

  • Is there a way to download all of ZAP at once?
  • Currently no. The ZAP library for each platform is downloaded separately from example source code and platform-specific compiled binaries. However, here is a bundle of all the example source code: zap_example_src.tgz.

  • Additional documentation...?
  • See the Sybyl/Zap interface manual written by Glen Kellogg which is an excellent primer on PB electrostatics.

FRED

  • Does FRED consider a molecule's internal potential?
  • Scoring functions generally do not have an internal potential. Thus it is possible for a high energy conformation to score well. Conformation quality assurance is left to the conformation generation program (Omega).

  • What does "Unable to dock" mean? Shouldn't FRED report the best fit regardless?
  • The message 'unable to dock molecule X' indicates that the initial exhaustive search routines could not place the molecule in the pocket (i.e., there are no poses to score). FRED creates a negative image of the receptor site, defined as the active site box (see the -box flag), minus positions where a probe atom clashes with the protein (see the -clash_checking flag), and minus positions which have a very low shape score (see the -neg_img_size flag). FRED filters its initial exhaustive set of poses through this negative image; any pose with atoms that lie outside the image is rejected. The message indicates that this filter rejected every pose of the initial exhaustive ensemble. Increasing -neg_img_size has a very good chance of fixing the issue. Decreasing -clash_checking may also help if you have a very tightly bound ligand; this, however, will allow the ligand to clash more with the protein (if you need to do this to reproduce an experimentally determined structure, that structure likely has some fairly significant clashes). Finally, you may simply need to increase the size of your box (the -addbox flag provides a simple mechanism to do this, or you can just create a new box). As of v2.0, -neg_img_size specifies a size relative to the box.

  • How is FRED related to other OpenEye technologies and products?
  • The FRED algorithm is based on a "Gaussian docking function", which is derived from the same work as OpenEye's OEShape toolkit (Grant, Nicholls, et al.). Much of the theory of shape and surfaces thus applies to FRED as well. FRED also has integrated Zap Poisson-Boltzmann electrostatics technology via the zap_bind scoring function.

  • What is the fredA/fredPA program?
  • fredA (now called fredPA) is short for FRED Analysis, and is designed to take a bound ligand, redock it and then report the RMSDs of the redocked ligand poses relative to the bound pose, plus other details of the redocking process. Conformations are recalculated internally using Omega.

  • How is the best box size determined?
  • Not too small, but not too big. Here's the explanation. FRED generates a negative image of the receptor site; any pose that does not fit inside this image is rejected by FRED. If this image is too small FRED will reject every possible pose, resulting in a "Failed to Dock" message for that molecule. The image is constructed as follows:

    1) Start with the entire enclosed volume of the box.
    2) Remove positions within the box that clash with the protein.
    3) Remove positions with poor shape score.

    Steps 1 & 2 are straightforward. Step 3 needs some explanation. Poor shape score refers to poor shape score w.r.t. the Gaussian Shape Scoring Function, and FRED selects a cutoff value for the shape score of a probe atom. Any position where the probe atom has a shape score below the cutoff value is removed from the negative image. The cutoff value for the scores isn't directly specified. What is specified is total volume of the negative image of the receptor site, via option -excvol. FRED chooses a cutoff such that it creates an image with the specified volume. If you increase your box size, FRED will increase the cutoff score such that the size of the negative image remains constant. This can cause it to reject poses it would accept with a smaller box, because positions that previously had acceptable scores are now rejected. To increase box size and retain previously acceptable poses, increase -excvol also.

    Another reason not to use too large a box is simply that FRED will require more time and steps to enumerate all possible poses in the box.
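
    Why a larger box can reject previously acceptable poses is easiest to see with a toy model. The sketch below is not FRED's algorithm: it assumes the negative image is a set of grid points, that a fixed volume corresponds to a fixed number of points, and that higher shape scores are better. The scores are invented.

```python
def shape_cutoff(scores, target_points):
    """Return the score of the target_points-th best grid point.

    Keeping every point scoring at or above this value yields an image
    of (about) target_points points, i.e. a fixed volume.
    """
    if not scores or target_points < 1:
        raise ValueError("need at least one score and one target point")
    ranked = sorted(scores, reverse=True)
    if target_points >= len(ranked):
        return ranked[-1]
    return ranked[target_points - 1]

# Hypothetical probe-atom shape scores at grid points inside the box.
small_box = [0.9, 0.8, 0.6, 0.5, 0.3, 0.1]
# A larger box contains the same points plus some new ones.
large_box = small_box + [0.85, 0.7, 0.65, 0.2]

print(shape_cutoff(small_box, 4))  # 0.5
print(shape_cutoff(large_box, 4))  # 0.7
```

    With the image size held at four points, the positions scoring 0.6 and 0.5 are inside the image for the small box but fall below the cutoff for the large one. Raising the image volume along with the box size restores them.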

  • Scoring function references:
  • PLP

    D.K. Gehlhaar, G.M. Verkhivker, P.A. Rejto, C.J. Sherman, D.B. Fogel, L.J. Fogel and S.T. Freer, "Molecular recognition of the inhibitor AG-1343 by HIV-1 protease: Conformationally flexible docking by evolutionary programming", Chemistry & Biology 1995, 2, 317-324.

    Chemscore
    1. M.D. Eldridge, C.W. Murray, T.R. Auton, G.V. Paolini, and R.P. Mee, "Empirical scoring functions: I. the development of a fast empirical scoring function to estimate the binding affinity of ligands in receptor complexes", J. Computer-Aided Molecular Design 11:425-445 (1997).
    2. C.A. Baxter, C.W. Murray, D.E. Clark, D.R. Westhead and M.D. Eldridge, "Flexible docking using TABU search and an empirical estimate of binding affinity", Proteins, 33, 367-382 (1998).

    Screenscore
    M. Stahl and M. Rarey, "Detailed Analysis of Scoring Functions for Virtual Screening", J. Med. Chem, 2001, 44, 1035-1042.

    ShapeGauss, ChemGauss
    Mark R. McGann, Harold R. Almond, Anthony Nicholls, J. Andrew Grant, and Frank K. Brown, "Gaussian Docking Functions", Biopolymers, Vol. 68, pp. 76-90, 2003.

    Zap_Bind
    J. Andrew Grant, Barry T. Pickup and Anthony Nicholls, "A Smooth Permittivity Function for Poisson-Boltzmann Solvation Methods", Journal of Computational Chemistry, Vol. 22, No. 6, pp. 608-640, 2001.

  • What is the receptor box?
  • A FRED box is rectangular, aligned with the Cartesian axes, and defined by a molecule file where the X, Y, and Z ranges are defined by the max/min atom coordinates. Thus, a minimal box file consists of only two atoms, but any molecule file can be used (such as that of a bound ligand).

  • Can one atom satisfy multiple constraints?

    Yes. If multiple defined constraint volumes overlap, one atom located in the overlap volume can satisfy all the constraints (if the SMARTS patterns are matched).

  • How can a receptor box be defined?

    Regarding Fred 2.0+ box generation, here are some choices:

    1. Use Vida 1.1.2. (Vida 1.3 has some bugs in the Fred setup functionality.) Be sure to rename the resulting box file with the correct suffix. ".box" really should be ".pdb".
    2. Use the .xyz format (see example below, four lines only) and edit by hand. Only two points are needed to define a rectangular box (xmin,ymin,zmin) and (xmax, ymax, zmax).
      2
      test box
      C     -1.00000    2.00000   -4.00000
      C     15.00000    9.50000    7.50000
                                      
    3. Use any existing bound ligand molecule file. The dimensions of the box will be determined by the minimum and maximum x/y/z values of all atoms in the molecule. Use the "-addbox" parameter to increase the extent of each of the six sides of the box, if desired.
    4. Use the PyOEChem example program mol2box.py to generate a box file based on a bound ligand. This code can vary the extents of all six sides of the box independently.
    5. Future: Use a special graphical tool designed for Fred setup, now (August 2005) in development. May be bundled with Fred.
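
    Choices 2 and 3 can be combined in a few lines of Python: take the min/max coordinates of a ligand, optionally pad all six sides by the same amount, and write the two-atom .xyz box shown above. This is a sketch only, not the mol2box.py example program; the function names and the ligand coordinates are hypothetical.

```python
def box_from_coords(coords, pad=0.0):
    """Return (low, high) corners of the axis-aligned bounding box."""
    xs, ys, zs = zip(*coords)
    low = (min(xs) - pad, min(ys) - pad, min(zs) - pad)
    high = (max(xs) + pad, max(ys) + pad, max(zs) + pad)
    return low, high

def box_as_xyz(low, high, title="test box"):
    """Format the two corner points as a two-atom .xyz file."""
    lines = ["2", title]
    for point in (low, high):
        lines.append("C  %11.5f%11.5f%11.5f" % point)
    return "\n".join(lines) + "\n"

# Hypothetical ligand atom coordinates.
ligand = [(0.0, 2.0, -4.0), (15.0, 9.5, 7.5), (-1.0, 3.0, 0.0)]
low, high = box_from_coords(ligand)
print(box_as_xyz(low, high))
```

    Passing pad=1.0 to box_from_coords grows every side by one angstrom, similar in spirit to what -addbox does.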

ROCS

  • What's with the SUBTAN column?
  • The SUBTAN column in the Rocs report file stands for the 'substructure Tanimoto'. The subtan is calculated by performing a shape superposition of 'fit' molecules onto a reference molecule (query molecule). Once superimposed, atoms in the 'fit' molecule that are greater than 2.5 angstroms away from the closest atom of the reference molecule are discarded, and a shape Tanimoto is calculated with the remaining atoms. The subtan score was an attempt to provide minimal facility to do superstructure shape searches. In some cases it is possible to find a superstructure of the query molecule in a database. In practice, this is an imperfect solution and subshape searching can often be done in better ways, such as via Tversky similarity.
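
    The atom-discarding step can be sketched as a distance filter. The code assumes the fit molecule has already been superimposed on the reference and covers only the 2.5 angstrom selection; the shape Tanimoto on the remaining atoms is not computed here. The function name and coordinates are hypothetical.

```python
import math

def atoms_near_reference(fit_coords, ref_coords, cutoff=2.5):
    """Keep fit atoms within cutoff of the closest reference atom."""
    def distance(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    kept = []
    for atom in fit_coords:
        nearest = min(distance(atom, ref) for ref in ref_coords)
        if nearest <= cutoff:  # farther than the cutoff: discarded
            kept.append(atom)
    return kept

reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
fit = [(0.5, 0.5, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
print(atoms_near_reference(fit, reference))
# [(0.5, 0.5, 0.0), (3.0, 0.0, 0.0)]
```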

  • What's the difference between the -chemff and -optchem options?
  • Both options involve the same chemical force field defined by SMARTS patterns in an external file. -chemff defines and includes a chemical force field for final scoring after optimization. The argument to -chemff is the file defining the force field. -optchem includes the defined chemical force field in overlay optimization, so the resulting aligned coordinates will be affected. Thus, -optchem requires -chemff to define the field. Using -chemff alone means optimization of alignment will be based on shape only.

  • Does ROCS consider hydrogens?
  • No. This is intentional, the prevailing judgment being that hydrogens do not contribute meaningfully to shape comparison in the context of ROCS. The OEShape Toolkit does provide this functionality.

  • Is the ROCS shape similarity calculation analytical or grid based?
  • Grid based. The OEShape Toolkit provides several levels of accuracy including a high-precision approximation to analytical. For speed ROCS uses the grid based shape comparison.

VIDA

    NOTE: pertains to Vida v2.0 unless otherwise noted.

  • Does VIDA require a license?
  • As of version 2.0, Vida utilizes the same licensing system as other recently released programs. A single common license file normally identified by environment variable OE_LICENSE is required. If a license file is not found, Vida will prompt the user for its location, and store that location for future use.

  • How do I set environment variable OE_DIR or OE_LICENSE on Windows?
  • On Win2k/WinXP, use Start->Settings->Control Panel->System->Advanced->Environment Variables

  • Resolving VIDA/Windows error: "The dynamic link library MSVCP60.dll could not be found."
  • MSVCP60.dll is part of the Microsoft Visual C++ version 6 runtime library, which is available publicly from Microsoft bundled in Vcredist.exe. See microsoft.com.

  • Where can I install a custom startup script?
  • With Vida 2.0, a mechanism was introduced whereby Vida-API (Python) commands may be invoked at startup. At present, this file must be named startup.py and must be located as follows:

    Windows:
    C:\Documents and Settings\username\OpenEye\VIDA2\startup.py
    Unix/linux:
    $HOME/.OpenEye/VIDA2/startup.py

    Note that the example pdbopen.py provided with Vida can be used as a startup script. This adds a menu button to download a PDB file by its ID.

  • Will Vida work with all video cards?
  • No. At this time, the following cards are known to be compatible:

    • NVidia (several)
    • ATI FireGL V5000

    These are not compatible:

    • ATI FireGL V3200

    For best results, use the latest available graphics driver software.

    The amount of RAM that has been allocated to the video card should be 128 MB or more for best results. Most integrated graphics cards use system RAM instead of having their own video RAM. Video RAM should be a configurable parameter, on Windows, in the display settings of the control panel.

Filter

  • What's the difference between filter_lead.txt and filter_drug.txt?
  • For lead candidate filtering, filter_lead.txt is recommended, and corresponds with the default settings. It should filter things a modeler wouldn't want to show a medicinal chemist. filter_drug.txt is more discriminating and should filter everything that doesn't look like a known drug (including many useful lead compounds).

OEShape Toolkit

  • How are ROCS and the OEShape toolkit related?
  • ROCS is based on and built upon the OEShape Toolkit and uses the shape similarity measure in the toolkit API. The OEShape Toolkit includes source code examples which are essentially minimalistic ROCS implementations. At the time of this writing, the ROCS product is included when the OEShape Toolkit is licensed.

OEChem

  • What C++ compiler is required?
  • OEChem distributions are generally labelled clearly to indicate the corresponding compiler and version. Additional compiler support information can be found at the OpenEye Platform Support document, in the accompanying technical notes.

  • How should OEChem be installed?
  • As of OEChem 1.2, OEChem and other products are to be installed in a consistent way into a directory named "openeye" typically at /usr/local/openeye. See also Where/how should I install OpenEye software?

    No further installation is needed for the C++ package, but if code is to be compiled in user directories then /usr/local/openeye/oechem/include needs to be in the include path and /usr/local/openeye/lib/ needs to be in the link path. This can be accomplished simply by using the Makefile provided with the example code.

    For the Python "openeye" package, further installation is required as described in the INSTALL file. Either $PYTHONPATH must be redefined to include the PyOEChem directory openeye/wrappers/python/ (recommended), or the OpenEye directory must be copied recursively to the Python site-packages directory. Also, on Unix/Linux platforms $LD_LIBRARY_PATH must be redefined to include the openeye/wrappers/libs/ directory. On MacOS the environment variable is $DYLD_LIBRARY_PATH. On IRIX also set $LD_LIBRARYN32_PATH and $LD_LIBRARY64_PATH.

  • OEChem or OELib?
  • Matt Stahl explains the history and motivation behind OEChem: OEChem or OELib?

  • When reading an MDL file, what does this mean: Warning: Stereochemistry corrected on atom number N?
  • The problem is that the configuration specified by up and down bonds around a tetrahedral center in a 2D depiction is potentially ambiguous or blatantly incorrect. An excellent review of the issues can be found in the CORINA manual: http://msdlocal.ebi.ac.uk/docs/chem_comp/corina.ps, p19-21. This "Stereochemistry corrected" warning is issued when OEChem takes a "best guess" at what the chemist intended for configurations that CORINA lists as "incorrect". The "Invalid stereochemistry specified on atom number N" warning is for an arrangement of wedge-and-hash bonds that even OEChem can't make sense of, e.g., 4 up bonds from an atom.

  • What file formats does OEChem handle?
  • Here's the current status (OEChem 1.4.2):

    code  ext    format                read?  write?
    1     smi    SMILES                yes    yes
    2     mdl    MDL Mol               yes    yes
    3     pdb    PDB                   yes    yes
    4     mol2   Tripos MOL2           yes    yes
    5     bin    OEBinary v1           yes    no
    6     tdt    Daylight TDT          no     no
    7     ism    Isomeric SMILES       yes    yes
    8     mol2h  MOL2 with H           yes    yes
    9     sdf    MDL SDF               yes    yes
    10    can    Canonical SMILES      yes    yes
    11    mf     Molecular Formula     no     yes
    12    xyz    XYZ                   yes    yes
    13    fasta  FASTA                 yes    yes
    14    mopac  MOPAC                 no     yes
    15    oeb    OEBinary v2           yes    yes
    16    mmod   Macromodel            yes    yes
    17    sln    Tripos SLN            no     yes
    18    rdf    MDL RDF               yes    no
    19    cdx    ChemDraw CDX          yes    yes
    20    skc    MDL ISIS Sketch File  yes    no

  • What is the difference between the high- and low-level molecule file writers?
  • There is an important functional difference between the high-level molecule writing method OEWriteMolecule() (or equivalently, the overloaded << operator in C++) and the low-level format-specific writers such as OEWriteMDLFile(). With the high-level writers, OEChem takes responsibility for normalizing the state of the molecule to correspond with the output format. For example, before writing an MDL file, OEChem will apply the MDL aromaticity model; before writing a canonical SMILES, OEChem will apply the Daylight aromaticity model, etc. The low-level writers, in contrast, require that the programmer apply these normalizations manually, and provide the flexibility to do otherwise and to control chemical content and file format independently. For example, it is possible to write an MDL file with the Tripos aromaticity model, or a Kekule canonical SMILES. For details as to which normalizations are applied for each file format, consult the theory manual.

    A related issue is the roles of the I/O "flavors" which are specified by the SetFlavor() method for an oemolistream or oemolostream, and the "flags", which are specified via an optional argument to the low-level I/O functions. The low-level flags specify choices which must be made at the time of file I/O, such as whether PDB TER records delimit molecules or molecule components. The flavors do not affect the low-level functions, and their effects are a superset of the effects of the flags. Flavors additionally control molecule standardization functions such as aromaticity model perception.

    Low-level molecular input functions:

    • OEReadCDXFile
    • OEReadFASTAFile
    • OEReadMacroModelFile
    • OEReadMDLFile
    • OEReadMol2File
    • OEReadMolecule
    • OEReadMOPACFile
    • OEReadOldBinary
    • OEReadPDBFile
    • OEReadSketchFile
    • OEReadXYZFile

    Low-level molecular output functions:

    • OECreateAbsSmiString
    • OECreateCanSmiString
    • OECreateIsoSmiString
    • OECreateSlnString
    • OECreateSmiString
    • OEWriteCDXFile
    • OEWriteFASTAFile
    • OEWriteMacroModelFile
    • OEWriteMDLFile
    • OEWriteMOPACInputFile
    • OEWriteMol2File
    • OEWritePDBFile
    • OEWriteMolecule
    • OEWriteXYZFile

  • Windows and MS Visual Studio/C/C++: How do I get started?
  • First, be sure to see the README pertaining to Windows development available in the OEChem section of the download page, which describes the required compiler versions. In the MSVC distribution, the Makefile at openeye/examples/oechem/ contains the recommended compilation flags for use with MS Visual Studio/C/C++ (although most developers will not use make). As of OEChem 1.3.3, example project files are provided with the distribution.

    Note also that ZLIB must be installed; to obtain it, see zlib.org.

    Also: see Using OEChem and Ogham with Microsoft Visual Studio .NET [PDF]

  • Why doesn't OEGetSDData(mol,"foo") work on my OEMol?
  • OEMol objects, such as returned by GetOEMols, are multiconformer molecules. When an OEMol is read from an SDF file, the SD data is attached to the conformations, not the parent mol. So, this will work:

    from openeye.oechem import *
    ims = oemolistream("input.sdf")  # hypothetical input file name
    for mol in ims.GetOEMols():
      for conf in mol.GetConfs():
        print("This conf name is %s" % OEGetSDData(conf, "Name"))

    In contrast OEGraphMol objects are single conformer. When dealing with single conformer files these should be used. Then this will work:

    for mol in ims.GetOEGraphMols():
      print("This molecule name is %s" % OEGetSDData(mol, "Name"))

    Note that this is true even when a "multiconformer molecule" only has one conformation, or conformations are actually 2D, or even are all zeroes.

    This may seem complicated, but this approach is consistent with OEChem's ability to handle multiconformer molecules across multiple formats.

  • Is OEChem thread safe?
  • OEChem is designed to be thread safe, i.e., re-entrant, but as of v1.4.0 there are known areas where OEChem is not re-entrant, namely, OEThrow, and generic data (e.g. SetData/GetData). In addition, C++ STL streams are normally not thread safe, so OEChem streams are also not thread safe. Multithreaded code can be written by selective use of OEChem, but the forbidden zones are not yet well established. At the very least, separate threads should not operate on the same molecules.

  • How should I get started using SMIRKS and OEChem reaction processing?
  • See the SMIRKS Primer.

OEWrappers - Python

  • Why Python?
  • Python is an object oriented interpreted language which fits well as a wrapper around the native C++ OEChem API. Almost all the C++ constructs and concepts translate directly into Python. Python is highly recommended for its power, convenience, and extensibility. If you haven't already, try it, you'll like it!

  • What Python version is required?
  • As of OEChem version 1.3.3 (May 2005), both Python 2.3 and 2.4 are supported, and Python 2.2 is dropped. Python packages are major version specific (1st decimal place), so installers should ensure that a supported major version is installed. Minor versions should not require specific builds.

  • How can I run Py-OEChem from CYGWIN?

    Py-OEChem is not available for use with the Python bundled with CYGWIN. However, by installing "Python for windows" from python.org and running that executable (normally /cygdrive/c/python23/python.exe) then Py-OEChem scripts may be run from CYGWIN. The one caveat is that absolute file pathnames must be specified in DOS format -- with backslashes, etc. -- though relative pathnames can be in CYGWIN/UNIX format.
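
    Converting an absolute CYGWIN path to DOS format before handing it to a script can be sketched as below. The helper is hypothetical and handles only the /cygdrive/<letter>/ form; other paths are returned unchanged. CYGWIN's own cygpath utility (cygpath -w) covers the general case.

```python
def cygwin_to_dos(path):
    """Convert /cygdrive/<letter>/... to <LETTER>:\\...; else return as is."""
    prefix = "/cygdrive/"
    if path.startswith(prefix) and len(path) > len(prefix):
        rest = path[len(prefix):]
        drive, _, tail = rest.partition("/")
        if len(drive) == 1 and drive.isalpha():
            return drive.upper() + ":\\" + tail.replace("/", "\\")
    return path

print(cygwin_to_dos("/cygdrive/c/python23/python.exe"))
# C:\python23\python.exe
print(cygwin_to_dos("data/mols.sdf"))
# data/mols.sdf
```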

Lexichem

  • What is the difference between OpenEye, Systematic, and IUPAC names?
  • The core issue is that the IUPAC standards for chemical nomenclature are a moving target. They were first published in 1979, revised in 1993, and are currently in final draft for a third revision in 2005. As time goes by, IUPAC attempts a form of social manipulation on geological timescales by deprecating common names that have traditionally been tolerated, moving towards more and more systematic names.

    The bullseye of this moving target is what the 2005 standard refers to as the PIN (preferred IUPAC name), which is often the single name blessed by IUPAC for the compound. In addition, the standard allows what are called acceptable names; for example, many of the names preferred in the earlier 1979 and 1993 standards now have acceptable status.

    The three categories you point out (kind of) define the extremes of these IUPAC categories. The "IUPAC" name style in Ogham is intended to precisely follow the 2005 PIN. Future releases of Ogham may even have explicit IUPAC2005, IUPAC93 and IUPAC79 name styles that implement those standards more accurately. The "Systematic" name style is the fully systematic name, and predicts where the IUPAC standards are headed. The "OpenEye" name style uses IUPAC-allowed names that are more familiar to a chemist. Conceptually, OpenEye names are closer to the names found in chemical supplier catalogs or the IUPAC93 and IUPAC79 standards. A good rule of thumb is that OpenEye names are typically shorter than IUPAC names. Other than whether a name is still "allowed" by IUPAC, the border between OpenEye names and traditional names (which are the final resting place for archaic names of historical interest) is more blurred.

    So for some examples of the differences:

    O "water" in OpenEye, but "oxidane" in IUPAC and Systematic.
    C#C "acetylene" in OpenEye and IUPAC, but "ethyne" in Systematic.
    *Nc1ccccc1 "anilino" in OpenEye and IUPAC, but "phenylamino" in Systematic.
    *C(=O)C "acetyl" in OpenEye and IUPAC, but "ethanoyl" in Systematic.
    *O[N+]#[C-] "fulminato" in OpenEye, but "isocyanooxy" in IUPAC and Systematic.
    CC(=O)C "acetone" in OpenEye, but "propan-2-one" in IUPAC/Systematic.
    C12C3C4C1C5C4C3C25 "cubane" in OpenEye, but "BLAH" in IUPAC and Systematic.
    C(=O)O "formic acid" in OpenEye/IUPAC, but "methanoic acid" in Systematic.

    As an example of traditional names, *S is "sulfanyl" in OpenEye/IUPAC/Systematic but "mercapto" in traditional.

    You can see that OpenEye names, which are pretty much the names that you've been using until now, are "reasonable" names. Naming water as "oxidane" and disallowing "indane" and "cubane" might push chemists' tolerance to the limit. After all, they are still preferred names in government agencies until the new standard is ratified and comes into effect (in 2005).

PVM and Parallelization

  • What factors affect the performance of OpenEye tools parallelized with PVM?
  • Several factors can degrade parallel scalability below the theoretical maximum, which is N times the single-CPU speed for N CPUs. The master node is devoted to traffic control, not computation, so the first slave (the second CPU in total) provides no immediate advantage. The time required for communication between master and slaves also reduces performance. This effect increases with (1) more slaves, (2) a slower network, (3) a faster single-CPU program, and/or (4) less compact file formats. The point of diminishing returns therefore differs depending on the program, input files, and network. In general, verbose file formats such as SDF or MOL2 should not be used; use compressed OEBinary (.oeb.gz) if possible. In general, ROCS is faster than Omega, which is faster than FRED, so the number of slaves at the point of diminishing returns increases in that order.
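
    A toy model can make these trade-offs concrete. The sketch below is an assumption for illustration only, not a description of how the OpenEye tools are implemented: it supposes that slaves compute in parallel while every record passes through the single master, so throughput is capped by whichever is slower. The function and parameter names are hypothetical.

```python
def estimated_speedup(n_slaves, compute_s, transfer_s):
    """Speedup over a single-CPU run in the toy model.

    n_slaves   -- number of slave processes (the master does no computation)
    compute_s  -- single-CPU compute time per record, in seconds
    transfer_s -- master-side communication time per record, in seconds
    """
    if n_slaves < 1:
        raise ValueError("need at least one slave")
    if transfer_s <= 0:
        # No communication cost: speedup is limited only by the slave count.
        return float(n_slaves)
    # Slaves scale linearly until the master cannot feed them any faster.
    return min(float(n_slaves), compute_s / transfer_s)


def saturation_point(compute_s, transfer_s):
    """Slave count beyond which adding slaves no longer helps in this model."""
    return compute_s / transfer_s
```

    In this model one slave gives a speedup of 1 (two CPUs, no gain), a faster program (smaller compute_s) or a bulkier file format (larger transfer_s) lowers the saturation point, and a slow program such as a docking run saturates much later.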

  • What is the meaning of these error messages?
  • Warning: Could not start pvm daemon on host 'foobar'
    Warning: Unable to launch slave on host foobar, no pvm daemon

    Often the cause of these errors is simply that the master is unable to connect with the slave via rsh or ssh (whichever is defined by $PVM_RSH). You should be able to run the command "rsh foobar 'echo $PVM_ROOT'". Another possibility is that PVM is not installed correctly.

    Warning: Slave 1 on host foobar died and could not be restarted

    This implies that the slave executable launched but then exited improperly. If error messages were generated these may be found on the slave in /tmp/pvml.<uid>.
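
    The rsh/ssh connectivity check described for the first two warnings can be scripted. This is a minimal sketch with hypothetical function names; falling back to rsh when $PVM_RSH is unset is an assumption, so adjust it to match your installation.

```python
import os
import subprocess


def remote_shell_command(host, env=None):
    """Build the command that asks a slave host to print its $PVM_ROOT."""
    env = os.environ if env is None else env
    # Assumption: use rsh when PVM_RSH is not set.
    shell = env.get("PVM_RSH", "rsh")
    # The remote command is passed as one argument so that $PVM_ROOT is
    # expanded on the slave, not on the master.
    return [shell, host, "echo $PVM_ROOT"]


def check_slave(host, timeout_s=10):
    """Run the check and return (ok, output); ok is False on any failure."""
    try:
        proc = subprocess.run(
            remote_shell_command(host),
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, str(exc)
    output = proc.stdout.strip()
    return (proc.returncode == 0 and bool(output)), output
```

    If check_slave("foobar") reports failure or an empty output, the master either cannot reach the host or PVM_ROOT is not defined there.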

  • What is the correct slave hostname?
  • Run the command-line utility pvm on the master to find the answer; the same utility can also run other diagnostics.

    $ pvm
    pvm> conf
    conf
    3 hosts, 2 data formats
                        HOST     DTID     ARCH   SPEED       DSIG
          beavis.mypharma.com    40000   DARWIN    1000 0x0658eb59
                     butthead    80000    LINUX    1000 0x00408841
                         mork   140000 LINUXI386    1000 0x00408841
    
    pvm> add mindy
    add mindy
    0 successful
                        HOST     DTID
                        mindy Can't start pvmd
    
    Auto-Diagnosing Failed Hosts...
    mindy...
    Verifying Local Path to "rsh"...
    ...
          

    From this output we see that nodes butthead and mork must be specified with their simple hostname, whereas beavis requires its fully qualified hostname with domain. Using the add command tests and diagnoses the master-slave connection.
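
    If the host names are needed in a script, the conf table can be parsed. The sketch below assumes the column layout shown in the transcript above (a header row starting with HOST, then one row per host); the function name is hypothetical.

```python
def parse_pvm_conf(text):
    """Return the host names listed in the output of the pvm `conf` command."""
    hosts = []
    n_columns = None  # set once the header row has been seen
    for line in text.splitlines():
        fields = line.split()
        if n_columns is None:
            # Skip everything up to and including the header row.
            if fields and fields[0] == "HOST":
                n_columns = len(fields)
            continue
        if len(fields) != n_columns:
            # A blank line or a differently shaped line ends the table.
            break
        hosts.append(fields[0])
    return hosts
```

    Applied to the transcript above, this returns beavis.mypharma.com, butthead and mork, which are the names to use when specifying slaves.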

  • Installing PVM
  • (This answer is adapted from the Fred 2.2 manual.)

    PVM (Parallel Virtual Machine) is a freely available library for running processes on more than one processor on one or more machines. OE applications can take advantage of PVM to distribute docking jobs over multiple processors. To do this, PVM must be installed on all the machines over which OE applications will be distributed. The PVM source is freely available from

    http://www.csm.ornl.gov/pvm/pvm_home.html

    however, many Linux distributions, and some Unix versions, include PVM by default. At the time of this writing (December 2006), OE applications are built with PVM version 3.4.4, but should also work with PVM version 3.4.3. PVM is not supported for Windows.

    To use OE applications with PVM you must do one of the following:

    1. Place a link or copy of the OE application executable in $PVM_ROOT/bin/$PVM_ARCH
    2. Define the environment variable PVM_PATH, which names the directory in which the OE application executable resides.

    The environment variables PVM_ROOT and PVM_ARCH should be defined globally as part of the PVM installation. PVM_PATH is generally a user-defined environment variable, and must be defined for all shells (i.e., it may not be defined only in the shell from which the OE application was launched).

    NOTE : There is no specific slave executable. The executable distributed for OE applications serves as both a master and slave PVM program as well as a single processor version.
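
    The two placement options above can be checked with a short script. This is a minimal sketch with hypothetical function names; it treats PVM_PATH as a single directory, as described above, and only tests whether an executable file exists in either location.

```python
import os


def pvm_search_dirs(env):
    """List the directories where the executable is expected, in order."""
    dirs = []
    root = env.get("PVM_ROOT")
    arch = env.get("PVM_ARCH")
    if root and arch:
        # Option 1: $PVM_ROOT/bin/$PVM_ARCH
        dirs.append(os.path.join(root, "bin", arch))
    if env.get("PVM_PATH"):
        # Option 2: the directory named by $PVM_PATH
        dirs.append(env["PVM_PATH"])
    return dirs


def find_executable(name, env):
    """Return the full path of the first matching executable, or None."""
    for directory in pvm_search_dirs(env):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

    For example, find_executable("fred", os.environ) returning None on a slave machine suggests that neither option has been set up there.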

    OE applications currently PVM-enabled:

    • eon
    • fred
    • omega
    • rocs
    • szybki

QuacPac

  • How is a canonical tautomer generated?
  • The program tautomers generates alternate tautomers and a canonical tautomer using methods developed over several years by Roger Sayle and others. Currently, the best references for these methods are:

    1. Canonicalization and Enumeration of Tautomers, EuroMUG '99, Cambridge, UK, October 1999.
    2. Hooked on Protonics, 224th ACS National Meeting, Boston, August 2002.

Chemistry

  • Anilinic nitrogen conformation: a notorious exception.
  • The handling of anilinic nitrogen (trivalent, three-connected, and single-bonded to an aromatic ring) is problematic, since it is neither fully sp3 nor sp2 hybridized: it is not planar with respect to its three attached atoms, but not tetrahedral either. This assertion is borne out to some degree by x-ray crystallographic data, though crystal structures will reflect averaging of resonant chiralities. Since two R/S chiralities are possible, the planar conformation may be used, and is useful, as an "average". This is the basis of the MMFF94S variant of MMFF94 (Halgren et al.). Since a bound ligand will tend to adopt one conformation, this approach makes less sense for reproducing a bioactive conformation. OpenEye software, in general, will reflect this understanding of anilinic nitrogen, but may also allow for the planar model (e.g., MMFF94 and MMFF94S are both available).

    REF: "New Parameterization of the Cornell et al. Empirical Force Field Covering Amino Group Nonplanarity in Nucleic Acid Bases", Ryjacek, Kubar and Hobza, J. Comp. Chem. 24: 1891-1901, 2003.


© 1997-2008 OpenEye Scientific Software