OpenEye Toolkits - Release Notes

OEToolkits 1.7.0

This is a new release of the OpenEye Toolkits with versions of the following libraries:

OEChem TK: 1.7.0
Grid TK: 1.3.2
Lexichem TK: 1.9.0
Ogham TK: 1.7.0
Omega TK: 2.3.3
Quacpac TK: 1.4.0
Shape TK: 1.7.1
Spicoli TK: 1.0.2
Szybki TK: 1.3.4
Zap TK: 2.1.1
  • This release starts the transition from the old LaTeX-based documentation system to the new reStructuredText-based system.
  • Starting with OEChem 1.7.0, the core OpenEye toolkits (OEPlatform, OESystem, OEChem, OEBio) are considered “thread-safe”. The Thread Safety section in the OEChem theory manual explains in more depth what it means to be “thread-safe”. Many minor changes were made to achieve this; only the major ones are listed in the release notes. No guarantee is made about any other library.

Note

The Python global interpreter lock is released around computationally expensive OEChem functions. Even so, the utility of threading in Python is still limited.

  • Many example programs have been renamed and cleaned up to provide better support across all three languages.
  • Documentation example code is now given a descriptive name instead of an out-of-date chapter and section number.

OEChem 1.7.0

New features

  • Canonical isomeric SMILES generation has been significantly improved (OECreateIsoSmiString). On a test set of 9,962,003 compounds (4,025,817 with atom or bond stereo), OEChem 1.6.1 would generate different canonical isomeric SMILES for 135,985 of the compounds depending on random reordering of the atoms. This failure rate has been reduced to just 78 compounds, a 99.94% improvement. Furthermore, generation has been optimized so that it is roughly 10-30% faster than the OEChem 1.6.1 algorithm.

  • OEReadMDLQueryFile has been added to read MDL query files into the OEQMol object. This allows for easy integration of MDL query files with the swath of OEChem tools based upon query molecules. The MDL query based substructure search was tested on a set of 655 query files.

    See also

    The Substructure Search with MDL Queries chapter in the OEChem theory manual.

  • OEChem also supports MDL reaction-based library generation. A reaction file can be imported into an OEQMol object by calling the OEReadMDLReactionQueryFile function. The OELibraryGen object can then be initialized with the imported reaction. Library generation was extensively tested on a set of 160 diverse reactions.

    See also

    The MDL Reaction Query File section in the OEChem theory manual.

  • The MiniMol implementation of OEGraphMol has been made more robust and optimized significantly for both speed and size. The previous arbitrary limit of 1000 atoms and bonds has been raised to 2^{15} (32,768). This implementation requires that Compress be called on the molecule after construction to maximize space efficiency, but does not require that UnCompress be called before the molecule can be used. This makes it an ideal molecule implementation for in-memory substructure searching.

  • Added oemolithread and oemolothread for threaded molecule I/O. OEReadMolecule and OEWriteMolecule are thread-safe on oemolithread and oemolothread respectively.

    See also

    The Input and Output Threads section in the OEChem theory manual.

  • Added implementation of Zap 9 Radii from [Nicholls-2008] through the OEAssignZap9Radii function.

  • Added the OEShortestPath function, a frequently requested algorithm.

  • Added OEIdxSelected predicate for easy sub-setting of a molecule using an array of bool indexed by indices.

  • Added OECount function for easily counting atoms or bonds in a molecule based on arbitrary predicates.
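
The predicate-based helpers in the last two items can be pictured with a small standard-library sketch. This is not OEChem code: the Atom struct, the IndexSelected functor and the countMatching function are hypothetical stand-ins for an atom, an OEIdxSelected-style predicate and an OECount-style counter, and their interfaces are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an atom; only the index and element matter here.
struct Atom {
    std::size_t idx;
    int atomicNum;
};

// Predicate that is true for atoms whose index is flagged in a bool array.
// (Assumed analogue of the idea behind OEIdxSelected, not its real interface.)
class IndexSelected {
public:
    explicit IndexSelected(const std::vector<bool>& mask) : mask_(mask) {}
    bool operator()(const Atom& a) const {
        return a.idx < mask_.size() && mask_[a.idx];
    }
private:
    std::vector<bool> mask_;
};

// Count the atoms for which an arbitrary predicate returns true.
template <typename Pred>
std::size_t countMatching(const std::vector<Atom>& atoms, Pred pred) {
    std::size_t n = 0;
    for (const Atom& a : atoms) {
        if (pred(a)) ++n;
    }
    return n;
}
```

Because the counter only requires something callable on an atom, the same function works with the index mask, a lambda, or any other predicate.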

Major bug fixes

  • Fixed a segmentation fault in SMARTS parsing when the SMARTS contained the [<atomic mass>H<charge>] combination.

  • Fixed invalid address alignment crash in the sketch file reader on Sparc.

  • SD tag names were limited to 75 characters. This release raises the limit to 4096.

  • Several major changes were made to the library generation process to ensure that products are generated with a valid Kekulé form. If there is an explicit hydrogen on the product side of the input reaction, then OELibraryGen adds hydrogens to the generated products accordingly, and the first Kekulization is based on this reaction specification. If that is unsuccessful, i.e. OEKekulize returns false, alternatives are tried by adding and removing implicit hydrogens from specific atoms until a valid (but arbitrary) Kekulé form is identified.

    See also

    The Product Kekulization section in the OEChem theory manual.

  • A significant number of improvements were made to OEChem’s PDB file parser (OEReadPDBFile) in the area of atomic number determination. By default, the PDB atomic symbol field is used to determine the atom type. However, for a subset of “known” residues, including all of the amino and nucleic acids, we continue to use the atom name heuristics, which are more reliable for this subset. Currently, there are no conflicts or discrepancies between the PDB atomic symbol and the atomic number perceived by the PDB file reader in the entire wwPDB repository.

  • The hydrogen placement method (OESet3DHydrogenGeom) has been refined, with improved numerical stability and increased speed for most simple placement operations. It now includes special heuristics for carboxylic acids and for toluene-like methyl rotors attached to aromatic rings, fleeing heuristics for alkanes, and perpendicular support for allenic systems. Additionally, hydroxyl rotors are placed using a quick local scan for strong acceptors (alpha acceptors) within a 3.0 Angstrom radius. Additional hydrogen bond length data, specifically for As, Ge, Se and Te, was also added ([Sutton-1958]).
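
As an illustration of the "PDB atomic symbol field" mentioned in the parser item above, the following sketch pulls the element symbol out of an ATOM or HETATM record. It assumes the fixed-column layout of the published PDB format, in which the element symbol occupies columns 77-78; the function name is hypothetical, and this is not the code used by OEReadPDBFile.

```cpp
#include <cstddef>
#include <string>

// Return the element symbol stored in columns 77-78 of a PDB ATOM/HETATM
// record, with surrounding blanks removed. An empty string means the field
// is absent or blank, in which case a reader has to fall back on other
// information such as the atom name.
std::string pdbElementField(const std::string& line) {
    const std::size_t kStart = 76;  // column 77, zero-based
    if (line.size() <= kStart) return "";
    const std::string field = line.substr(kStart, 2);
    const std::size_t first = field.find_first_not_of(' ');
    if (first == std::string::npos) return "";
    const std::size_t last = field.find_last_not_of(' ');
    return field.substr(first, last - first + 1);
}
```

Records truncated before column 77 are common in older files, which is one reason a name-based fallback remains useful.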

Minor bug fixes

  • OEGraphMol.operator= now does the same thing as the OEGraphMol copy constructor. This allows OEGraphMol to be used directly inside STL containers without losing the selected OEMolBase implementation. Previous versions of OEChem would change the molecule implementation unexpectedly if the STL container needed to relocate the object in memory. However, the following code changes meaning with the switch from 1.6.1 to 1.7.0:

    OEGraphMol gm1;
    OEGraphMol gm2(OEMolBaseType::OEDBMol);
    gm1 = gm2;
    // gm1 is now an OEDBMol implementation,
    // in 1.6.1 it would be the default implementation
    
  • The OEMolBaseType_OEDBMol implementation can now copy construct in compressed mode.

  • OEReadMolecule would sometimes return true when an empty molecule was present in the input file. The high-level OEReadMolecule function now never returns true when the molecule does not contain any atoms. Low-level molecule routines should be used if empty molecules are desired.

  • Stabilized the atom output order from successive calls to OEWriteMolecule to MOL2 when the molecule contained residues.

  • Deprecated the non-OE prefixed SmartsLexReplace function name. Renamed to OESmartsLexReplace.

  • Fixed a bug in substructure search that occurred when OESubSearch was initialized with a SMARTS string starting with a hydrogen atom (such as [#1]O[C,N,S,P]=O). The atoms were reordered even when the allowReorder parameter was set to false.

  • When initializing an OESubSearch object with a SMARTS pattern, the ‘reorder’ parameter now defaults to false. This parameter is currently ignored, i.e. the atom order in the returned matches is always identical to the atom order in the SMARTS pattern.

  • Fixed a bug in OEGetAromatic and OEGetBondOrder to retrieve aromaticity/bond order from query expressions.

  • A restriction is added to the interpretation of SMIRKS in ‘strict’ mode. This requires that all atom maps in the reaction have to be pairwise when OELibraryGen is initialized with a SMIRKS string or OEQMolBase object.

  • DBREF, SEQADV, MTRIX1, MTRIX2 and MTRIX3 PDB data lines are now kept when parsing the file. The data can be accessed with OEGetPDBData.

  • Fixed a bug in OEWritePDBFile that wrote the MODEL number at the wrong position in a PDB file.

  • Fixed the bug in the SMILES canonicalization process for special cases when the input SMILES contains R-Group information (such as [R1]c1ccc(cc1c2cccc(c2)[R3]).

  • Tweaked the OpenEye charge model such that three-valent beryllium has a charge of negative one. This makes F[Be-](F)F equivalent to the charge-separated form [Be+2].[F-].[F-].[F-], fixing 61 ligands in the PDB data set.

  • Improved the bond order perception (OEPerceiveBondOrders) support:

    • for arsenic acids, including the ‘cacodylate’ ion (example in 3DUE pdb).
    • for uric acid and related heterocycles.
    • for azobenzenes and similar compounds. All acyclic ‘nitrogen(2)-nitrogen(2)’ bonds now undergo a strict distance check independent of the (single) bond angle at each end. This corrects 1SRE, 1SRF and 2GBY pdb entries.
    • for benzoquinones and anthraquinones.
  • Addressed a problem in OEAssignAromaticFlags that caused long processing times when reading (OEReadMDLFile) some pathological pseudo-fullerenes.

  • Added PDB support for the following:

    • sidechain recognition for the RNA residues ‘YG’ and ‘H2U’
    • naming of PDB residue ‘BME’
    • the N-terminal modification ‘FOR’
    • the cofactor ‘FMT’ (which is “formic acid” or “formate”)
  • The following problems were fixed in the PDB file parser (OEReadPDBFile):

    • ‘anomalous mercury’ problem for the residue ‘DVA’ (example in 2IZQ pdb)
    • spurious Holmium problem (in residues ‘CEH’ and ‘NGR’)
    • naming of the ‘ P ’, ‘ O1P’, ‘ O2P’ and ‘ O3P’ atoms in the non-standard PDB residues ‘PTR’, ‘SEP’ and ‘TPO’.
  • Improved the initial partial charge parameterization (OEMMFF94InitialCharges) for selenium (atom type 83).

  • Improved the perception of reactions in ISIS Sketch files. Most importantly, we now support the ‘rxnarrow’ object generated by recent versions of ISIS, such as MDL Draw and Symyx Draw. We also now allow the sketch to contain multiple lines, provided that only one has an arrow, and allow the arrow direction and arrow co-ordinates to be specified in arbitrary order.
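
The OEGraphMol assignment change listed first among the minor bug fixes above can be pictured with a generic handle class. The sketch below is not OEChem code: MolHandle, Impl, DefaultImpl and CompactImpl are hypothetical names, and the only point is that when copy assignment clones the source's implementation, exactly as the copy constructor does, objects keep their implementation type while a container relocates them.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical implementation hierarchy standing in for molecule back ends.
struct Impl {
    virtual ~Impl() = default;
    virtual std::unique_ptr<Impl> clone() const = 0;
    virtual std::string name() const = 0;
};
struct DefaultImpl : Impl {
    std::unique_ptr<Impl> clone() const override {
        return std::unique_ptr<Impl>(new DefaultImpl(*this));
    }
    std::string name() const override { return "default"; }
};
struct CompactImpl : Impl {
    std::unique_ptr<Impl> clone() const override {
        return std::unique_ptr<Impl>(new CompactImpl(*this));
    }
    std::string name() const override { return "compact"; }
};

// Handle whose copy constructor and copy assignment both adopt the
// implementation type of the source object.
class MolHandle {
public:
    MolHandle() : impl_(new DefaultImpl) {}
    explicit MolHandle(std::unique_ptr<Impl> impl) : impl_(std::move(impl)) {}
    MolHandle(const MolHandle& rhs) : impl_(rhs.impl_->clone()) {}
    MolHandle& operator=(const MolHandle& rhs) {
        if (this != &rhs) impl_ = rhs.impl_->clone();
        return *this;
    }
    std::string implName() const { return impl_->name(); }
private:
    std::unique_ptr<Impl> impl_;
};

// Append n copies; the vector reallocates and copies elements as it grows.
std::vector<MolHandle> makeMany(const MolHandle& proto, std::size_t n) {
    std::vector<MolHandle> v;
    for (std::size_t i = 0; i < n; ++i) v.push_back(proto);
    return v;
}
```

With this design, assigning a "compact" handle to a default-constructed one changes the target's implementation, which mirrors the behavior change described in the item.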

OEGrid 1.3.2

New features

  • The user is no longer required to call OEInitGridHandlers in order to attach grids to molecules and then write them out to OEB. This occurs at library link time now.

    See also

    The Generic Data section in the Grids chapter for details of attaching grids to molecules.

Major bug fixes

  • OEReadGrid would crash on gzipped files. It now properly uncompresses the data before reading it into a grid.

OESystem 1.7.0

New features

  • OERandom can now produce random integers through the OERandom.NextInt method.
  • Added an OEBitVector constructor which takes an OERandom for generating random bit strings. Added OEBitVector.operator method so OEBitVector can be used in STL containers.
  • Added OEBoundedBuffer and OEProtectedBuffer objects useful for communicating between threads in multi-threaded applications.
  • Added OEWallTimer, because OEStopwatch actually reports CPU time: when multiple threads are in use, it reports the summed CPU time of all threads. Also added OECycleTimer for high-precision timing based on clock cycles. For example, on x86 this uses the rdtsc instruction and is thus susceptible to its trade-offs in comparison to using CPU time from OEStopwatch.
  • Added a convenience constructor to OEInterface for the most common use case, where a command line needs to be parsed relative to a particular interface definition.
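
For readers unfamiliar with the pattern behind a bounded buffer, here is a minimal sketch built on the C++ standard library. It is not the OEBoundedBuffer implementation, and the class and method names (BoundedBuffer, put, get, size) are assumptions chosen for the example; it only shows the usual mutex plus two condition variables arrangement for handing items between threads.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Fixed-capacity FIFO queue for passing items between threads.
// put() blocks while the buffer is full; get() blocks while it is empty.
template <typename T>
class BoundedBuffer {
public:
    explicit BoundedBuffer(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    void put(T item) {
        std::unique_lock<std::mutex> lock(mutex_);
        notFull_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push_back(std::move(item));
        notEmpty_.notify_one();
    }

    T get() {
        std::unique_lock<std::mutex> lock(mutex_);
        notEmpty_.wait(lock, [this] { return !items_.empty(); });
        T item = std::move(items_.front());
        items_.pop_front();
        notFull_.notify_one();
        return item;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    const std::size_t capacity_;
    mutable std::mutex mutex_;
    std::condition_variable notFull_;
    std::condition_variable notEmpty_;
    std::deque<T> items_;
};
```

A producer thread would call put() and a consumer thread get(); the capacity limit keeps a fast producer from running arbitrarily far ahead of a slow consumer.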

Major bug fixes

  • Deprecated OESetThreadSafe as it only gave the illusion of thread safety to the OEChem toolkits. Use OESetMemPoolMode instead as outlined in the Memory Management section. Most users should just ignore this issue altogether, as the defaults are sufficient (and have been optimized in 1.7.0). The user should only consider calling OESetMemPoolMode when passing OEChem objects between threads.
  • Fixed regression where OEBitVector.FromHexString would no longer recognize hex strings in lowercase. It should also be noted that OEBitVector.ToHexString encodes the fractional length of the last 4 bits as the last character.
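
The lowercase regression above concerns case-insensitive hex parsing. The sketch below shows that idea in isolation. It is not OEBitVector.FromHexString and does not reproduce the OEBitVector string format (which, as noted, also encodes length information in the final character); the function names and the bit ordering are assumptions made for the example.

```cpp
#include <string>
#include <vector>

// Decode one hexadecimal digit, accepting both upper and lower case.
// Returns -1 for any other character.
int hexDigitValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Expand a hex string into bits, most significant bit of each digit first.
// Returns false (leaving `bits` empty) if a non-hex character is found.
bool hexToBits(const std::string& hex, std::vector<bool>& bits) {
    bits.clear();
    for (char c : hex) {
        const int v = hexDigitValue(c);
        if (v < 0) {
            bits.clear();
            return false;
        }
        for (int shift = 3; shift >= 0; --shift) {
            bits.push_back(((v >> shift) & 1) != 0);
        }
    }
    return true;
}
```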

Minor bug fixes

  • Added OEErrorHandler.Debug as another output option to match OEErrorLevel_Debug.
  • OEErrorHandler now properly does nothing when OEErrorLevel_Quiet is passed.
  • Added OEUnaryTrue and OEBinaryTrue to match their inverses that already existed.

OEPlatform 1.7.0

New features

  • Added OELock class to do scoped locking around OEMutex objects. This ensures mutexes are released in the event of stack unwinding, such as when exceptions are thrown.
  • Added the following cross-platform threading primitives: OEThread, OEThreadLocal, OECondition, and OEOnce.
  • Added OEGetTimeOfDay, which wraps gettimeofday on POSIX systems and provides our own implementation on Windows, where gettimeofday does not exist.
  • Added OEGetNumProcessors to return the number of cores available on the system.
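
The scoped-locking idea behind OELock can be sketched with standard C++ types. In this example std::mutex stands in for OEMutex, and ScopedLock and updateOrThrow are hypothetical names; the OELock interface itself is not shown here. The point is that the destructor releases the mutex even when an exception unwinds the stack.

```cpp
#include <mutex>
#include <stdexcept>

// Minimal scoped lock: acquires in the constructor, releases in the destructor.
class ScopedLock {
public:
    explicit ScopedLock(std::mutex& m) : mutex_(m) { mutex_.lock(); }
    ~ScopedLock() { mutex_.unlock(); }
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;
private:
    std::mutex& mutex_;
};

// Increments the counter under the lock, or throws while holding it.
// Either way the guard's destructor releases the mutex on scope exit.
void updateOrThrow(std::mutex& m, int& counter, bool fail) {
    ScopedLock guard(m);
    if (fail) throw std::runtime_error("update failed");
    ++counter;
}
```

If the mutex were locked and unlocked by hand, the throwing path would leave it held and the next caller would block forever.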

Major bug fixes

  • Fixed a rare off-by-one buffer overflow in oeogzstream initialization.
  • Fixed seek and tell on files greater than 4 gigabytes on Windows.

Minor bug fixes

  • The OEMutex implementation would default to a no-op implementation when compiling with a non-GCC compiler, even when a valid mutex implementation was available through pthreads.
  • When the stream points to stdin, oeifstream now returns false when oeistream.seek is called and 0 when oeistream.size is called.
  • Corrected instances where streams were using oefpos_t for memory operations and oesize_t for file operations. The rule is that memory operations should use oesize_t (e.g. oeistream.read), and file operations should use oefpos_t (e.g. oeistream.seek).
  • Changed unsigned int arguments to oesize_t for oeisstream so that it can handle memory larger than 4 gigabytes on 64-bit machines.
  • The following functions were previously in the global scope: OEGetIPAddress, OEGetHostIdent, OEGetDomainName, and OEGetHostName. They are now located in the OEPlatform namespace.
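
The 4 gigabyte limits mentioned in the stream fixes above follow from 32-bit arithmetic. The sketch below assumes a platform where unsigned int is 32 bits wide (true of common 64-bit platforms, but not guaranteed by the language) and uses hypothetical helper names; it only demonstrates how a byte count wraps modulo 2^32 when narrowed.

```cpp
#include <cstdint>

// Narrow a byte count to 32 bits, as a 32-bit `unsigned int` argument would.
// Values of 4 GiB or more wrap around modulo 2^32.
std::uint32_t asUnsigned32(std::uint64_t byteCount) {
    return static_cast<std::uint32_t>(byteCount);
}

// True if the byte count survives a round trip through a 32-bit argument.
bool fitsIn32Bits(std::uint64_t byteCount) {
    return static_cast<std::uint64_t>(asUnsigned32(byteCount)) == byteCount;
}
```

For instance, a 5 GiB buffer size passed through a 32-bit argument arrives as 1 GiB, which is why size and offset types need to be wide enough for the operation they describe.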

Lexichem 1.9.0

  • On a benchmark of 250251 compounds in the NCI00 database, mol2nam is able to convert 234297 structures (93.62%) to names without BLAH. Of these 234297 names, nam2mol is able to convert 231566 (98.83%) back into structures.
  • This release includes a significant number of improvements to both name generation and name parsing. Several bugs have also been fixed. The name parsing conversion rate for the 71367 compound names in the 2003 Maybridge catalog is now up to 95.24%.
  • Several improvements have been made to the specification of CIP stereochemistry during name generation. For example, linking groups such as amidino, carbamimidoyl and diazenyl would previously omit E/Z descriptors if they contained a chiral double bond with specified stereochemistry. We would also fail to place some chiral prefixes such as (E)-styryl and (Z)-cinnamyl in brackets, which can lead to ambiguity when interpreting the generated name.
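
The benchmark percentages in the first item can be checked with a few lines of arithmetic. The helper below is a hypothetical convenience function, not part of Lexichem. Applying it to the quoted counts reproduces 93.62% and 98.83%; dividing the round-trip count by the full benchmark size gives an overall figure of about 92.53%, which is derived here rather than stated in the notes.

```cpp
#include <cmath>

// Percentage of `part` in `whole`, rounded to two decimal places.
double percentRounded(long part, long whole) {
    const double pct =
        100.0 * static_cast<double>(part) / static_cast<double>(whole);
    return std::round(pct * 100.0) / 100.0;
}
```

Usage: percentRounded(234297, 250251) for name generation, percentRounded(231566, 234297) for parsing the generated names, and percentRounded(231566, 250251) for the combined round trip.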

Ogham 1.7.0

  • Improvements have been made to the co-ordinate generation routines to consider stretching bonds in congested (per-substituted) chains, and to terminal atoms.
  • The generation of linear carbon atoms is now avoided, even in strained systems such as methanedisulfonic acid. Depending upon the presence of heteroatomic neighbors and/or the use of color, such linear carbons were ambiguous, potentially being interpreted as stretched bonds. Instead, the central atom remains crinkled, but the bonds to it are elongated to avoid overlaps.
  • To avoid potential ambiguities with linear carbon atoms, the rendering routines now display a label on linear carbon atoms, typically CH2. Similar functionality already existed to display a label in allenic systems.
  • Support for using an anti-aliased bitmapped font for atom labels and titles has been added to the default 8-bit image renderer, OE8BitImage. This replaces the vector fonts used previously. Text functions now use a proportionally spaced sans serif font.
  • New SetAntiAliased and GetAntiAliased methods have been added to the OE8BitImage class to control the use of (text) anti-aliasing. Although anti-aliasing is enabled by default, it can create edge artifacts when generating transparent bitmap images if the display background color doesn’t match the background color used for anti-aliasing. In such cases, SetAntiAliased(false) can be used to explicitly turn off anti-aliasing.
  • The rendering of atom labels on free (disconnected) atoms has been improved, to avoid being displaced by a hydrogen count in the label. Previously, the presence of the hydrogen count in H2O would cause picking and highlighting operations to emphasize the H instead of the O.
  • Improvements have been made to the algorithm for choosing which side of a double bond to display the second line. This improves the display of molecules like [18]annulene.
  • Many new macro-cycle ring systems have been added to the ring template dictionary, and existing entries have been updated with substitution preferences.
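
The edge artifacts mentioned for anti-aliased text on transparent images come from blending glyph edges against one background and displaying them on another. The following sketch uses a simple linear coverage blend to show the effect; the function name is hypothetical and this is an assumed model, not the OE8BitImage renderer.

```cpp
#include <cmath>

// Blend a foreground channel value over a background channel value.
// `coverage` is the fraction of the pixel covered by the glyph (0.0 to 1.0).
int blendChannel(int foreground, int background, double coverage) {
    const double v = coverage * foreground + (1.0 - coverage) * background;
    return static_cast<int>(std::lround(v));
}
```

Under this model, a black glyph edge pixel with 25% coverage blended against white gets the value 191 (light gray). Shown on a black page, that pixel becomes a pale fringe, whereas blending against the actual black background would have produced 0.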

OmegaTK 2.3.3

New features

  • New API points have been added: OETorLib.ResetTorsionLibrary, OETorLib.ClearTorsionLibrary, and OETorLib.AddTorsionRule. The torsion rules may be created by passing in a QMolBase and vector<int> of angles, or by passing in a string with the standard rule format.
  • The corresponding atom symbols, bond symbols, and torsion angles are now printed on a line under the matching SMARTS pattern.

Bug fixes

  • Rotor offset compression is now set to false by default. Coordinate changes made in the toolkits after an Omega calculation are lost if rotor offset compression is set to true. If molecules are sent to OEWriteMolecule directly after an Omega run, then rotor offset compression may safely be set to true, which will significantly reduce the file size of the output.
  • The energy window set by OEOmega.SetEnergyRange and OEOmega.SetRangeIncrement is now used consistently. Previous versions of Omega did not use this value properly for all aspects of an Omega calculation.

QuacPacTK 1.4.0

New features

  • OESetNeutralpHModel has been added. This function may be used to set a molecule to an energetically favorable ionization state for pH=7.4. This is the same pH model that was available in the Filter application. Additionally, the perception of acceptable valence states has been improved to include phosphorus as well as aromatic oxygen and sulfur with +1 formal charge.
  • OEEnumerateFormalCharges now implicitly uses the mostAro=true parameter on the OETautomerMolFunction it uses when choosing a tautomer. This provides a better reference tautomer when enumerating pKa states.
  • Aromaticity settings on molecules are now unchanged when OEAssignPartialCharges is called. Aromaticity may be temporarily changed inside the function while charges are being calculated, but the molecule will have the same aromaticity after the function as before.

ShapeTK 1.7.1

New features

  • Added new types in OEBOOrientation. All of these are designed to provide a more deterministic search over the reference molecule for those cases where the fit molecule is much smaller than the reference, for example, when trying to match a fragment into part of a reference molecule.
    • OEBOOrientation_InertialAtHeavyAtoms moves the center of mass of the fit molecule to each reference molecule heavy atom and performs 4 inertial starts at the position. This results in many more starting positions, but provides a more direct way to search over an entire reference molecule, without resorting to random starts.
    • OEBOOrientation_InertialAtColorAtoms performs a similar search as above, but just moves to the location of each reference molecule color atom.
    • OEBOOrientation_UserInertialStarts, used in conjunction with OEBestOverlay.SetUserStarts allows the user to pick specific points in space to perform the 4 inertial starts.
  • Fixed a bug when calculating Tanimoto while using a grid as the reference object.
  • Added more functions to manipulate the color atoms on a molecule. These include the ability to add color atoms one at a time (OEAddColorAtom) and the ability to get an iterator of color atoms from a molecule (OEGetColorAtoms).
  • Added a pair of functions (OEColorAtomsToString and OEStringToColorAtoms) that allow converting the color atoms of a molecule into a compressed string representation (that is attached to the molecule) and then restoring the actual color atoms from that string.
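
For context on the Tanimoto fix above, the shape Tanimoto is conventionally computed from overlap volumes as T = O_ab / (O_aa + O_bb - O_ab). The sketch below implements that standard definition with a hypothetical function name; whether ShapeTK evaluates exactly this expression when the reference is a grid is an assumption.

```cpp
// Shape Tanimoto from overlap volumes:
//   T = O_ab / (O_aa + O_bb - O_ab)
// Returns 0 when the denominator is not positive (e.g. empty shapes).
double shapeTanimoto(double selfOverlapA, double selfOverlapB, double overlapAB) {
    const double denom = selfOverlapA + selfOverlapB - overlapAB;
    if (denom <= 0.0) return 0.0;
    return overlapAB / denom;
}
```

Identical shapes give 1, shapes with no overlap give 0, and a fit molecule much smaller than the reference gives a low value even when it is fully enclosed.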

Bug fixes

  • Fixed a bug that could cause a crash when passing an empty molecule into OECalcVolume or OECalcShapeMultipoles.
  • Fixed a bug that could cause a crash when passing large molecules to OECalcVolume.

SpicoliTK 1.0.2

New features

  • The user is no longer required to call OEInitSurfaceHandlers in order to attach surfaces to molecules and then write them out to OEB. This occurs at library link time now.

Minor bug fixes

  • OESurface.operator= now copies the OEBase data.
  • OEMakeCliqueSurface and OESurfaceCropToClique now preserve OEBase data.

SzybkiTK 1.3.4

New features

  • Calculation of the constant potential terms at the end of adapted optimization can optionally be eliminated by calling the function void SetCalculateFrozenTerms(bool). This feature reduces memory and CPU usage in the case of partial optimization of large molecules.

Bug fixes

  • The member function ClearFixAtoms() did not properly clear previously fixed atoms. As a result, some atoms remained fixed even after it was called. This bug has been fixed.
  • A bug which occasionally caused errors in fixing a subset of atoms was fixed.

ZapTK 2.1.1

Bug fixes

  • OEZap.GetMolecule and OEZap.GetFocusTarget now return a pointer instead of a reference. This is a breaking change and any code that calls these methods will have to be updated. These methods will return a null pointer if an acceptable molecule has not been set for them.
  • A warning has been added for molecules passed into ZapTK that do not return 3 when GetDimension is called. The dimension must be 3 for all molecules passed into ZapTK. The dimension is set automatically when using OEReadMolecule but a warning will be thrown and the molecule will not be accepted by ZapTK if the molecule has been made from scratch in the toolkits and SetDimension(3) has not been called on it.
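
Code affected by the reference-to-pointer change needs a null check before use. The sketch below shows that caller-side pattern with hypothetical stand-in types (Molecule, Solver, titleOrEmpty); it is not the OEZap API, and it mimics the dimension requirement only to make the null case reachable.

```cpp
#include <string>

// Hypothetical types; they only mimic the shape of the change.
struct Molecule {
    std::string title;
    int dimension;
};

class Solver {
public:
    Solver() : stored_(), hasMol_(false) {}

    // Accept only 3D molecules, mirroring the dimension requirement.
    bool SetMolecule(const Molecule& mol) {
        if (mol.dimension != 3) return false;
        stored_ = mol;
        hasMol_ = true;
        return true;
    }

    // Pointer return: null until an acceptable molecule has been set.
    const Molecule* GetMolecule() const { return hasMol_ ? &stored_ : nullptr; }

private:
    Molecule stored_;
    bool hasMol_;
};

// Caller-side pattern: check for null before dereferencing.
std::string titleOrEmpty(const Solver& s) {
    const Molecule* mol = s.GetMolecule();
    return mol ? mol->title : std::string();
}
```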

Minor bug fixes

  • Fixed a memory bug related to extremely large files on Windows.