34.7 OEChem 1.4.0
OEChem 1.4.0 is a major new feature release. OpenEye is introducing OEBio, a
new programming library extending OEChem's convenience in handling biopolymers.
In this initial release, OEBio's API is small but useful. Over the life of the
1.4.x OEChem release series the OEBio API will grow. The purpose of OEBio is
not to cover Bioinformatics, but to extend OEChem's strong cheminformatics
foundation to conveniently support protein modeling.
The source code and examples in /openeye/examples/oechem have long been caught
in a conflict. They served both as very useful tools and as didactic coding
examples. To fulfill their role as tools, they needed good command-line
interfaces and error reporting. Unfortunately, these features lead to more
complex code. To fulfill their role as code examples, these programs need to be
as simple as possible, highlighting one or two programming principles. To
better serve both purposes, the example programs have now been split into
/openeye/utilities and /openeye/examples. The former contains programs with
more complex code and better interfaces; the latter contains simple OEChem code
examples. In addition, nine new example programs have been included to
demonstrate common uses of the OEBio API.
In addition to OEBio, the 1.4.0 release includes many new features and bug
fixes in the OEChem, OESystem and OEPlatform libraries.
- Improved binary data handling in streams.
- Significant improvements for user convenience in licensing code will
allow future versions of OpenEye applications to manage licensing failures in a
friendly manner.
- New pipe streams (oepstream) added to the beta public API.
- Fixed bug in cross-platform directory searching and checking for files on
a file system.
- Fixed bug in oeigzstream::size that reported incorrect sizes in some
instances.
- Added the ability to detect moved home directories under Windows.
- Fixed bug that prevented reading the final molecule in a file and then
seeking to other positions in the file.
- Fixed a 64-bit stream seek and read bug that could cause memory overflows
and crashes.
- Moved superpos and tensor2mat API points from OEChem to OESystem. Added
deprecated support for their use in OEChem.
- Added ability to assign an OEIterBase<foo>* to an OEIter<const foo> object.
This allows much wider use of iterators of const objects.
- Made OEIter::Sort a stable sort.
- Additional physical constants added to OEMath::OEConst.
- Added the ability to parse OEInterface parameter files without use of
command-line parsing.
- New ELEMENT and FORMALCHARGE flavors for pdb writer. ELEMENT adds the atomic
symbol to columns 77-78 and FORMALCHARGE adds non-zero formal charges in
columns 79-80.
- Extended the ExtBonds option from the ``.smi'' writer to the ``.can'' and
``.ism'' writers.
- Fixed OEGrid and OEMultiGrid constructor bug that could cause no
memory to be allocated for the grid elements.
- Corrected behavior of OEGrid::Clear to clear the OEBase data, remove the
title and reinitialize all the elements of the grid.
- Fixed rotation bug in inertial-frame alignment.
- Fixed bug in the atom index into coordinates used while calculating the
center of mass.
- Fixed bug in the calculations performed by the OEMultiGrid::SetSpacing and
OEMultiGrid::SetMid functions.
- Fixed OEInterface category name bug, !KEYLESS bug and unterminated
category bug.
- Protected the OEIter::Sort function from NaN (not a number) members.
- New support for highly compact ``rotor-offset compressed'' oeb files.
- Added support for MDL ISIS Sketch file format with the ``.skc'' suffix.
- Added support for writing hydrogens that are required for specifying cis-trans stereo.
- Added support for ``[Ds]'' and ``[Rg]'' in SMILES and SMARTS.
- Added support for writing high-atomic number atoms in SMILES using
``[#123]'' notation.
- New OEWriteConstMolecule function class to support high-level writing of
const molecules. Introduced return codes for the high-level writers that
reflect that some molecules are inherently not supported by certain file
formats (e.g. >999 atoms in .sdf).
- Added an OEOFlavor::MOL2::Substructure high-level writer flavor to force an
``@TRIPOS<SUBSTRUCTURE>'' idiom in the .mol2 file.
- New OEHasStereoHydrogens(OEAtomBase *) function that determines if an atom
has a proton that is required to specify stereochemistry.
- Added a retainStereo argument (default false) to OESuppressHydrogens that,
when set, keeps hydrogens indicated by the OEHasStereoHydrogens function.
- Added OEMatchBase::Clear function.
- Dramatically improved efficiency of OEMCMolBase::DeleteConf for deleting
large numbers of conformers in order by improving the worst-case behavior of
the algorithm.
- Allow the SD file reader to handle a blank line between the ``M END'' and
the ``$$$$'' lines.
- Added convenience functions for getting and setting the MDL parity on
atoms.
- Added new bitmask initialization parameters to OEInitDefaultHandler that
allow easy specification of which handlers to initialize.
- New support for ``h'', ``d'', ``t'', ``[T]'' and ``[t]'' non-standard
SMILES representations.
- Improved support for multiple NMR models in PDB files by reading,
retaining and writing model number.
- Added fully supported OEPDBData and OEPDBDataPair classes as well as the
necessary functions to store and retrieve them from molecules.
- Three new convenience functions for clearing tag data: OEClearTagData,
OEClearSDData and OEClearPDBData.
- Added support for determining whether the library is properly licensed with
the OEChemIsLicensed function.
- Added OEResidueHydrogens(OEAtomBase *) function that will rename hydrogens
on a heavy atom to their proper PDB atom names.
- Added PDBData readers and writers to OEBinary file handlers.
- Added defensive code to OEMolBase::DeleteAtom and OEMolBase::DeleteBond to
confirm that the atom or bond is owned by that specific molecule.
- Fixed rotation bug in inertial frame alignment.
- Converted inconsistent ``/'' and ``\'' into a warning rather than an error,
allowing the molecule to be parsed in a racemic fashion.
- Added an upper bound to the degree of the atoms at either end of a
cis-trans chiral double bond.
- Added defensive code to prevent creation of atoms with atomic number
greater than 255.
- Improved perception of non-aromatic exo double-bonds. This corrects a
problem perceiving the progesterone in pdb1a28.
- Improved perception of exo-cyclic double bonds to sulfur. This improves the
connectivity perception in 1hnv, 1rev, 1usn, 1uwb, 2usn, 3usn and 1zxv.
- Improved the bond order perception of nitroso, oxime, azide, and
arylhydroxylamine functional groups.
- Improved bond order perception of clashed structures by allowing
hydrogens to only bond to their nearest heavy atoms.
- Prevented alternate conformation representations from being bonded to one
another during bond perception.
- Made Up/Down choice for the first stereo bond in each resonance system
canonical for writing isomeric smiles files.
- Made OECanonicalOrderBonds also order the bonds obtained with the
OEAtomBase::GetBonds function call.
- Fixed bug in binary search for atomic number ``0'' used in
OEIsCommonIsotope, OEGetAverageWeight and OEGetIsotopicWeight.
- Fixed the high-level pdb writer to preserve residue information found on
the molecule.
- Corrected OEIsReadable to return false for the MOPAC file format.
- Added MOPAC flavors to the high-level molecule writers.
- Changed the hybridization assignment of negatively charged resonant
nitrogens such as *S(=O)(=O)[N-]C(=O)*.
- Fixed bug in OESet3DHydrogenGeometry that could use a hydrogen's own
coordinates as a reference for determining its geometry.
- Fixed ring perception bug in OEMCSMaxAtomsCompleteCycles.
- Eliminated the redundancy between OEMDLSetBondStereo and OE3DToBondStereo by
allowing OE3DToBondStereo to take an optional bond mask and work on 2D as well
as 3D molecules.
- Corrected a bug in the OEChem interpretation of MDL wedge and hash bonds.
In MDL connection tables, wedges and hashes only imply a specified stereo-center
at the thin end (i.e. OEBondBase::GetBgn). This has been confirmed by
comparing the wedge/hash bonds with the atom stereo parity bit in MDL ISIS
output (including large vendor databases such as the entire Asinex 2005
collection).
- Fixed MDL reader bug where unrecognized atomic symbols would ignore
subsequent fields in the atom block such as stereo parity, reaction role and
valence.
- Added copy constructors and assignment operators to OEMiniMols,
OEMiniBonds and OEMiniAtoms.
- Fixed a sign error in OESetAngle.
- Added a length==0.0 check for OESetDistance and OESetAngle.
- Fixed oemolistream::seek and oemolistream::tell to take into account any
cached molecules that may exist in the stream.
- Fixed low-level MDL reader to accept multiple SD data fields with the same
tag. Note: It is not clear from the SD file specification if this is a valid
SD format.
- Added OESeqAlignment class with associated features for pairwise sequence
alignment (including PAM250, BLOSUM62 and GONNET), writing an alignment to an
oeostream and carrying out RMSD alignment between two proteins based on the
sequence alignment.
- Simple methods for accessing and manipulating the torsion angles of
biopolymers.
- Introduced classes that allow a hierarchical view of the Chains, Fragments
and Residues of a protein while maintaining the efficient OEChem internal data
structures.
- Added facility for swapping the terminal atoms of residues that are
commonly ambiguous in protein crystal structures (e.g. terminal N,O of
ASN).
- Added nine new example programs demonstrating the use of the new OEBio
API points. These include: backbone, cischeck, makealpha, phipsi, rescount,
reshist, seqalign, subsetres and swapaieres.
These examples show the best features of OEChem. Though most are less than 100
lines of simple code, they demonstrate protein-protein sequence alignment, 2D
and 3D structure manipulation, residue perception, robust multi-format I/O, STL
integration, canonicalization, chirality perception and manipulation and many
other complex cheminformatics tasks. While the main loop of each program is
often only 30 lines long, it brings to bear thousands of lines of OEChem code
and years of cumulative cheminformatics experience to easily combine 2D and 3D
structure analysis and manipulation.
- backbone.cpp: Code to show the use of functors to select and write the
backbone atoms of a protein.
- cischeck.cpp: Demonstrates how to loop over residues and check the
omega torsion for cis amides.
- makealpha.cpp: A code example of protein structure manipulation. This
example modifies any protein into an alpha-helical structure with extended
side-chains.
- phipsi.cpp: Simple code to report the phi-psi angles of a protein.
- rescount.cpp: Demonstrates an easy way to loop over the residues of a
protein and query their information.
- reshist.cpp: Demonstrates an easy way to loop over a protein's residues
and integrate the acquired data into an STL ``dictionary'' class.
- seqalign.cpp: This is perhaps the most complex program of the examples.
It carries out protein-protein sequence alignment, alignment evaluation and
printing as well as 3D structural alignment.
- subsetres.cpp: Simple code example of how to pull a specific residue out
of a protein using its common name (e.g. ARG B 52).
- swapaieres.cpp: Demonstrates how a user can select a residue using its
common name (e.g. GLN 252) and swap the ambiguous iso-electronic atoms.
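The torsion-based examples above (phipsi.cpp and cischeck.cpp) rest on one
geometric primitive: the signed torsion angle defined by four atoms. The
following is a minimal, library-independent sketch of that calculation in plain
C++. It does not use or reproduce the OEBio API. The names Vec3, TorsionDegrees
and IsCisAmide, and the 30 degree default tolerance, are assumptions made for
illustration only. For the omega torsion, the four points would be CA(i), C(i),
N(i+1) and CA(i+1).

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical names throughout; none of this is OEChem or OEBio API.
using Vec3 = std::array<double, 3>;

namespace {
Vec3 Sub(const Vec3& a, const Vec3& b) {
    return {a[0] - b[0], a[1] - b[1], a[2] - b[2]};
}
Vec3 Cross(const Vec3& a, const Vec3& b) {
    return {a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]};
}
double Dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
}  // namespace

// Signed torsion angle in degrees for the four points a-b-c-d.
// 0 corresponds to a cis (eclipsed) arrangement, +/-180 to trans.
double TorsionDegrees(const Vec3& a, const Vec3& b, const Vec3& c, const Vec3& d) {
    const Vec3 b1 = Sub(b, a);
    const Vec3 b2 = Sub(c, b);
    const Vec3 b3 = Sub(d, c);
    const double b2len = std::sqrt(Dot(b2, b2));
    assert(b2len > 0.0);  // the central bond must have non-zero length
    const Vec3 n1 = Cross(b1, b2);  // normal of the plane a-b-c
    const Vec3 n2 = Cross(b2, b3);  // normal of the plane b-c-d
    const double y = b2len * Dot(b1, n2);
    const double x = Dot(n1, n2);
    const double kPi = std::acos(-1.0);
    return std::atan2(y, x) * 180.0 / kPi;
}

// Assumption: an amide is called cis when |omega| is within the tolerance of 0.
bool IsCisAmide(double omegaDegrees, double toleranceDegrees = 30.0) {
    return std::fabs(omegaDegrees) <= toleranceDegrees;
}
```

Given coordinates for the four backbone atoms, calling
IsCisAmide(TorsionDegrees(ca_i, c_i, n_next, ca_next)) flags a cis peptide bond
under the stated tolerance. A stricter or looser cutoff is a judgment call.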
- Split the programs previously in the examples directory into examples and
utilities. The utilities directory will contain programs or versions of
programs that may be useful and convenient for modelers to carry out common
tasks. The examples directory will contain programs that may also be useful,
but their primary purpose will be to provide didactic code examples of how to
program common tasks using the OEChem library.
- Numerous additions to the API and Theory manuals.
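The ELEMENT and FORMALCHARGE pdb writer flavors listed earlier place the atomic
symbol in columns 77-78 and non-zero formal charges in columns 79-80. The
sketch below shows that column layout on a single record using plain C++
strings. It is not the OEChem writer. AddElementAndCharge is a hypothetical
helper, and the right-justified symbol and the digit-then-sign charge form
(for example ``1+'') are assumptions taken from common PDB file conventions.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper, not part of OEChem: pads an ATOM/HETATM record to 80
// columns, then writes the element symbol right-justified in columns 77-78
// and, for non-zero charges only, the formal charge in columns 79-80.
std::string AddElementAndCharge(std::string record,
                                const std::string& element,
                                int formalCharge) {
    assert(!element.empty() && element.size() <= 2);  // one- or two-letter symbol
    assert(std::abs(formalCharge) <= 9);              // one digit fits in column 79

    record.resize(80, ' ');  // pad with blanks (or truncate) to exactly 80 columns

    // PDB columns are 1-based; string indices are 0-based.
    record[76] = (element.size() == 2) ? element[0] : ' ';
    record[77] = element.back();

    // Blank the charge columns, then fill them only for non-zero charges.
    record[78] = ' ';
    record[79] = ' ';
    if (formalCharge != 0) {
        record[78] = static_cast<char>('0' + std::abs(formalCharge));
        record[79] = (formalCharge > 0) ? '+' : '-';
    }
    return record;
}
```

A neutral atom leaves columns 79-80 blank, which matches the statement that
only non-zero formal charges are written.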