The most common use of ROCS is overlaying a large collection of
molecules onto a query (reference) molecule. For the purposes of this
document, we'll call this large file the dbase (fit) file. The most common
format for the dbase file is a multi-conformer OEBinary file created
by OpenEye's OMEGA program; however, this file can be in any of several
3D formats, including SDF, MOL2 and PDB. Note that ROCS
determines the input file format from the file extension: .sdf
or .mol for SDF, .mol2 for MOL2, and
.pdb or .ent for PDB. Gzip-compressed files of these same
formats are allowed as well. For example, ROCS will interpret infile.sdf.gz
as a gzip'ed SDF file. Note that even though all these formats are supported,
using SDF or MOL2 can result in a loss of speed due to the huge
I/O penalty of these formats.
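The extension rules above can be illustrated with a short Python sketch. This is not ROCS code: guess_format is a hypothetical helper, the .oeb extension for OEBinary is inferred from the example list file in this section, and matching is assumed to be case-sensitive.

```python
import os

# Extension-to-format table taken from the text above. The ".oeb" entry
# for OEBinary is an assumption based on the example list file.
EXTENSION_FORMATS = {
    ".oeb": "OEBinary",
    ".sdf": "SDF",
    ".mol": "SDF",
    ".mol2": "MOL2",
    ".pdb": "PDB",
    ".ent": "PDB",
}


def guess_format(filename):
    """Return (format_name, is_gzipped) for a dbase file name.

    Hypothetical helper for illustration only.
    """
    gzipped = filename.endswith(".gz")
    # Strip a trailing ".gz" so the inner extension decides the format.
    inner = filename[:-3] if gzipped else filename
    ext = os.path.splitext(inner)[1]
    if ext not in EXTENSION_FORMATS:
        raise ValueError(f"unrecognized molecule file extension: {filename!r}")
    return EXTENSION_FORMATS[ext], gzipped
```

With this sketch, guess_format("infile.sdf.gz") yields the SDF format with the gzip flag set, matching the behavior described above.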
ROCS has no provision for conversion of 1D/2D molecules to 3D. The input file must already be 3D. More importantly, ROCS will interpret conformers in the input file as part of a single multi-conformer molecule as long as they:
This conformer detection can be turned off with the
-scdbase command-line switch. With the -scdbase switch on, ROCS
will not attempt to combine multiple conformers into a single
multi-conformer molecule.
A new molecule file format, intended specifically for ROCS on large PVM
clusters, is the .rocsdb format. See Section 4.1 for when to use
this file and how to create it.
One other file type is allowed as the dbase file. A file name ending
in .list or .lst is assumed to be a list of actual
molecule files, one per line. ROCS will then open each in turn and
treat the entire collection as a single dbase file. Note that the
conformer detection/concatenation code above will not span the gaps
between these separate files.
Here is an example list file:
part1.oeb.gz
part2.oeb.gz
part3.oeb.gz
hits.mol2
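As a rough illustration of how a list file expands into the individual molecule files, consider the following Python sketch. The function expand_dbase is hypothetical and is not part of ROCS; skipping blank lines is an assumption rather than documented behavior.

```python
def expand_dbase(path):
    """Return the molecule files named by `path` (hypothetical helper).

    A name ending in .list or .lst is read as one file name per line.
    Any other name is returned unchanged as a single-element list.
    """
    if not path.endswith((".list", ".lst")):
        return [path]
    with open(path) as handle:
        # Assumption: blank lines are ignored.
        return [line.strip() for line in handle if line.strip()]
```

Each returned file would then be opened in turn, with conformer grouping restarting at every file boundary, as noted above.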
The second required input for a ROCS run is a file containing one or more molecules to be used as the query. ROCS will loop over molecules read in from the dbase file and attempt to overlay each of them against the query. In order to be consistent with other OpenEye software, this query molecule can also be referred to as the reference molecule.
Normally, ROCS treats each molecule in the query file as a single conformer molecule. For each molecule in the query file, ROCS will run a complete loop over the dbase molecules and write out a hits structure file and a report file, depending on the values of other command line switches described below.
Alternatively, ROCS can read queries as multi-conformer molecules by
adding the -mcquery command line switch. In this mode, ROCS
uses the same rules as described above to determine whether two consecutive
molecules are actually conformers of the same molecule. For each
multi-conformer molecule in the query file, ROCS will loop over the
dbase molecules' conformers, comparing them to all query conformers. By
default, ROCS will only return the single best overlay of this NxM set
of comparisons. More than one can be returned by using the
-maxconfs command line switch.
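The selection of the best overlays from the NxM set can be sketched in Python as follows. This is only an illustration under stated assumptions: the score matrix is hypothetical, higher scores are assumed to be better, and best_overlays and its maxconfs parameter merely mimic the role of the -maxconfs switch rather than reproduce how ROCS ranks overlays internally.

```python
import heapq


def best_overlays(scores, maxconfs=1):
    """Return the top `maxconfs` (score, query_conf, dbase_conf) triples.

    `scores[i][j]` is a hypothetical overlay score (higher is better)
    between query conformer i and dbase conformer j.
    """
    triples = (
        (score, i, j)
        for i, row in enumerate(scores)
        for j, score in enumerate(row)
    )
    # nlargest returns the triples sorted from best to worst score.
    return heapq.nlargest(maxconfs, triples)


# Two query conformers (N = 2) against three dbase conformers (M = 3).
example_scores = [
    [0.61, 0.72, 0.55],
    [0.80, 0.64, 0.78],
]
print(best_overlays(example_scores))              # [(0.8, 1, 0)]
print(best_overlays(example_scores, maxconfs=2))  # [(0.8, 1, 0), (0.78, 1, 2)]
```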
ROCS can also use a grid instead of a molecule as a query (reference). These grids must be in GRASP, OpenEye, OpenEye ASCII Grid (.agd), CCP4, or XPLOR grid format and can be created with the OpenEye Grid toolkit or with a graphical application like VIDA or GRASP. Certain ROCS features are not available when using a grid query. For example, the color force field features are not available with a grid query.