This is a simple command-line utility that takes an input database file and divides it into smaller pieces of similar size. Each piece can then be used as the dbase file in a separate ROCS run. This divide-and-conquer approach is an alternative way to run a single dbase over multiple CPUs without using PVM.
The previous version of chunker only operated on .bin files.
This new version will chunk any of the ROCS molecule file formats (except
.lst and .rocsdb). Each output file has the same format as the
input file, except that old OEBinary (.bin) input is converted to
OEBinary v2 (.oeb).
Note: Only one of -nchunks or -chunksize can be used.
To break input.oeb.gz into 5 chunks, each with the same number of
molecules:
prompt> chunker -in input.oeb.gz -base bar -nchunks 5
would create
bar0000001.oeb.gz bar0000002.oeb.gz bar0000003.oeb.gz bar0000004.oeb.gz bar0000005.oeb.gz
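The file names above appear to follow a simple pattern: the -base value, a 1-based index zero-padded to seven digits, then the input file's extension. When scripting the follow-up ROCS runs it can help to generate the expected names ahead of time. The Python sketch below does that; the helper name chunk_names is hypothetical, and the pattern is inferred only from this example, not from chunker itself.

```python
def chunk_names(base, ext, nchunks):
    """Return the chunk file names expected for a given -base value.

    Assumption: a 1-based, seven-digit, zero-padded index sits between
    the base name and the extension, as in the example output above.
    """
    return [f"{base}{i:07d}{ext}" for i in range(1, nchunks + 1)]


# Names expected for: chunker -in input.oeb.gz -base bar -nchunks 5
for name in chunk_names("bar", ".oeb.gz", 5):
    print(name)
```

Each name in the list can then be passed as the dbase file of its own ROCS run.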
To break input.oeb.gz into chunks, each with 1000 multi-conformer
molecules:
prompt> chunker -in input.oeb.gz -base foo -chunksize 1000
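With -chunksize, the number of output files depends on how many molecules the input holds. As a rough sketch, and assuming the final chunk simply receives whatever molecules are left over (this is an assumption, not documented behavior), the count can be estimated as follows. The helper name expected_chunks and the molecule count of 4500 are hypothetical.

```python
import math


def expected_chunks(total_molecules, chunksize):
    """Estimate how many chunk files a -chunksize run would produce.

    Assumption: every chunk holds `chunksize` molecules except possibly
    the last one, which holds the remainder.
    """
    if chunksize <= 0:
        raise ValueError("chunksize must be positive")
    return math.ceil(total_molecules / chunksize)


# A hypothetical input of 4500 molecules split with -chunksize 1000
print(expected_chunks(4500, 1000))
```

Under that assumption, 4500 molecules would give four full chunks and a fifth holding the remaining 500.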