4.4 Compressed Molecule Input and Output

For any of the molecular file formats supported by OEChem, it is often convenient to read and write compressed files or strings. Molecule streams support gzipped input and output via the zlib library. A ".gz" suffix on any filename used to open a stream is recognized, and the stream is then read or written in compressed format. This mechanism does not interfere with format perception. For instance, "fn.sdf.gz" is recognized as a gzipped file in MDL's SD format.
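The suffix handling described above can be pictured with a small sketch. The class GzSuffixSketch and its methods are hypothetical names introduced only for illustration; they are not part of OEChem and do not reproduce its implementation. The case-insensitive match is an assumption of the sketch. It shows only the idea that the ".gz" suffix is set aside before the format extension is examined.

```java
import java.util.Locale;

// Illustrative only: a hypothetical helper, not part of the OEChem API.
final class GzSuffixSketch {

    /** True when the filename ends in ".gz" (case-insensitive; an assumption of this sketch). */
    static boolean isGzipped(String filename) {
        return filename.toLowerCase(Locale.ROOT).endsWith(".gz");
    }

    /** Format extension after dropping an optional ".gz" suffix, or "" if there is none. */
    static String formatExtension(String filename) {
        String name = isGzipped(filename)
            ? filename.substring(0, filename.length() - 3)
            : filename;
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return "";
        }
        return name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        for (String fn : new String[] {"fn.sdf.gz", "fn.sdf", "drugs.oeb.gz"}) {
            System.out.println(fn + " -> gzipped=" + isGzipped(fn)
                + ", format=" + formatExtension(fn));
        }
    }
}
```

In this sketch "fn.sdf.gz" and "fn.sdf" both yield the format extension "sdf"; only the compression flag differs.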

The following example demonstrates the use of compressed input and output:

/**************************************************************
 * Copyright 2005, OpenEye Scientific Software, Inc.
 *************************************************************/
import openeye.oechem.*;

public class CompressedFileIO {
  public static void main(String argv[]) {
    oemolistream ifs = new oemolistream();
    oemolostream ofs = new oemolostream();

    // The ".gz" suffix selects gzip compression; the format is
    // still perceived from the ".sdf" and ".oeb" extensions.
    if (ifs.open("drugs.sdf.gz") && ofs.open("drugs.oeb.gz")) {
      OEGraphMol mol = new OEGraphMol();
      // Copy every molecule from the input stream to the output stream.
      while (oechem.OEReadMolecule(ifs, mol)) {
        oechem.OEWriteMolecule(ofs, mol);
      }
    }
    else {
      System.err.println("Unable to open input or output file.");
    }
    // Close both streams so that buffered compressed output is flushed.
    ofs.close();
    ifs.close();
  }
}

Listing 4.5: Reading from and writing to compressed files

The example above converts all of the molecules in a gzipped SD format file into a gzipped OEBinary version 2 format file.
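Because the compressed files are ordinary gzip streams, a gzipped SD file can also be inspected without OEChem. The following is a minimal sketch that uses only the standard java.util.zip classes, assuming the SD convention that each record ends with a "$$$$" line. The class GzipRecordCountSketch and its methods are hypothetical names, and the sample text is placeholder data rather than a valid molecule record. The approach applies to text formats such as SD, not to the binary OEBinary output.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustrative only: standard-library gzip handling, independent of OEChem.
final class GzipRecordCountSketch {

    /** Compresses text into an in-memory gzip stream. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buffer.toByteArray();
    }

    /** Counts lines starting with "$$$$" (the SD record delimiter) in a gzipped stream. */
    static int countSdRecords(InputStream compressed) throws IOException {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(compressed), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith("$$$$")) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder text standing in for two SD records.
        String placeholder = "record one\n$$$$\nrecord two\n$$$$\n";
        byte[] compressed = gzip(placeholder);
        System.out.println("records: "
            + countSdRecords(new ByteArrayInputStream(compressed)));
    }
}
```

To check a file on disk, a java.io.FileInputStream opened on a name such as "drugs.sdf.gz" could be passed to countSdRecords in place of the in-memory stream.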