8.31 oemolistream

class oemolistream : public oemolstreambase

The OEChem oemolistream class provides a stream-like abstraction for reading molecules from files, strings, or standard input (std::cin). An oemolistream maintains the format and flavor used for reading molecules from the stream. It also manages the conversion between multi-conformer and single-conformer molecules when the molecule being read into does not match the file format (that is, when a multi-conformer file format is read into a single-conformer molecule, or a single-conformer file format is read into a multi-conformer molecule). An oemolistream can also decompress gzip files while reading.

8.31.1 Constructors

oemolistream()
explicit oemolistream(const char *fname)
explicit oemolistream(const std::string &fname)
explicit oemolistream(OEPlatform::oeistream *istr, bool owned = true)

The oemolistream class supports several forms of constructor. The form without arguments creates a new oemolistream connected to the process's standard input (std::cin). The forms that take a single string argument (either a const char* or a const std::string&) open the file with the given filename. The final form, which takes an OEPlatform::oeistream, creates a new oemolistream from an existing oeistream. Its optional second argument indicates whether the new oemolistream "owns" the given oeistream and is therefore responsible for closing and destroying it when the oemolistream itself is closed or destroyed.
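The ownership flag can be illustrated with a minimal standard-library sketch. The code below is not OEChem code: StreamHolder, TrackedStream and demo_ownership are hypothetical names, and std::istream stands in for OEPlatform::oeistream. It only shows the general idea of a wrapper that deletes the underlying stream when it was told it owns it.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical wrapper (not part of OEChem): holds a stream pointer and
// destroys it on close only when it was constructed with owned == true.
class StreamHolder {
public:
    explicit StreamHolder(std::istream *istr, bool owned = true)
        : istr_(istr), owned_(owned) {}
    ~StreamHolder() { close(); }
    StreamHolder(const StreamHolder &) = delete;
    StreamHolder &operator=(const StreamHolder &) = delete;

    // Safe to call more than once: later calls find nothing left to release.
    void close() {
        if (owned_) delete istr_;
        istr_ = nullptr;
        owned_ = false;
    }
    bool is_open() const { return istr_ != nullptr; }

private:
    std::istream *istr_;
    bool owned_;
};

// Helper for the demonstration: records when it has been destroyed.
struct TrackedStream : std::istringstream {
    explicit TrackedStream(bool &destroyed) : destroyed_(destroyed) {}
    ~TrackedStream() override { destroyed_ = true; }
    bool &destroyed_;
};

void demo_ownership() {
    bool destroyed = false;
    {
        StreamHolder holder(new TrackedStream(destroyed));  // owned by default
        assert(holder.is_open());
    }
    assert(destroyed);  // the holder deleted the stream it owned
}
```

When owned is false, the caller remains responsible for destroying the stream after the wrapper is done with it.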

To associate a file or a stream with an oemolistream after it has been created, see the oemolistream::open method.

8.31.2 operator >>

bool operator >> (OEMolBase &mol)
bool operator >> (OEQMolBase &mol)
bool operator >> (OEMCMolBase &mol)
bool operator >> (OEMol &mol)
bool operator >> (OEGraphMol &mol)
bool operator >> (OEQMol &mol)

Read a molecule from an input oemolstream. The molecule is read from the input oemolstream in the file format currently associated with that oemolstream. This method is equivalent to the OEReadMolecule function. The return value indicates whether the read operation was successful.

This (high-level) method automatically clears the molecule before reading and skips empty or invalid molecules in the input stream. By default, it also calls OEFindRingAtomsAndBonds and OEAssignAromaticFlags to assign the "in ring" and "aromatic" properties of atoms and bonds as a convenience to the user. OEChem also contains low-level file I/O APIs that allow finer control over the variants of molecular file formats read and written. Access to these variants is also available via the SetFlavor method.

8.31.3 close

void close()

Close an input oemolstream. The oemolistream::close method may be safely called multiple times. This method is called from within the oemolstream destructor and therefore it is not necessary to call this explicitly under most circumstances.

8.31.4 GetFlavor

unsigned int GetFlavor(unsigned int format) const

Returns the file flavor associated with the format for the input oemolstream. The format arguments are a set of unsigned integers defined by the OEFormat namespace. The flavors are a set of unsigned integer bitmasks defined in the OEIFlavor namespace. A different set of bitmasks is defined and stored for each input format. The input flavor for any format can be set using the oemolistream::SetFlavor method. The default flavors are automatically set by the oemolistream constructors.

8.31.5 GetFormat

unsigned int GetFormat() const

Returns the file format associated with an input oemolstream. The set of unsigned integer values valid for this property is defined by the OEFormat namespace. By default, when reading from standard input (std::cin), the associated file format is OEFormat::SMILES. The file format property of an input oemolstream may be set using the oemolistream::SetFormat method. Note that the file format property is also set automatically by oemolistream::open based upon the file extension of the specified filename.

8.31.6 Getgz

bool Getgz()

Returns whether the input oemolstream is reading gzip-compressed data. This value can be changed with the oemolistream::Setgz method.

8.31.7 open

bool open()
bool open(const char *fname)
bool open(const std::string &fname)

Open a file for reading with an input oemolstream. The fname argument specifies the name of the file to be opened. The form of open with no arguments specifies that the input oemolstream should read from standard input (std::cin); in this case the format defaults to SMILES. If an argument is given, open sets the file format property of the input oemolstream based upon the extension of the given filename. If the file extension isn't recognized, a warning is issued and the file format is set to OEFormat::UNDEFINED. If the filename ends with ".gz", the oemolistream will decompress the file on-the-fly. The filename-based file format may be overridden by calling oemolistream::SetFormat explicitly with the desired file format. If the filename consists only of a file extension (for example, ".oeb.gz"), then std::cin is opened with the format specified by the given extension.
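The filename rules above can be modelled with a short standard-library sketch. This is not OEChem's implementation: the Format enum, OpenPlan, plan_open and demo_plan_open are hypothetical names, the extension table is deliberately incomplete, and treating "nothing before the last dot" as an extension-only filename is an assumption made for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration; the real constants are defined in
// the OEFormat namespace.
enum class Format { Undefined, Smiles, Sdf, Mol2, Pdb, Oeb };

struct OpenPlan {
    Format format;   // format implied by the file extension
    bool gz;         // true when the name ends with ".gz"
    bool use_stdin;  // true when the name consists only of extensions
};

OpenPlan plan_open(std::string fname) {
    OpenPlan plan{Format::Undefined, false, false};

    // Strip a trailing ".gz" and remember that decompression is needed.
    const std::string gz = ".gz";
    if (fname.size() >= gz.size() &&
        fname.compare(fname.size() - gz.size(), gz.size(), gz) == 0) {
        plan.gz = true;
        fname.erase(fname.size() - gz.size());
    }

    const std::string::size_type dot = fname.rfind('.');
    if (dot == std::string::npos) return plan;  // no extension: Undefined

    // Assumption: "only an extension" means nothing precedes the last dot.
    plan.use_stdin = (dot == 0);

    // Partial, illustrative extension table.
    const std::string ext = fname.substr(dot + 1);
    if (ext == "smi") plan.format = Format::Smiles;
    else if (ext == "sdf") plan.format = Format::Sdf;
    else if (ext == "mol2") plan.format = Format::Mol2;
    else if (ext == "pdb") plan.format = Format::Pdb;
    else if (ext == "oeb") plan.format = Format::Oeb;
    return plan;
}

void demo_plan_open() {
    const OpenPlan a = plan_open("ligands.sdf.gz");
    assert(a.format == Format::Sdf && a.gz && !a.use_stdin);

    const OpenPlan b = plan_open(".oeb.gz");
    assert(b.format == Format::Oeb && b.gz && b.use_stdin);
}
```

In this model an unrecognized extension simply leaves the format as Undefined; the warning that open issues is not reproduced.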

8.31.8 openstring

bool openstring(const unsigned char *buffer, unsigned int len)
bool openstring(const std::string &str)

The openstring methods of an oemolistream allow the input oemolstream to read from a buffer in memory, instead of from a file or standard input (std::cin). The buffer to be read from is specified either by a pointer to the input file's contents together with a length, or as an STL string (const std::string&).

If the contents of the buffer have been compressed with gzip, the Setgz method should be called before calling openstring.

Internally, the openstring methods make a copy of the specified file contents, allowing the oemolistream to continue to function independently of whether the original buffer is later modified or deallocated.
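The copy semantics described above can be sketched with the standard library alone. BufferReader below is a hypothetical class, not OEChem code, and its getline method is an illustrative stand-in for reading molecules; the point is only that the reader keeps its own copy, so the caller's buffer may be changed or released afterwards.

```cpp
#include <cassert>
#include <string>

// Hypothetical reader (not part of OEChem) that copies its input buffer.
class BufferReader {
public:
    bool openstring(const unsigned char *buffer, unsigned int len) {
        if (buffer == nullptr) return false;
        return openstring(
            std::string(reinterpret_cast<const char *>(buffer), len));
    }

    bool openstring(const std::string &str) {
        data_ = str;  // private copy, independent of the caller's buffer
        pos_ = 0;
        return true;
    }

    // Reads the next line from the private copy; returns false at the end.
    bool getline(std::string &line) {
        if (pos_ >= data_.size()) return false;
        const std::string::size_type nl = data_.find('\n', pos_);
        if (nl == std::string::npos) {
            line = data_.substr(pos_);
            pos_ = data_.size();
        } else {
            line = data_.substr(pos_, nl - pos_);
            pos_ = nl + 1;
        }
        return true;
    }

private:
    std::string data_;
    std::string::size_type pos_ = 0;
};

void demo_buffer_reader() {
    std::string source = "CCO\nc1ccccc1\n";
    BufferReader reader;
    reader.openstring(source);
    source.clear();  // the caller's buffer is gone; the reader is unaffected

    std::string line;
    assert(reader.getline(line) && line == "CCO");
    assert(reader.getline(line) && line == "c1ccccc1");
    assert(!reader.getline(line));
}
```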

8.31.9 SetConfTest

bool SetConfTest(const OEConfTest &)

Sets the functor class which is used to compare incoming graphs to determine whether they should be placed as conformers of a multi-conformer molecule or be returned individually as single molecules. The default conformer test never places separate graphs into a multi-conformer molecule. There are several pre-defined OEConfTest objects, including OEAbsoluteConfTest, OEIsomericConfTest and OEAbsCanonicalConfTest.

8.31.10 SetFlavor

bool SetFlavor(unsigned int format, unsigned int flavor)

Set the file flavor for a given format associated with this input oemolstream. The set of unsigned integer formats is defined by the OEFormat namespace. The set of unsigned integer flavor bitmasks is defined by the OEIFlavor namespace. The current flavor can be queried using the oemolistream::GetFlavor method. Each format has its own flavor, which must be set separately. The oemolistream constructors set the flavors for all of the formats to their default state.
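The idea of one independent flavor bitmask per format can be sketched as a small lookup table. FlavorTable and demo_flavor_table are hypothetical, the numeric format ids and flavor bits are made up for illustration rather than taken from OEFormat or OEIFlavor, and returning false for an unknown format is an assumption about what the bool result might mean.

```cpp
#include <cassert>
#include <map>

// Hypothetical model (not OEChem code): one flavor bitmask stored per format.
class FlavorTable {
public:
    // `defaults` plays the role of the flavors installed by the constructors.
    explicit FlavorTable(const std::map<unsigned int, unsigned int> &defaults)
        : flavors_(defaults) {}

    // Assumption: setting a flavor fails for a format the table doesn't know.
    bool SetFlavor(unsigned int format, unsigned int flavor) {
        auto it = flavors_.find(format);
        if (it == flavors_.end()) return false;
        it->second = flavor;
        return true;
    }

    // Returns 0 for an unknown format in this sketch.
    unsigned int GetFlavor(unsigned int format) const {
        auto it = flavors_.find(format);
        return it == flavors_.end() ? 0u : it->second;
    }

private:
    std::map<unsigned int, unsigned int> flavors_;
};

void demo_flavor_table() {
    const unsigned int kSdf = 1u, kPdb = 2u;  // made-up format ids
    std::map<unsigned int, unsigned int> defaults;
    defaults[kSdf] = 0x1u;
    defaults[kPdb] = 0x3u;

    FlavorTable table(defaults);
    table.SetFlavor(kSdf, 0x4u);
    assert(table.GetFlavor(kSdf) == 0x4u);
    assert(table.GetFlavor(kPdb) == 0x3u);  // other formats are untouched
}
```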

8.31.11 SetFormat

bool SetFormat(unsigned int format)

Set the file format associated with an input oemolstream. The set of unsigned integer values valid for this property is defined by the OEFormat namespace. By default, when reading from standard input (std::cin), the associated file format is OEFormat::SMILES. The file format property of an input oemolstream may be retrieved using the oemolistream::GetFormat method. Note that the file format property is also set automatically by oemolistream::open based upon the file extension of the specified filename.

8.31.12 Setgz

bool Setgz(bool gz)

Specify that the contents of the input oemolstream are to be treated as compressed by GNU gzip and decompressed on-the-fly. Usually the "gz" property of an oemolistream is implied automatically by the file extension used to open the stream for reading. The current "gz" property of an oemolistream can be retrieved using the Getgz method.

8.31.13 seek

void seek(oefpos_t pos)

Moves the position of the next valid read to the position indicated. This function takes account of gzip streams and molecule caching.

8.31.14 size

oefpos_t size()

Returns the size of the input stream, if a size is applicable to the current stream. The return type is a portable file-system pointer type.

8.31.15 tell

oefpos_t tell()

Returns the current position of the next read. This function accounts for molecule caching. Note: if you are reading an "oeb" file that was written as multi-conformer molecules and is being read into single-conformer molecules, all of the conformers are read into the cache at once, and the returned position will point to the beginning of the multi-conformer molecule rather than to a conformer inside that molecule.