Meta-information about a molecule is stored in what is known as
"tagged data." The most common example is the set of data fields
found in SD files. Since SD files are a common means of storing data
and transferring it from one system to another, OEChem provides
several methods to manipulate this data. A simple class,
OESDDataPair, is used to set or retrieve
these pairs. OESDDataPair objects provide
SetTag/GetTag and
SetValue/GetValue methods for access to each half of the pair.
If you wish to store a numeric value, use Java's
String.valueOf() method to convert it to a string.
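Because values are stored as strings, a number has to be converted on the way in and parsed again on the way out. The following is a minimal sketch of that round trip using only the standard library; the class name NumericTagValues and its methods are hypothetical helpers, not part of OEChem.

```java
// Sketch only: NumericTagValues is a hypothetical helper, not an OEChem class.
class NumericTagValues {
    // Convert an integer (for example an atom count) to a value string.
    static String encode(int n) {
        return String.valueOf(n);
    }

    // Convert a floating-point number to a value string.
    static String encode(double x) {
        return String.valueOf(x);
    }

    // Parse a stored value back to an int; throws NumberFormatException
    // if the string is empty or not a valid integer.
    static int decodeInt(String value) {
        return Integer.parseInt(value.trim());
    }

    // Parse a stored value back to a double; throws NumberFormatException
    // if the string is empty or not a valid number.
    static double decodeDouble(String value) {
        return Double.parseDouble(value.trim());
    }
}
```

Since a missing tag yields an empty string, and an empty string cannot be parsed as a number, it is safer to check that the tag exists before decoding its value.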
The following functions provide access to the SD data.
Use the OESetSDData method to set a tag and
value data pair. Both the tag and the value must be strings. If an
item with the same tag already exists, it is replaced. The second form
is the same as the first but uses an
OESDDataPair instance.
boolean OESetSDData(OEMolBase mol, String tag, String value)
boolean OESetSDData(OEMolBase mol, OESDDataPair dp)
Use the OEAddSDData method to add a tag and
value data pair. Both the tag and the value must be strings. If an
item with the same tag already exists, another one is added. The
second form is the same as the first but uses an
OESDDataPair instance.
boolean OEAddSDData(OEMolBase mol, String tag, String value)
boolean OEAddSDData(OEMolBase mol, OESDDataPair dp)
Use the OEHasSDData method to determine if a
molecule has an item with a given tag:
boolean OEHasSDData(OEMolBase mol, String tag)
Use the OEGetSDData method to get the value
for the given tag. If the molecule does not have that tag, an empty
string is returned.
String OEGetSDData(OEMolBase mol, String tag)
Use OEGetSDDataPairs to get an OESDDataIter (an iterator over
OESDDataPair objects) that can be used in a loop, as in the
example program later in this section.
OESDDataIter OEGetSDDataPairs(OEMolBase mol)
Use OECopySDData to copy the entire set of
SD data from a source (src) molecule to a destination (dst) molecule.
boolean OECopySDData(OEMolBase dst, OEMolBase src)
Use OEDeleteSDData to delete a tagged
data item. All data items with the specified tag will be deleted.
boolean OEDeleteSDData(OEMolBase mol, String tag)
Use OEClearSDData to clear all SD data
from a molecule.
boolean OEClearSDData(OEMolBase mol)
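The difference between setting (replace), adding (append), and deleting (remove every match) can be modeled with an ordered list of pairs. The sketch below is a toy model built on java.util, not OEChem code. The class TaggedDataModel and its methods are hypothetical. Where a replaced item ends up in iteration order, and which value a lookup returns when a tag is duplicated, are assumptions of the model.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of tagged data as an ordered list of {tag, value} pairs.
// Hypothetical illustration only; this is not how OEChem is implemented.
class TaggedDataModel {
    private final List<String[]> pairs = new ArrayList<>();

    // Mirrors the "add" behavior: always appends, so duplicate tags may exist.
    void add(String tag, String value) {
        pairs.add(new String[] {tag, value});
    }

    // Mirrors the "set" behavior: any existing item with the tag is replaced.
    // Assumption: the model drops old matches and appends the new pair.
    void set(String tag, String value) {
        delete(tag);
        add(tag, value);
    }

    // True if at least one item carries the tag.
    boolean has(String tag) {
        for (String[] p : pairs) {
            if (p[0].equals(tag)) return true;
        }
        return false;
    }

    // Returns the value for the tag, or an empty string if the tag is absent.
    // Assumption: with duplicate tags, the first match is returned.
    String get(String tag) {
        for (String[] p : pairs) {
            if (p[0].equals(tag)) return p[1];
        }
        return "";
    }

    // Mirrors the "delete" behavior: removes every item with the tag.
    // Returns true if anything was removed.
    boolean delete(String tag) {
        return pairs.removeIf(p -> p[0].equals(tag));
    }

    // Mirrors the "clear" behavior: removes all items.
    void clear() {
        pairs.clear();
    }

    int size() {
        return pairs.size();
    }
}
```

In this model, calling set twice with the same tag leaves one item, while calling add twice leaves two, and a single delete removes both.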
The following example shows how to use the tagged data methods.
/*******************************************************************************
 * Copyright 2005, OpenEye Scientific Software, Inc.
 ******************************************************************************/

import openeye.oechem.*;

public class SDDataExample {
    public static void main(String argv[]) {
        OEGraphMol mol = new OEGraphMol();
        oechem.OEParseSmiles(mol, "c1ccccc1");
        mol.SetTitle("benzene");

        // set some tag data
        oechem.OESetSDData(mol, "color", "brown");
        oechem.OESetSDData(mol, "size", "small");
        oechem.OESetSDData(mol, "natoms", String.valueOf(mol.NumAtoms()));

        // loop over data and print it out
        for (OESDDataIter iter = oechem.OEGetSDDataPairs(mol); iter.hasNext();) {
            OESDDataPair dp = iter.next();
            System.out.println(dp.GetTag() + " : " + dp.GetValue());
        }

        // check for existence of a field and delete it
        if (oechem.OEHasSDData(mol, "color"))
            oechem.OEDeleteSDData(mol, "color");

        // loop again
        for (OESDDataIter iter = oechem.OEGetSDDataPairs(mol); iter.hasNext();) {
            OESDDataPair dp = iter.next();
            System.out.println(dp.GetTag() + " : " + dp.GetValue());
        }
    }
}
Note that SD tagged data is specific to MDL's SD file format. SD data added to a molecule is written out only to SD files and OEBinary files. Likewise, the SD data fields are filled only when reading from SD files that contain SD tagged data, or from OEBinary files that were previously written with this data.
Two more examples are provided specifically dealing with tagged
data. SDFRename.java takes an SD file and renames all the
molecules based on the value of a chosen SD tag. The other,
MergeCSV.java, takes a CSV file and adds the data as
tags to molecules in an input stream. This simple program assumes that
the first column is the molecule title matching titles found in the
incoming molecule file. It also assumes the first row contains names
to be used as the tags.
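The CSV layout that MergeCSV.java is described as expecting (tag names in the first row, molecule titles in the first column) can be sketched as a small parser. This is not the MergeCSV.java source, which is not reproduced here. The class CsvTagTable is hypothetical, and the naive comma split is an assumption that ignores quoted fields.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the assumed CSV layout; not the MergeCSV.java source.
class CsvTagTable {
    // Row 0 holds the tag names, column 0 holds the molecule titles.
    // Returns a map of title -> (tag -> value).
    // Assumption: fields contain no quoted or embedded commas.
    static Map<String, Map<String, String>> parse(List<String> lines) {
        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        if (lines.isEmpty()) {
            return table;
        }
        String[] header = lines.get(0).split(",", -1);
        for (String line : lines.subList(1, lines.size())) {
            if (line.trim().isEmpty()) {
                continue; // skip blank rows
            }
            String[] cells = line.split(",", -1);
            Map<String, String> tags = new LinkedHashMap<>();
            // Column 0 is the title, so tag columns start at index 1.
            for (int col = 1; col < header.length && col < cells.length; col++) {
                tags.put(header[col].trim(), cells[col].trim());
            }
            table.put(cells[0].trim(), tags);
        }
        return table;
    }
}
```

With such a table in hand, a merge program could look up each incoming molecule's title and call oechem.OESetSDData(mol, tag, value) once per entry in the matching row.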
The OESDDataPair class is also used to set
or retrieve PDB data pairs. In PDB files, this data is
stored in header lines where the first field is the tag and the
remainder of the line is the data. OESDDataPair objects
provide SetTag/GetTag and
SetValue/GetValue methods for access to each half of PDB pairs.
If you wish to store a numeric value, use Java's
String.valueOf() method to convert it to a string.
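The tag and data split described above (the first field of a header line is the tag, the remainder is the data) can be illustrated with plain string handling. The sketch below is an assumption about what that split looks like, not OEChem's parser: the class PdbHeaderLine is hypothetical, it splits on whitespace rather than on fixed columns, and it trims the data part.

```java
// Hypothetical illustration of splitting a PDB header line into tag and data.
// OEChem's own parsing may differ (for example, it may use fixed columns).
class PdbHeaderLine {
    // Returns a two-element array: {tag, data}.
    static String[] split(String line) {
        int start = 0;
        // Skip any leading whitespace.
        while (start < line.length() && Character.isWhitespace(line.charAt(start))) {
            start++;
        }
        int end = start;
        // The tag runs up to the next whitespace character.
        while (end < line.length() && !Character.isWhitespace(line.charAt(end))) {
            end++;
        }
        String tag = line.substring(start, end);
        // Everything after the tag is the data (trimmed in this sketch).
        String data = line.substring(end).trim();
        return new String[] {tag, data};
    }
}
```

Each header line yields one pair under this scheme, which is why a file with several REMARK lines produces several items that share the same tag.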
The following functions provide access to the PDB data.
Use OESetPDBData to set a tag and value
data pair. Both tag and value must be strings. If an item with the
same tag already exists, it is replaced. The second form is the same
as the first but uses an OESDDataPair instance.
boolean OESetPDBData(OEMolBase mol, String tag, String value)
boolean OESetPDBData(OEMolBase mol, OESDDataPair dp)
Use OEAddPDBData to add a tag and value
data pair. Both tag and value must be strings. If an item with the
same tag already exists, another one is added. The second form is the
same as the first but uses an OESDDataPair instance.
Note that for PDB header items like REMARK, each line is treated as a
separate instance, so to add multiple REMARK lines be sure to use this
form instead of OESetPDBData.
boolean OEAddPDBData(OEMolBase mol, String tag, String value)
boolean OEAddPDBData(OEMolBase mol, OESDDataPair dp)
Use OEHasPDBData to determine if a molecule has an item with a given tag:
boolean OEHasPDBData(OEMolBase mol, String tag)
Use OEGetPDBData to get the value for the
given tag. If the molecule does not have that tag, an empty string is
returned. Note that if there are multiple items with the same tag,
only the first instance is returned. Use the iterator access
shown below to retrieve every item with that tag.
String OEGetPDBData(OEMolBase mol, String tag)
To get access to all PDB data, use OEGetPDBDataPairs, which returns
an iterator of OEBPDBDataPair objects.
OEPDBDataIter OEGetPDBDataPairs(OEMolBase mol)
To copy the entire set of PDB data from a source (src)
molecule to a destination (dst) molecule, use
OECopyPDBData.
boolean OECopyPDBData(OEMolBase dst, OEMolBase src)
Use OEDeletePDBData to delete a tagged
data item. All data items with the specified tag will be deleted.
boolean OEDeletePDBData(OEMolBase mol, String tag)
To clear all PDB data from a molecule, use
OEClearPDBData.
boolean OEClearPDBData(OEMolBase mol)
For using tag data with multi-conformer molecules, see Section 7.6.