2.2 Usage

 
2.2.1 Command Line Interface

A description of the command line interface can be obtained by running tautomers with the --help all option.

prompt> tautomers --help all

will generate the following output:

Complete parameter list
    tautomer
      -all : Enumerate all (up to level 5) tautomers
      -can : Write tautomers in canonical SMILES
      -ch3 : Tautomerize alpha methyl/methylene groups
      -count : Only count rather than generate the states
      -in : Input filename
      -kekule : Write Kekule structures
      -level : Acceptable pseudo-enegy level of tautomers
      -max : Maximum number of tautomers per molecule
      -out : Output filename
      -param : parameter file of tautomers settings
      -paramfile : Parameter filename for output
      -prefix : Prefix to use to name output files
      -reasonable : Returns a reasonable looking uniqie tautomer
      -uniq : Return a single unique tautomer

Command line options are distinguished from real filenames by having a `-' prefix. Options can appear anywhere on the command line, i.e. before, after or in between filenames. When incompatible options are specified, the last one given on the command line takes effect.

The first filename given on the command line is taken to be the input file, and the optional second filename is treated as the output file. A minus character may be used in place of the input filename to specify that the input is to be read from standard input, stdin, and in place of the output filename to specify that the output is to be written to standard output, stdout. If only one filename is specified on the command line, the output is written to stdout by default.
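The conventions above can be illustrated with a minimal Python sketch. This is not the program's actual parser: the function name split_args and the set VALUE_OPTIONS (a guess, based on the parameter list, at which options take a value) are assumptions made for illustration, and the interaction between -in/-out and positional filenames is not modelled.

```python
# Hypothetical sketch of the command line conventions described above.
# VALUE_OPTIONS is an assumption about which options consume a value.
VALUE_OPTIONS = {"-in", "-out", "-level", "-l", "-max",
                 "-param", "-paramfile", "-prefix"}


def split_args(argv):
    """Separate options from filenames, wherever they appear in argv."""
    options = {}
    filenames = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        # A lone "-" is a filename placeholder for stdin/stdout, not an option.
        if arg.startswith("-") and arg != "-":
            if arg in VALUE_OPTIONS and i + 1 < len(argv):
                options[arg] = argv[i + 1]  # a repeated option: last one wins
                i += 1
            else:
                options[arg] = True
        else:
            filenames.append(arg)
        i += 1
    # First filename is the input; the optional second is the output.
    infile = filenames[0] if filenames else None
    # With only one filename, output defaults to stdout ("-").
    outfile = filenames[1] if len(filenames) > 1 else "-"
    return options, infile, outfile


options, infile, outfile = split_args(["-a", "guanine.smi", "-max", "50", "output.smi"])
print(options, infile, outfile)
```

In this sketch, calling split_args(["-count", "-all", "guanine.smi"]) leaves the output set to "-", mirroring the rule that a single filename sends the output to stdout.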

 
2.2.2 Command Line Options

-all
Generate all tautomers, including high-energy tautomers and methyl-group tautomerism. By default, tautomers enumerates just the lowest energy class of tautomers for each input structure. This option is equivalent to the command line options -level 5 and -ch3.

-a
Synonym for -all.

-can
Output the OpenEye canonical SMILES for each tautomer. By default, tautomers writes arbitrary SMILES where the order of the atoms in each tautomer is the same for a given input molecule, which often makes it easier to see the differences between tautomers when reading the SMILES.

-c
Synonym for -can.

-ch3
Include tautomerisation across methyl and methylene groups that are alpha to a conjugated system. This permits structures such as cyclohexa-2,4-dien-1-one to be considered a tautomer of phenol, and other examples of keto-enol tautomerism. This option is implied by the -all command line option. Unfortunately, the -ch3 flag may cause the calculation time to increase by many orders of magnitude for some molecules.

-count
Output just the number of tautomers per compound rather than listing the tautomers.

-C
Synonym for -count.

-in
The input filename for reading in molecules.

-kekule
Output is written in Kekulé form for the .smi, .ism, and .can formats. Disabling aromaticity often makes it easier to understand the differences between tautomers.

-level $n$
Manually set the acceptable approximate tautomer energy level to use in enumeration. This takes an integer value between zero and six inclusive, where zero corresponds to the lowest energy states and six corresponds to the highest energy states. The -all command line option increases this value to at least five. By default, the tautomers program enumerates the tautomers in the lowest non-empty energy level: it first tries level zero and, if no tautomers are found, increases to one, then two, and so on.
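The default behaviour described above, trying level zero first and moving up until a level yields tautomers, can be sketched in Python. The function default_level and the mapping counts_by_level (energy level to number of tautomers found at that level) are hypothetical names introduced for illustration; the sketch says nothing about how the program assigns energy levels internally.

```python
def default_level(counts_by_level, max_level=6):
    """Return the lowest level that holds at least one tautomer, or None."""
    # Levels run from zero (lowest energy) to six (highest), per the text.
    for level in range(max_level + 1):
        if counts_by_level.get(level, 0) > 0:
            return level
    return None  # no tautomers found at any level


# Toy counts, not real program output: levels 0 and 1 are empty.
print(default_level({0: 0, 1: 0, 2: 4}))  # prints 2
```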

-l
Synonym for -level.

-max $n$
Specify a maximum number of tautomers to enumerate for a single input structure. Over 99% of compounds require fewer than 100 tautomers, and most organic compounds give only a single tautomer. However, some pathological dyes and chromophores may individually have nearly a million possible tautomeric forms. The current default is a limit of 1000 tautomers per input structure.

-out
Output filename, where the file extension indicates the output format.

-param
This can be used to name an input parameter file of tautomers settings, which is especially useful for rerunning the program with flags similar to those of a previous run. Flags coming after -param override parameters set by the parameter file.

-paramfile
The filename for the output parameter file can be set with this flag.

-prefix
Similar to -paramfile, but takes a prefix rather than a full filename: the output parameter file is written to the given prefix plus the extension `.param'.

-reasonable
Return a single unique tautomer that is reasonable looking. The program outputs the most aromatic tautomer of the first 64 attempted.

-uniq
Run in canonicalization mode, where for each molecule in the input file, a single ``canonical'' tautomer is written to the output file. By default, tautomers runs in enumeration mode.

-u
Synonym for -uniq.

 
2.2.3 Example Executions

Consider the following input file, guanine.smi, that contains just the following line.

c1[nH]c2c(=O)[nH]c(nc2n1)N

The following command can be used to enumerate all of the reasonably low energy tautomers of this structure.

prompt> tautomers guanine.smi output.smi

This should write the following 15 structures to the file output.smi.

c1[nH]c2c(=O)nc([nH]c2n1)N
c1[nH]c2c(=O)[nH]c(nc2n1)N
c1nc2c(=O)nc([nH]c2[nH]1)N
c1nc2c(=O)[nH]c(nc2[nH]1)N
c1[nH]c2c(=O)[nH]c(=N)[nH]c2n1
c1nc2c(=O)[nH]c(=N)[nH]c2[nH]1
c1[nH]c2c(nc(nc2n1)N)O
c1nc-2c(nc([nH]c2n1)N)O
c1nc2c(nc(nc2[nH]1)N)O
c1nc-2c([nH]c(nc2n1)N)O
c1[nH]c2c(nc(=N)[nH]c2n1)O
c1[nH]c2c([nH]c(=N)nc2n1)O
c1nc2c(nc(=N)[nH]c2[nH]1)O
c1nc-2c([nH]c(=N)[nH]c2n1)O
c1nc2c([nH]c(=N)nc2[nH]1)O

A more exhaustive enumeration of the tautomers of guanine can be performed using the -all (or -a) option.

prompt> tautomers -a guanine.smi output.smi

which will write a total of 300 guanine tautomers to the file output.smi. This can be verified using the -count command line option, sending the output to the screen.

prompt> tautomers -count -all guanine.smi

300

Finally, the canonicalization abilities of the tautomers program can be assessed by canonicalizing all 300 of the guanine tautomers generated above. The example below uses the UNIX command line utility ``uniq'', which, with its -c flag, collapses each run of adjacent identical lines into one line prefixed by the number of repeats. By piping the output of tautomers with the -u flag to uniq -c, the output below confirms that all 300 tautomers canonicalize to the same SMILES string, ``c1[nH]c2c(=O)nc([nH]c2n1)N''.

prompt> tautomers -u output.smi | /usr/bin/uniq -c

300 c1[nH]c2c(=O)nc([nH]c2n1)N
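Where uniq is not available, the same check can be approximated in Python. The sketch below is a stand-in for the counting step only, not part of the tautomers distribution: the helper name count_runs is hypothetical, and the input is simulated with 300 copies of the canonical SMILES rather than read from a real run.

```python
from itertools import groupby


def count_runs(lines):
    """Collapse runs of adjacent identical lines into (count, line) pairs."""
    return [(sum(1 for _ in group), line) for line, group in groupby(lines)]


# Simulated canonicalized output: 300 identical SMILES strings.
canonical = ["c1[nH]c2c(=O)nc([nH]c2n1)N"] * 300
for count, smiles in count_runs(canonical):
    print(count, smiles)
```

A result with a single pair whose count equals the number of input lines indicates that every tautomer was mapped to the same canonical form. As with uniq, only adjacent duplicates are merged, so identical lines separated by a different line would be reported as separate runs.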