The OEChem SMILES parsers support several minor extensions to Daylight syntax. Each extension and its motivation is described below.
OEChem may support ``Na'', ``Li'' and ``K'' as unquoted elements to support Syracuse SMILES at some point in the future.
[Pb:1]>>[Au:1]. However, OEChem
extends this notion to allow atom maps to be used in discrete molecules.
This is often useful for denoting significant sites or attachment points
in a molecule. Traditionally in SMILES, isotopes of element zero have
been used to perform this role; in OEChem, however, both [*:1] and [1*]
may be used.
When external attachment points are paired within a SMILES string, they behave identically to ring closures, but use a separate index space. Hence, the SMILES ``c&1ccccc&1'' is interpreted the same way as ``c1ccccc1'', and ``C&1.C&1'' is interpreted like ``C1.C1'', i.e. the SMILES ``CC''.
However, unlike ring closures, unpaired external attachment points are allowed and are interpreted like RGroup attachment points above. Hence, the SMILES ``CC&1'' (on its own) is equivalent to the RGroup attachment SMILES CC[R1], which is equivalent to the atom-mapped molecule CC[*:1].
The major advantage of these semantics, inspired by Daylight's CHUCKLES, is that they allow convenient enumeration of combinatorial libraries using string concatenation. For example, three components of a library may be specified as ``C&1CCC&2'', ``F&1'' and ``Br&2''. Then, using the same notation, ``C&1CCC&2.F&1.Br&2'' is interpreted as the reaction product, i.e. ``FCCCCBr''.
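The concatenation idea can be sketched in a few lines of plain Python. This is only an illustration of the string handling, not OEChem code: the function name `enumerate_library` and the extra substituents ``Cl&1'' and ``I&2'' are hypothetical, and each resulting string would still need to be passed to a SMILES parser that supports external attachment points to obtain the product molecule.

```python
from itertools import product

def enumerate_library(scaffold, *substituent_lists):
    """Join the scaffold with one substituent from each list using '.'.

    Hypothetical helper: it only builds the dot-separated SMILES strings;
    pairing the '&' attachment points is left to the SMILES parser.
    """
    return [".".join((scaffold,) + combo) for combo in product(*substituent_lists)]

# One scaffold with two attachment points, two choices at each site.
library = enumerate_library("C&1CCC&2", ["F&1", "Cl&1"], ["Br&2", "I&2"])
for smiles in library:
    print(smiles)
```

The first string produced is ``C&1CCC&2.F&1.Br&2'', the example given above.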
As with ring closures, bond orders may be specified after the ampersand and before the closure index, e.g. ``C&=1'', and two-digit closures are indicated by a '%' prefix, e.g. ``C&%12'' or ``C&=%12''.
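The lexical form described above (an ampersand, an optional bond symbol, then a single digit or a '%' followed by two digits) can be recognized with a short regular expression. The following Python sketch is an assumption-laden illustration rather than OEChem's parser: the names `attachment_points` and `unpaired_indices` are hypothetical, the set of bond symbols accepted is a guess, and an index is treated as unpaired simply when it occurs an odd number of times in the string.

```python
import re
from collections import Counter

# Assumed token shape: '&', optional bond symbol, then one digit or '%' + two digits.
_ATTACH = re.compile(r"&([-=#:/\\]?)(%\d\d|\d)")

def attachment_points(smiles):
    """Return (bond_symbol, index) for each external attachment point, in order."""
    return [(m.group(1), int(m.group(2).lstrip("%")))
            for m in _ATTACH.finditer(smiles)]

def unpaired_indices(smiles):
    """Return the sorted indices that occur an odd number of times."""
    counts = Counter(index for _, index in attachment_points(smiles))
    return sorted(i for i, n in counts.items() if n % 2 == 1)

print(attachment_points("C&=%12"))            # bond symbol '=' with index 12
print(unpaired_indices("c&1ccccc&1"))         # every index is paired
print(unpaired_indices("CC&1"))               # index 1 is left open
```

Under these assumptions, a string with no unpaired indices behaves like an ordinary molecule, while any remaining indices mark the open attachment points.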