A.5 Lexichem v1.4, February 2006

On a benchmark of 250251 compounds in the NCI00 database, mol2nam is able to convert 221254 structures (88.41%) to names without BLAH. Of these 221254 names, nam2mol is able to convert 192345 (86.93%) back into structures.
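
The percentages above follow directly from the reported counts. The short sketch below reproduces that arithmetic. The variable and function names are illustrative only, and the overall round-trip rate is a figure derived here from the same counts rather than one stated in these notes.

```python
# Counts taken from the benchmark paragraph above.
total_compounds = 250251   # compounds in the NCI00 benchmark
named = 221254             # structures mol2nam converted to names
round_tripped = 192345     # names nam2mol converted back into structures


def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimal places."""
    return round(100.0 * part / whole, 2)


naming_rate = percent(named, total_compounds)           # quoted as 88.41%
parsing_rate = percent(round_tripped, named)            # quoted as 86.93%
# Derived here, not quoted in the release notes: structures that survive
# the full structure -> name -> structure round trip.
overall_rate = percent(round_tripped, total_compounds)

print(naming_rate, parsing_rate, overall_rate)
```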

Lexichem v1.4 is predominantly a maintenance release that provides a version of the oeiupac library compatible with OEChem v1.4. However, there have been a number of significant improvements to name parsing, and minor improvements to name generation, since last month's v1.3 release.

This release also includes the ability to generate compound names in several languages. In addition to British spellings, Lexichem can now generate German, Italian, French, Spanish, Swedish, Dutch and Polish names. Whilst the translations for German, Italian, Swedish and Polish are quite comprehensive, those for French, Spanish and Dutch are less complete.

A potential ambiguity with the ring names ``oxazole'' and ``thiazole'' has also been resolved. The IUPAC documentation states that it is permissible to omit locants from Hantzsch-Widman names when the locants are consecutive; for example, 1,2,3,4-tetrazole may be written as tetrazole, and 1,2-oxazirene is preferably written as oxazirene. Unfortunately, this conflicts with the traditional interpretation of oxazole as meaning 1,3-oxazole and thiazole as meaning 1,3-thiazole; in traditional usage, the 1,2- forms are instead named isoxazole and isothiazole. This ambiguity, which affected IUPAC-style (but not OpenEye-style) names, has been resolved by preserving the locants, so that the IUPAC names 1,2-oxazole, 1,3-oxazole, 1,2-thiazole and 1,3-thiazole are now generated for isoxazole, oxazole, isothiazole and thiazole respectively.
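
The correspondence described above can be summarised as a small lookup table. This is a minimal sketch for illustration only: the dictionary and the function `iupac_ring_name` are hypothetical names, not part of the Lexichem or oeiupac API, and the table covers only the four ring names mentioned in this section.

```python
# Illustrative table only; not how Lexichem implements the change.
# Keys are the traditional ring names, values the locant-preserving
# IUPAC-style names listed in the paragraph above.
TRADITIONAL_TO_IUPAC = {
    "isoxazole": "1,2-oxazole",
    "oxazole": "1,3-oxazole",
    "isothiazole": "1,2-thiazole",
    "thiazole": "1,3-thiazole",
}


def iupac_ring_name(traditional_name):
    """Return the locant-preserving name, or the input unchanged if unknown."""
    return TRADITIONAL_TO_IUPAC.get(traditional_name.lower(), traditional_name)


for name in ("isoxazole", "oxazole", "isothiazole", "thiazole"):
    print(name, "->", iupac_ring_name(name))
```

Because every generated name carries explicit locants, no two entries in the table share an output, so a reader never has to guess which of the 1,2- and 1,3- forms is meant.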