On a benchmark of 250251 compounds in the NCI00 database, mol2nam
is able to convert 234155 structures (93.57%) to names without BLAH.
Of these 234155 names, nam2mol is able to convert 223246 (95.34%)
back into structures.
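The percentages quoted above follow directly from the raw counts. As a quick check, the short Python sketch below recomputes them; the variable names are illustrative only, and the end-to-end figure is derived here from the stated counts rather than quoted from the benchmark.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole

total = 250251   # compounds in the NCI00 database
named = 234155   # structures mol2nam converts to names without BLAH
parsed = 223246  # of those names, converted back into structures by nam2mol

naming_rate = percent(named, total)    # quoted as 93.57%
parsing_rate = percent(parsed, named)  # quoted as 95.34%

# Derived figure (not quoted in the notes): fraction of all compounds
# that are both named and then parsed back into a structure.
end_to_end_rate = percent(parsed, total)

print(f"naming:     {naming_rate:.2f}%")
print(f"parsing:    {parsing_rate:.2f}%")
print(f"end-to-end: {end_to_end_rate:.2f}%")
```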
This release includes many improvements to both name generation and name parsing, along with several bug fixes. The name parsing conversion rate for the 71367 compound names in the 2003 Maybridge catalog is now 93.81%.
A new OELowerCaseName function has been added to the Lexichem
toolkit API. This function converts the input chemical name to
lowercase, whilst preserving the case-sensitive aspects of IUPAC
names. This allows uppercase and mixed-case names to be translated
into English, as the OEFrom* translation functions assume their
input is lowercase. For example, this feature allows
"AGUA" to be recognized via OEFromSpanish.
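To illustrate what "preserving the case-sensitive aspects" might mean in practice, here is a toy Python sketch. It is not the Lexichem implementation and does not call the toolkit; the function name lower_case_name_sketch and the three preserved patterns (stereo descriptors, indicated hydrogen, and N locants) are assumptions chosen for illustration, and a real implementation would need to handle many more cases.

```python
import re

# Hypothetical, deliberately incomplete list of case-sensitive fragments.
_PRESERVE = re.compile("|".join([
    r"\(\d*[RSEZ](?:,\d*[RSEZ])*\)",  # stereo descriptors: (R), (2S,3R), (E)
    r"(?<![A-Za-z])\d+H(?=-)",        # indicated hydrogen: 1H-
    r"(?<![A-Za-z])N(?=[-,])",        # nitrogen locants: N-, N,N-
]))

def lower_case_name_sketch(name: str) -> str:
    """Lowercase a name, leaving the fragments matched above untouched."""
    pieces = []
    pos = 0
    for match in _PRESERVE.finditer(name):
        pieces.append(name[pos:match.start()].lower())  # ordinary text
        pieces.append(match.group(0))                   # keep original case
        pos = match.end()
    pieces.append(name[pos:].lower())
    return "".join(pieces)

print(lower_case_name_sketch("AGUA"))
print(lower_case_name_sketch("N,N-DIMETHYLFORMAMIDE"))
print(lower_case_name_sketch("(R)-BUTAN-2-OL"))
```

In this sketch "AGUA" simply becomes "agua", while the locants in "N,N-DIMETHYLFORMAMIDE" and the "(R)" descriptor keep their capitals.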
A new OEReorderIndexName function has been added to the
Lexichem toolkit API. This function attempts to reorder the
given permuted index name into a form that can be handled by the
OEParseIUPACName function. For example, this will convert
the string "benzene, chloro-" into "chloro-benzene".
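The reordering described above can be pictured with a small Python sketch. This is only a toy illustration of the idea, not how OEReorderIndexName works internally: the function name reorder_index_name_sketch is hypothetical, and it handles just the simplest "parent, prefix-" pattern, ignoring multi-part index names.

```python
def reorder_index_name_sketch(name: str) -> str:
    """Turn 'parent, prefix-' into 'prefix-parent'.

    Only the first ", " (comma plus space) is treated as the inversion
    point, so locant lists such as "1,2-" are left intact. Names that do
    not match the simple pattern are returned unchanged.
    """
    parent, sep, prefix = name.strip().partition(", ")
    prefix = prefix.strip()
    if not sep or not prefix.endswith("-"):
        return name  # not a simple inverted index name
    return prefix + parent

print(reorder_index_name_sketch("benzene, chloro-"))        # chloro-benzene
print(reorder_index_name_sketch("benzene, 1,2-dichloro-"))  # 1,2-dichloro-benzene
```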
A number of improvements and bug fixes have been made to Lexichem's naming styles. For example, the AutoNom and CAS permuted index styles are now far more AutoNom-like and CAS-like, respectively. Naming of metallocenes and fullerenes is much improved.
Foreign language support has improved dramatically. On the 250251 compounds in the NCI00 database mentioned above, we now round-trip 100% to German and back without any differences. Round-trip rates for Japanese, Spanish and Swedish are all currently above 99%. Support for Hungarian and Polish has also been substantially improved.