This section describes the three valence models currently implemented by OEChem.
The MDL valence model was developed by MDL so that hydrogen counts could be left implicit in the MDL SD and MOL file formats. It assumes that the bond orders to an atom are specified (the explicit valence) and that the atomic number and formal charge are correctly set. From these, the MDL valence model prescribes the number of implicit hydrogens on a particular atom. The following table shows the MDL valence model as implemented in OEChem; each column corresponds to a formal charge, and each entry lists the valences allowed for that element at that charge.
| Atomic number [1] | Symbol | -3 | -2 | -1 | 0 | +1 | +2 | +3 | +4 | +5 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | H | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 3 | Li | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 4 | Be | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 |
| 5 | B | 2 | 3,5 | 4 | 3 | 2 | 1 | 0 | 0 | 0 |
| 6 | C | 1 | 2 | 3,5 | 4 | 3 | 2 | 1 | 0 | 0 |
| 7 | N | 0 | 1 | 2 | 3,5 | 4 | 3 | 2 | 1 | 0 |
| 8 | O | 0 | 0 | 1 | 2 | 3,5 | 4 | 3 | 2 | 1 |
| 9 | F | 0 | 0 | 0 | 1 | 2 | 3,5 | 4 | 3 | 2 |
| 11 | Na | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 12 | Mg | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 |
| 13 | Al | 2,4,6 | 3,5 | 4 | 3 | 2 | 1 | 0 | 0 | 0 |
| 14 | Si | 1,3,5,7 | 2,4,6 | 3,5 | 4 | 3 | 2 | 1 | 0 | 0 |
| 15 | P | 0 | 1,3,5,7 | 2,4,6 | 3,5 | 4 | 3 | 2 | 1 | 0 |
| 16 | S | 0 | 0 | 1,3,5,7 | 2,4,6 | 3,5 | 4 | 3 | 2 | 1 |
| 17 | Cl | 0 | 0 | 0 | 1,3,5,7 | 2,4,6 | 3,5 | 4 | 3 | 2 |
| 19 | K | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 20 | Ca | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 |
| 31 | Ga | 2,4,6 | 3,5 | 4 | 3 | 0 | 1 | 0 | 0 | 0 |
| 32 | Ge | 1,3,5,7 | 2,4,6 | 3,5 | 4 | 3 | 0 | 1 | 0 | 0 |
| 33 | As | 0 | 1,3,5,7 | 2,4,6 | 3,5 | 4 | 3 | 0 | 1 | 0 |
| 34 | Se | 0 | 0 | 1,3,5,7 | 2,4,6 | 3,5 | 4 | 3 | 0 | 1 |
| 35 | Br | 0 | 0 | 0 | 1,3,5,7 | 2,4,6 | 3,5 | 4 | 3 | 0 |
| 37 | Rb | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 38 | Sr | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 |
| 49 | In | 2,4,6 | 3,5 | 2,4 | 3 | 0 | 1 | 0 | 0 | 0 |
| 50 | Sn | 1,3,5,7 | 2,4,6 | 3,5 | 2,4 | 3 | 0 | 1 | 0 | 0 |
| 51 | Sb | 0 | 1,3,5,7 | 2,4,6 | 3,5 | 2,4 | 3 | 0 | 1 | 0 |
| 52 | Te | 0 | 0 | 1,3,5,7 | 2,4,6 | 3,5 | 2,4 | 3 | 0 | 1 |
| 53 | I | 0 | 0 | 0 | 1,3,5,7 | 2,4,6 | 3,5 | 2,4 | 3 | 0 |
| 55 | Cs | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 56 | Ba | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 |
| 81 | Tl | 2,4,6 | 3,5 | 2,4 | 1,3 | 0 | 0 | 0 | 0 | 0 |
| 82 | Pb | 1,3,5,7 | 2,4,6 | 3,5 | 2,4 | 3 | 0 | 1 | 0 | 0 |
| 83 | Bi | 0 | 1,3,5,7 | 2,4,6 | 3,5 | 2,4 | 3 | 0 | 1 | 0 |
| 84 | Po | 0 | 0 | 1,3,5,7 | 2,4,6 | 3,5 | 2,4 | 3 | 0 | 1 |
| 85 | At | 0 | 0 | 0 | 1,3,5,7 | 2,4,6 | 3,5 | 2,4 | 3 | 0 |
| 87 | Fr | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 88 | Ra | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 |
Table footnote:
[1] All the remaining elements not listed are assumed to have no implicit hydrogens.
In OEChem, the MDL valence model is applied by the OEAssignMDLHydrogens function.
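As an illustration of how a table like the one above can be turned into a hydrogen count, here is a minimal Python sketch. It is not OEChem code: the names `MDL_VALENCES` and `mdl_implicit_hydrogens` are hypothetical, only a few table rows are copied, and the selection rule (pick the smallest allowed valence that is at least the explicit valence, then fill the difference with hydrogens) is an assumption about how the table is read.

```python
# Allowed valences copied from a few cells of the table above,
# keyed by (atomic number, formal charge). Illustrative excerpt only.
MDL_VALENCES = {
    (6, 0): (4,),            # C, neutral
    (6, -1): (3, 5),         # C, charge -1
    (7, 0): (3, 5),          # N, neutral
    (7, 1): (4,),            # N, charge +1
    (8, 0): (2,),            # O, neutral
    (8, -1): (1,),           # O, charge -1
    (11, 0): (1,),           # Na, neutral
    (16, 0): (2, 4, 6),      # S, neutral
    (17, 0): (1, 3, 5, 7),   # Cl, neutral
}


def mdl_implicit_hydrogens(atomic_number, formal_charge, explicit_valence):
    """Return the implicit hydrogen count under the assumed selection rule.

    Assumption: the smallest allowed valence that is >= the explicit valence
    is chosen, and the remainder is filled with hydrogens.
    """
    # Cells missing from this excerpt fall back to valence 0. That matches the
    # table footnote only for elements that are truly absent from the table.
    allowed = MDL_VALENCES.get((atomic_number, formal_charge), (0,))
    for valence in allowed:  # tuples are stored in ascending order
        if valence >= explicit_valence:
            return valence - explicit_valence
    # The explicit valence already exceeds every allowed valence.
    return 0
```

Under these assumptions, a neutral carbon with an explicit valence of 3 (for example a carbonyl carbon with one other substituent) receives one implicit hydrogen, and a neutral sodium atom with no bonds receives one, which corresponds to sodium hydride.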
The OpenEye formal charge model assigns formal charges to elements based upon their total valence. In OEChem, this functionality is invoked by the OEAssignFormalCharges function. If the formal charge on an atom is non-zero, it is left unchanged.
For the remaining elements, if the valence of an atom is zero, its formal charge is set from its partial charge.
OpenEye's hydrogen count valence model is used by OEChem when neither hydrogen counts nor valences are specified. A typical use is reading molecules without explicit hydrogens from PDB or XYZ format files. This functionality is invoked by OEAssignImplicitHydrogens, which must always be followed by a call to OEAssignFormalCharges. This valence model is unique in that it only partially updates hydrogen counts, on the assumption that the unfilled valences will be corrected by OpenEye's charge valence model above. In MDL's model, for example, a neutral sodium atom is assumed to have one implicit hydrogen, i.e. sodium hydride rather than sodium metal. In OpenEye's hydrogen count valence model, a disconnected sodium atom is assumed to be a sodium cation, [Na+]. When reading from PDB files, this is a very reasonable assumption.
Note that although the OpenEye hydrogen count valence model often sets charge and protonation states to those found under physiological conditions, it is intended to be neither a pKa predictor nor an ionization state predictor. Instead, it is a normalization. Many registry systems, and the MDL valence model, convert C(=O)[O-] to C(=O)O for registration purposes; this valence model converts in the opposite direction, from C(=O)O to C(=O)[O-].
All other elements are assumed to have no implicit hydrogens and the formal charge specified by the OpenEye charge model. This treats all disconnected halogens as halide anions and the metals listed above, when disconnected, as cations.
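The treatment of disconnected atoms described above can be sketched in a few lines of Python. This is only an illustration of the stated behaviour, not OEChem's implementation: the function name `disconnected_atom_charge` and both lookup tables are hypothetical, the halogen set is limited to the four common halogens, and sodium is the only metal included because it is the only one named explicitly in the text.

```python
# Common halogens by atomic number: F, Cl, Br, I.
HALOGENS = {9, 17, 35, 53}

# Metals modelled as cations when disconnected. Only sodium is taken from the
# text; the full list of metals is not reproduced in this sketch.
EXAMPLE_METAL_CHARGES = {11: +1}  # Na -> [Na+]


def disconnected_atom_charge(atomic_number, degree):
    """Return the formal charge this sketch assigns to a disconnected atom.

    Returns None when the atom has bonds (degree > 0) or when the element is
    not covered by this illustrative excerpt.
    """
    if degree != 0:
        return None  # the rule only concerns atoms with no bonds
    if atomic_number in HALOGENS:
        return -1  # halide anion, e.g. [Cl-]
    return EXAMPLE_METAL_CHARGES.get(atomic_number)
```

For comparison, the MDL valence model gives the same isolated neutral sodium atom one implicit hydrogen instead of a positive charge.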
These rules are sufficient to reasonably protonate proteins read from PDB files. However, as described above, they are not intended to be a comprehensive rule-based pKa predictor. Users interested in predicting physiological ionization, or in enumerating protonation and dissociation states, should contact OpenEye Scientific Software (http://www.eyesopen.com/) about our tools for exactly this task.