Description
Is your feature request related to a problem?
OpenMM writes residue IDs (resids) in hexadecimal when a PDB file contains more than 9999 residues.
MDAnalysis, however, treats a hexadecimal resid as invalid and sets it to 1, which causes problems in some cases.
Describe the solution you'd like
I'd like to change PDBParser._parseatoms so that it parses the resid as hexadecimal when the resid field contains letters.
Changing
mdanalysis/package/MDAnalysis/topology/PDBParser.py
Lines 302 to 304 in c4af00b
else:
    resid = int(line[22:26])
    # Wrapping
to
else:
    if any([a.isalpha() for a in line[22:26]]):
        resid = int(line[22:26], base=16) - 30960
    else:
        resid = int(line[22:26])
    # Wrapping
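The proposed branch can be tried outside the parser. The following is a minimal sketch, not MDAnalysis code: `parse_resid_field` and `PREFIX` are hypothetical names, and the ATOM record prefix is hand-built only so that the resid lands in columns 23 to 26 (`line[22:26]`).

```python
# Hypothetical standalone version of the proposed branch; not part of MDAnalysis.
HEX_OFFSET = 30960  # 0xA000 (40960) minus 10000


def parse_resid_field(line):
    """Return the resid from columns 23-26 of a PDB ATOM/HETATM line."""
    field = line[22:26]
    if any(c.isalpha() for c in field):
        # Letters present: assume OpenMM-style shifted hexadecimal.
        return int(field, 16) - HEX_OFFSET
    return int(field)


# First 22 columns of an ATOM record (serial 1, atom CA, residue ALA, chain A).
PREFIX = "ATOM      1  CA  ALA A"

decimal_resid = parse_resid_field(PREFIX + "9999")
first_hex_resid = parse_resid_field(PREFIX + "A000")
second_hex_resid = parse_resid_field(PREFIX + "A001")
```

Under these assumptions, `A000` decodes to 10000 and `A001` to 10001, so numbering continues directly after 9999.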
Describe alternatives you've considered
Another option is to assign unique resids to residues whose resid fails to parse, either by incrementing from the last valid resid or by starting from a preset number (e.g. 10000), although this could be problematic in other cases.
Alternatively, Universe could accept an optional parser (a TopologyReaderBase subclass), so that users can supply a custom topology parser when needed.
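The first alternative could look roughly like the sketch below. It is only an illustration under stated assumptions: `assign_resids` is a hypothetical helper, it takes the raw 4-character resid fields of consecutive atom lines, and it treats consecutive atoms with an identical field as the same residue.

```python
def assign_resids(fields):
    """Assign resids, continuing from the last resid when a field fails to parse.

    `fields` holds the raw resid field of each atom line, in file order.
    """
    resids = []
    prev_field = None
    last = 0
    for field in fields:
        if field == prev_field:
            # Same raw field as the previous atom: same residue.
            resid = last
        else:
            try:
                resid = int(field)
            except ValueError:
                # Unparseable field: increment from the last assigned resid.
                resid = last + 1
        resids.append(resid)
        prev_field, last = field, resid
    return resids


example = assign_resids(["9998", "9999", "9999", "A000", "A000", "A001"])
```

This keeps resids unique and increasing without interpreting the hexadecimal value, but it relies on file order and would mis-number files whose unparseable fields are not consecutive.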
Additional context
openmm.app.pdbfile.py : 484 (from OpenMM version 8.2.0)
def _formatIndex(index, places):
    """Create a string representation of an atom or residue index.  If the value is larger than can fit
    in the available space, switch to hex.
    """
    if index < 10**places:
        format = f'%{places}d'
        return format % index
    format = f'%{places}X'
    shiftedIndex = (index - 10**places + 10*16**(places-1)) % (16**places)  # <----- This part
    return format % shiftedIndex
One major problem is that OpenMM writes resids in decimal up to 9999 and then continues from hexadecimal A000 (decimal 40960) in place of decimal 10000. The parser therefore has to either subtract 30960 (40960 - 10000) to recover the original numbering, or use the raw value 40960 as the resid.
I think either choice may cause problems in other parts of the code.
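To check the offset, the sketch below reimplements the `_formatIndex` logic quoted above and pairs it with a hypothetical decoder (`format_index` and `decode_resid` are illustrative names, not OpenMM or MDAnalysis API). It assumes the 4-column resid field, i.e. `places=4`.

```python
def format_index(index, places):
    """Same arithmetic as the OpenMM _formatIndex quoted above."""
    if index < 10**places:
        return f"{index:{places}d}"
    shifted = (index - 10**places + 10 * 16**(places - 1)) % (16**places)
    return f"{shifted:{places}X}"


def decode_resid(field):
    """Hypothetical inverse for a 4-column field: undo the hex shift."""
    if any(c.isalpha() for c in field):
        return int(field, 16) - 30960
    return int(field)


# Every index from 1 through 34575 survives the round trip.
round_trip_ok = all(decode_resid(format_index(i, 4)) == i for i in range(1, 34576))

first_hex = format_index(10000, 4)  # "A000"
last_hex = format_index(34575, 4)   # "FFFF"
wrapped = format_index(34576, 4)    # "   0": the modulo wraps around
```

Two things follow from the arithmetic. Shifted values run from A000 to FFFF, so the field always starts with a letter and the `isalpha` check is enough to detect them. The modulo wraps at index 34576, so resids beyond 34575 cannot be recovered from the field alone by either approach.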