ProDy crashes while reading an mmCIF generated using OpenFold3 #2234

@rckormos

Bug: parseCIF fails on a valid mmCIF file when _atom_site.auth_seq_id is '.' for non-polymer atoms

I’m parsing an OpenFold3-generated mmCIF file with prody.parseCIF(...), using ProDy v2.6.1 (installed via pip) on Python 3.12.12, and I get:

ValueError: invalid literal for int() with base 10: '.'

The failure occurs in ciffile.py when ProDy reads _atom_site.auth_seq_id and assumes it can always be cast to int.
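The same error can be reproduced without ProDy or the input file. This is only a minimal sketch of the failure mode and not ProDy's actual code: casting the mmCIF placeholder '.' to int raises the same ValueError.

```python
# Minimal illustration of the failure mode (not ProDy's actual code):
# '.' is the mmCIF placeholder for an inapplicable value, and int() rejects it.
try:
    int(".")
except ValueError as exc:
    message = str(exc)

print(message)
```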

In the mmCIF file:

  • polymer atoms have integer auth_seq_id
  • non-polymer ligand atoms (LIG0, chain B) have _atom_site.auth_seq_id = '.'
  • the file also contains _pdbx_nonpoly_scheme.auth_seq_num = 1 for that ligand

According to the wwPDB mmCIF dictionary, _atom_site.auth_seq_id is optional and “not necessarily a number,” so this seems to be a parser bug rather than a file-format violation.

Expected behavior:

  • ProDy should tolerate missing/non-numeric auth_seq_id values, especially for non-polymers, instead of unconditionally casting them to int.
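One way the tolerant behavior could look is sketched below. This is an illustration rather than ProDy's implementation: the helper name parse_seq_id and the fallback value 0 are assumptions of mine. A real fix might instead fall back to label_seq_id or _pdbx_nonpoly_scheme.auth_seq_num.

```python
def parse_seq_id(raw, fallback=0):
    """Return raw as an int, or fallback if it is missing or non-numeric.

    Hypothetical helper, not part of ProDy. In mmCIF, '.' marks an
    inapplicable value and '?' marks an unknown one.
    """
    text = raw.strip()
    if text in {".", "?"}:
        return fallback
    try:
        return int(text)
    except ValueError:
        # Non-numeric identifiers are allowed by the dictionary.
        return fallback


# Made-up column values, for illustration only.
auth_seq_ids = ["1", "2", ".", "?", "10A"]
resnums = [parse_seq_id(value) for value in auth_seq_ids]
print(resnums)
```

Using one fallback for every missing value loses information, so a real fix would probably need to keep ligand residues distinguishable from one another.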
