Commit d036d2a

BUG: large DCD file seeking on Win (#5086)
* BUG: large DCD file seeking on Win

* Fixes gh-4879

* 32-bit signed integer arithmetic was being used in the DCD trajectory seeking Cython logic on Windows, causing an overflow for trajectories with a large number of frames on that platform. The behavior has existed since at least MDAnalysis `2.5.0`, and a regression test and patch have been added here. The regression test is based on the original scripts Oli provided in the matching ticket, though I note in a comment that we likely don't actually need a large file, and could probably simplify the regression test by artificially setting the frame numbers to start from a large value. Nonetheless, this test does fail before and pass after the patch locally on Windows.

* DOC: PR 5086 revisions

* Update `CHANGELOG` based on reviewer request.

[ci skip] [skip ci]
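
To make the overflow described in the commit message concrete, here is a minimal Python sketch of the seek-offset formula evaluated twice: once with unlimited integer width and once wrapped to a signed 32-bit value via `ctypes.c_int32`. The function names and the byte sizes (`HEADER`, `FIRST`, `FRAME`) are hypothetical and chosen only so that the offset crosses the 32-bit limit. This emulates the arithmetic; it is not the library's code.

```python
import ctypes

INT32_MAX = 2**31 - 1


def offset_wide(header_size, first_frame_size, frame_size, frame):
    """Byte offset of `frame`, computed without any width limit."""
    if frame == 0:
        return header_size
    return header_size + first_frame_size + frame_size * (frame - 1)


def offset_int32(header_size, first_frame_size, frame_size, frame):
    """Same formula, with the result wrapped to a signed 32-bit integer."""
    exact = offset_wide(header_size, first_frame_size, frame_size, frame)
    # ctypes integer types do no overflow checking, so the value wraps.
    return ctypes.c_int32(exact).value


# Hypothetical sizes in bytes, chosen only to cross the 32-bit limit.
HEADER, FIRST, FRAME = 276, 40_000, 40_000

early = offset_int32(HEADER, FIRST, FRAME, 10)
late_exact = offset_wide(HEADER, FIRST, FRAME, 60_000)
late_wrapped = offset_int32(HEADER, FIRST, FRAME, 60_000)

print(early == offset_wide(HEADER, FIRST, FRAME, 10))  # small offsets agree
print(late_exact > INT32_MAX)   # exact offset no longer fits in int32
print(late_wrapped < 0)         # wrapped offset has become negative
```

A negative or otherwise wrong offset passed to a seek call is the kind of failure the patch is meant to prevent by doing the arithmetic at 64-bit width.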
1 parent d412c9a commit d036d2a

3 files changed, +23 -3 lines changed

package/CHANGELOG

Lines changed: 3 additions & 1 deletion
@@ -16,12 +16,14 @@ The rules for this file:
 -------------------------------------------------------------------------------
 ??/??/?? IAlibay, orbeckst, BHM-Bob, TRY-ER, Abdulrahman-PROG, pbuslaev,
 yuxuanzhuang, yuyuan871111, tanishy7777, tulga-rdn, Gareth-elliott,
-hmacdope
+hmacdope, tylerjereddy
 
 
 * 2.10.0
 
 Fixes
+* Fixed an integer overflow in large DCD file seeks on Windows
+(Issue #4879, PR #5086)
 * Fix compile failure due to numpy issues in transformations.c (Issue #5061, PR #5068)
 * Fix incorrect `self.atom` assignment in SingleFrameReaderBase (Issue #5052, PR #5055)
 * Fixes bug in `analysis/hydrogenbonds.py`: `_donors` and `_hydrogens`

package/MDAnalysis/lib/formats/libdcd.pyx

Lines changed: 2 additions & 2 deletions
@@ -387,13 +387,13 @@ cdef class DCDFile:
             raise EOFError('Trying to seek over max number of frames')
         self.reached_eof = False
 
-        cdef fio_size_t offset
+        cdef long long offset
         if frame == 0:
             offset = self._header_size
         else:
             offset = self._header_size
             offset += self._firstframesize
-            offset += self._framesize * (frame - 1)
+            offset += self._framesize * (<long long>frame - 1)
 
         cdef int ok = fio_fseek(self.fp, offset, _whence_vals['FIO_SEEK_SET'])
         if ok != 0:
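
The patch declares `offset` as `long long`, which the C standard guarantees to be at least 64 bits wide, and casts `frame` before the multiplication so that the product is computed at that width. How `fio_size_t` is defined is not shown in this diff. As a general illustration of why a platform-dependent integer type can behave differently on Windows, the sketch below prints the widths of a few C integer types on the machine it runs on. C `long`, for example, is 32 bits on 64-bit Windows but 64 bits on typical 64-bit Linux and macOS builds. The `widths` dictionary and `max_signed` helper are illustrative names only.

```python
import ctypes

# Width in bytes of some C integer types on the current platform.
widths = {
    "int": ctypes.sizeof(ctypes.c_int),
    "long": ctypes.sizeof(ctypes.c_long),
    "long long": ctypes.sizeof(ctypes.c_longlong),
}


def max_signed(nbytes):
    """Largest value a two's-complement signed integer of `nbytes` can hold."""
    return 2 ** (8 * nbytes - 1) - 1


for name, nbytes in widths.items():
    print(f"{name:>9}: {nbytes * 8} bits, max {max_signed(nbytes)}")
```

Running this on Windows and on Linux shows the same width for `long long` but a different width for `long`, which is why a fixed 64-bit type is the safer choice for file offsets.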

testsuite/MDAnalysisTests/formats/test_libdcd.py

Lines changed: 18 additions & 0 deletions
@@ -34,13 +34,15 @@
     assert_almost_equal,
 )
 
+import MDAnalysis as mda
 from MDAnalysis.lib.formats.libdcd import (
     DCDFile,
     DCD_IS_CHARMM,
     DCD_HAS_EXTRA_BLOCK,
 )
 
 from MDAnalysisTests.datafiles import (
+    PSF,
     DCD,
     DCD_NAMD_TRICLINIC,
     legacy_DCD_ADK_coords,
@@ -671,3 +673,19 @@ def test_write_random_unitcell(tmpdir):
     with DCDFile(testname) as test:
         for index, frame in enumerate(test):
             assert_array_almost_equal(frame.unitcell, random_unitcells[index])
+
+
+@pytest.mark.skipif(
+    not os.environ.get("LARGEDCD", False), reason="Skipping large file test"
+)
+def test_gh_4879(tmpdir):
+    # NOTE: we really only need a trajectory with a frame
+    # count that is large enough to overflow a 32-bit signed
+    # integer in the DCD frame seeking arithmetic to reproduce
+    # the original issue
+    u = mda.Universe(PSF, 800 * [DCD])
+    with tmpdir.as_cwd():
+        f = "large.dcd"
+        u.atoms.write(f, frames="all")
+        u = mda.Universe(f)
+        u.trajectory[-2]
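
The NOTE in the test says that only the frame count matters, as long as it pushes the seek arithmetic past the 32-bit signed limit. As a rough sketch of how one might estimate that threshold, the helper below returns the smallest frame index whose byte offset exceeds `INT32_MAX`, using the same offset formula as the patched code. The function names and the example sizes are hypothetical; real header and frame sizes depend on the trajectory.

```python
INT32_MAX = 2**31 - 1


def seek_offset(header_size, first_frame_size, frame_size, frame):
    """Byte offset of `frame`, mirroring the formula in the diff."""
    if frame == 0:
        return header_size
    return header_size + first_frame_size + frame_size * (frame - 1)


def first_overflowing_frame(header_size, first_frame_size, frame_size):
    """Smallest frame index >= 1 whose offset no longer fits in int32."""
    remaining = INT32_MAX - header_size - first_frame_size
    if remaining < 0:
        # Already past the limit at the second frame.
        return 1
    # Need frame_size * (frame - 1) > remaining.
    return remaining // frame_size + 2


# Hypothetical sizes in bytes, for illustration only.
HEADER, FIRST, FRAME = 276, 40_000, 40_000
threshold = first_overflowing_frame(HEADER, FIRST, FRAME)
print(threshold)
print(seek_offset(HEADER, FIRST, FRAME, threshold) > INT32_MAX)
```

Any trajectory with more frames than this threshold should exercise the overflow path when seeking near the end, which is what `u.trajectory[-2]` does in the test.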
