Description
Hello,
npTDMS version is 1.9.0
For the sake of scalability/stability, I am doing NI-DAQ acquisitions at a 100 kHz sampling rate over 8 analog input channels for 1 hour. The nidaqmx-python framework allows configuring TDMS logging, and it writes the full acquisition for the 8 channels into a single segment.
So at the end, the segment contains around 100K * 3600 * 8 = 2.88e9 samples. This is beyond the maximum int32 value (2,147,483,647), and npTDMS won't read the file back.
I found the issue in _calculate_chunks(): data_size is a NumPy int32, total_data_size is ~ 100K * 3600 * 8, and there are computations like total_data_size % data_size and total_data_size // data_size. This is where the overflow error occurs.
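As a minimal sketch of the failure mode, independent of any TDMS file: the exact behaviour depends on the NumPy version (under NumPy 2.x / NEP 50 promotion rules, a Python int that does not fit the dtype of a NumPy scalar operand raises OverflowError; older versions may promote or raise a different error), and the data_size value below is illustrative, not taken from a real file:

```python
import numpy as np

# data_size as it arrives for DAQmx segments: a NumPy int32 scalar
data_size = np.int32(16)  # illustrative per-sample raw byte width

# 100 kHz * 3600 s * 8 channels * 2 bytes/sample: well beyond 2**31 - 1
total_data_size = 100_000 * 3600 * 8 * 2

try:
    n_chunks = total_data_size // data_size
    remainder = total_data_size % data_size
    print("no overflow:", n_chunks, remainder)
except OverflowError as exc:
    # On NumPy >= 2.0 this raises, because the large Python int cannot
    # be cast to the int32 dtype of data_size (NEP 50 promotion rules)
    print("overflow:", exc)

# Coercing to a plain Python int sidesteps the issue entirely, since
# Python ints use arbitrary-precision arithmetic
print(total_data_size // int(data_size))
```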
- a first fix was to use data_size.astype(int) instead of data_size in these computations (but when the file does not contain DaqmxData, this fails: data_size is then not a NumPy int32, so astype does not exist)
- another idea addresses the root cause: data_size comes from the raw_data_widths slot of the DaqMxMetadata class, which is defined as "np.zeros(raw_data_widths_length, dtype=np.int32)". Changing the dtype to int64 also fixes my issue
But I don't know the overall impact of either of these two changes.
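The two candidate fixes above can be sketched as follows; this is my own illustration, not npTDMS code. The helper name safe_chunk_division and the raw_data_widths_length value are hypothetical, while raw_data_widths and DaqMxMetadata are the real identifiers mentioned above:

```python
import numpy as np

# Option 1 (use-site fix): coerce with int(), which accepts both NumPy
# integer scalars (DAQmx files) and plain Python ints (non-DAQmx files),
# unlike .astype(int), which only exists on NumPy types.
def safe_chunk_division(total_data_size, data_size):
    data_size = int(data_size)
    return total_data_size // data_size, total_data_size % data_size

# Option 2 (root-cause fix): widen the metadata array so every scalar
# read out of it is already int64, and the later arithmetic cannot
# overflow for any segment smaller than 2**63 bytes.
raw_data_widths_length = 8  # illustrative length, one entry per channel
raw_data_widths = np.zeros(raw_data_widths_length, dtype=np.int64)
```

Option 1 is the smaller patch but must be repeated at every division site; option 2 changes the stored dtype once, at the cost of touching the DaqMxMetadata definition.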
Addendum: the issue occurs both when calling TdmsFile.open(file) and when using the tdmsinfo tool. I have seen people working with datasets of tens of GBs, so I assume they had big files but not big segments.
Reading by chunk should therefore not help, as it would already fail at open.