[io] Properly abort when buffer size overflows max integer or size > maxBufferSize #19606


Open · wants to merge 4 commits into master

Conversation

ferdymercury (Collaborator)

This pull request fixes #14770 and is a first step towards #6734.

Checklist:

  • tested changes locally
  • updated the docs (if necessary)

ferdymercury changed the title from "[io] Properly abort when buffer size overflows max integer" to "[io] Properly abort when buffer size overflows max integer or size > maxBufferSize" on Aug 11, 2025
ferdymercury added the clean build (Ask CI to do non-incremental build on PR) label on Aug 11, 2025
ferdymercury removed the clean build (Ask CI to do non-incremental build on PR) label on Aug 11, 2025
pcanal (Member) commented Aug 11, 2025

Can you make a separate PR with only the variable-name change (bufsize), to reduce the noise in this more challenging PR?

github-actions bot commented Aug 12, 2025

Test Results

20 files · 20 suites · 3d 6h 24m 9s ⏱️
3 376 tests: 3 374 ✅  0 💤  2 ❌
65 866 runs: 65 860 ✅  0 💤  6 ❌

For more details on these failures, see this check.

Results for commit eb0dc82.

♻️ This comment has been updated with latest results.

jblomer (Contributor) left a comment

Cool, many thanks!

I think generally we want to use std::size_t for buffer sizes instead of Long64_t.

We should probably also update TBuffer::[Read|Write]Buf, TBuffer::ReadString, TBuffer::MapObject, TBuffer::[Check|Set]ByteCount, TBuffer::SetBufferDisplacement.

Maybe also TBuffer::[Read|Write]Clones.

Sometimes we are checking for kMaxBufferSize, sometimes for kMaxInt. Shouldn't we always check for kMaxBufferSize?

Regarding commit messages, I'd suggest

[NFC] remove unused headers

and

[io] accept and check long buffer size params

with an explanation of why we (at this point) allow for long buffer sizes but then abort when they are actually used.

@pcanal: do we have an indication that the optimization of initializing the buffer size to the average buffer size seen so far in the file is actually useful? There are certainly write patterns where it hurts rather than helps. Removing this optimization would get us a fair amount of simplification in a number of read/write APIs.
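
Picking up the kMaxBufferSize point above, a minimal sketch of what a single, consistently used bounds check might look like. The helper name `CheckBufferSize` and the constant's value are assumptions for illustration, not ROOT's actual definitions:

```cpp
// Hypothetical helper: every size-taking entry point funnels through one
// check against kMaxBufferSize (value illustrative only).
#include <cstdio>
#include <cstdlib>

using Long64_t = long long; // stand-in for ROOT's typedef

constexpr Long64_t kMaxBufferSize = 0x7FFFFFFE; // illustrative limit

inline void CheckBufferSize(Long64_t bufsize, const char *where)
{
   if (bufsize < 0 || bufsize > kMaxBufferSize) {
      std::fprintf(stderr, "Fatal in <%s>: buffer size %lld exceeds kMaxBufferSize (%lld)\n",
                   where, bufsize, kMaxBufferSize);
      std::abort(); // abort properly instead of silently truncating to Int_t
   }
}
```

Routing every check through one helper would also prevent the kMaxInt and kMaxBufferSize checks from drifting apart.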

Comment on lines +407 to +408
+   static void SetFileReadCalls(Long64_t readcalls = 0);
+   static void SetReadaheadSize(Long64_t bytes = 256000);
Contributor

Maybe put those in another commit/PR.

ferdymercury (Collaborator, Author)

> I think generally we want to use std::size_t for buffer sizes instead of Long64_t.

Even if that changes the data type from signed to unsigned?

Also, wouldn't it be better to have ULong64_t instead of std::size_t?

Since size_t is unsigned int on 32-bit targets and unsigned long long or unsigned long on 64-bit targets, depending on the C++ implementation and the compilation target, whereas ULong64_t is always unsigned long long.

jblomer (Contributor) commented Aug 14, 2025

> Even if that changes the data type from signed to unsigned?
>
> Also, wouldn't it be better to have ULong64_t instead of std::size_t?
>
> Since size_t is unsigned int on 32-bit targets and unsigned long long or unsigned long on 64-bit targets, depending on the C++ implementation and the compilation target, whereas ULong64_t is always unsigned long long.

In my opinion, that's the point: std::size_t is the integer type that describes the length of something in memory, and it is platform-dependent. On a 32-bit platform, for example, there will never be a buffer > 4 GB.
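
A tiny demonstration of that platform dependence (an illustration only, not part of the PR):

```cpp
// std::size_t spans exactly the range of in-memory object sizes the
// target can address; ULong64_t is 64 bits everywhere.
#include <cstddef>
#include <cstdio>
#include <limits>

int main()
{
   std::printf("size_t: %zu bits, max = %zu\n",
               sizeof(std::size_t) * 8, std::numeric_limits<std::size_t>::max());
   // 32-bit target: 32 bits, max just under 4 GB. 64-bit target: 64 bits.
}
```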

ferdymercury (Collaborator, Author) commented Aug 14, 2025

> In my opinion, that's the point: std::size_t is the integer type that describes the length of something in memory, and it is platform-dependent. On a 32-bit platform, for example, there will never be a buffer > 4 GB.

Sounds reasonable. What if a TTree in the future contains a big entry over 4 GB? Does that mean it won't be readable on 32-bit platforms? How do we error out then, or is it silently cropped? Or is it going to be emulated as several buffers one after the other?
Using a data type larger than the actual type (ULong64_t vs size_t) leaves room for detecting those kinds of situations.
But maybe these are all corner cases and it's not worth it?

jblomer (Contributor) commented Aug 14, 2025

>> In my opinion, that's the point: std::size_t is the integer type that describes the length of something in memory, and it is platform-dependent. On a 32-bit platform, for example, there will never be a buffer > 4 GB.
>
> Sounds reasonable. What if a TTree in the future contains a big entry over 4 GB? Does that mean it won't be readable on 32-bit platforms? How do we error out then, or is it silently cropped? Or is it going to be emulated as several buffers one after the other? Using a data type larger than the actual type (ULong64_t vs size_t) leaves room for detecting those kinds of situations. But maybe these are all corner cases and it's not worth it?

I think generally we have to distinguish between the in-memory buffer and what's serialized to disk. If a big atomic object (e.g., a histogram) is serialized to disk, it can't be read back on 32bit platforms. I think that's fine and unavoidable. The 32bit machine is simply not capable enough. On disk, of course, we will need to represent the size of objects in a platform-independent way. I think that the deserialization of the object length will be the proper point to throw errors.

Regarding the concrete on-disk representation, the plan is to chunk large objects in multiple keys to keep the changes to the TFile on-disk format minimal.
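
A hedged sketch of such a deserialization-time check (the function name and the 64-bit on-disk length are assumptions for illustration, not ROOT's actual TFile format or API):

```cpp
// Hypothetical: reject on read any object length that this platform
// cannot represent in memory, rather than silently cropping it.
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>

std::size_t CheckedObjectLength(std::uint64_t onDiskLength)
{
   // On a 32-bit build size_t tops out just below 4 GB, so a larger
   // object must fail loudly here instead of being truncated.
   if (onDiskLength > std::numeric_limits<std::size_t>::max())
      throw std::length_error("serialized object too large for this platform");
   return static_cast<std::size_t>(onDiskLength);
}
```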

pcanal (Member) commented Aug 14, 2025

> do we have an indication that the optimization of initializing the buffer size to the average buffer size seen so far in the file is actually useful?

This is hard to measure for sure, as it is of course very dependent on the actual workload. When this was introduced, it was in direct reaction to issues with not only memory fragmentation (growth of the process's virtual size due to the inability to reuse memory blocks that are just a tad too small) but also thread scaling (by reducing the number of memory allocations, which in most cases require the system to take a global lock).

pcanal (Member) commented Aug 14, 2025

> On a 32-bit platform, for example, there will never be a buffer > 4 GB.

On the other hand, we also need to make sure we properly error out when there is a request for one ...

pcanal (Member) commented Aug 14, 2025

> What if a TTree in the future contains a big entry over 4 GB? ... Or is it going to be emulated as several buffers one after the other?

That is the current plan.

@@ -170,8 +170,8 @@ class TObject {
    virtual void SetDrawOption(Option_t *option=""); // *MENU*
    virtual void SetUniqueID(UInt_t uid);
    virtual void UseCurrentStyle();
-   virtual Int_t Write(const char *name = nullptr, Int_t option = 0, Int_t bufsize = 0);
-   virtual Int_t Write(const char *name = nullptr, Int_t option = 0, Int_t bufsize = 0) const;
+   virtual Int_t Write(const char *name = nullptr, Int_t option = 0, Long64_t bufsize = 0);
Member

That might be necessary, but it is a serious problem. This function is overridden a lot, both in our code and very possibly in user code. Unless those users have upgraded their code to use the override keyword (which is unlikely in my opinion), their code will compile correctly but do the wrong thing (revert to the default behavior rather than their customization ...).
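
A minimal illustration of that hazard (class names are hypothetical; the signatures mirror the diff above):

```cpp
using Int_t = int;
using Long64_t = long long;

struct TObjectLike { // stand-in for TObject after the signature change
   virtual Int_t Write(const char *name = nullptr, Int_t option = 0, Long64_t bufsize = 0)
   { return 0; } // default behavior
   virtual ~TObjectLike() = default;
};

struct MyObject : TObjectLike { // user code written against the OLD signature
   // No `override` keyword: this no longer overrides the base function,
   // it merely hides it -- and it still compiles without error.
   virtual Int_t Write(const char *name = nullptr, Int_t option = 0, Int_t bufsize = 0)
   { return 1; } // customization, silently unreachable via a base pointer
};
```

A call through a TObjectLike* now resolves to the base default rather than the user's customization, with no diagnostic unless a warning such as -Woverloaded-virtual is enabled.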

ferdymercury (Collaborator, Author) commented Aug 14, 2025

What if we implement a dummy

Int_t Write(const char *name, Int_t option, Int_t bufsize) const final;

to trigger a compilation error, or at least a warning, and avoid that silent wrong behavior?
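
A sketch of how the dummy overload could catch that at compile time, continuing the hypothetical names from the example above (the const qualifier from the suggestion is dropped here for brevity):

```cpp
using Int_t = int;
using Long64_t = long long;

struct TObjectLike { // stand-in for TObject
   virtual Int_t Write(const char *name = nullptr, Int_t option = 0, Long64_t bufsize = 0)
   { return 0; }
   // Dummy overload with the old Int_t signature, declared final: it
   // forwards to the new overload, and any derived class still declaring
   // the old-style virtual now tries to override a final function.
   virtual Int_t Write(const char *name, Int_t option, Int_t bufsize) final
   { return Write(name, option, static_cast<Long64_t>(bufsize)); }
   virtual ~TObjectLike() = default;
};

#if 0 // enabling this fails to compile, which is exactly the point:
struct OldStyleUser : TObjectLike {
   // error: virtual function 'Write' overrides a 'final' function
   virtual Int_t Write(const char *name = nullptr, Int_t option = 0, Int_t bufsize = 0)
   { return 1; }
};
#endif
```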

Member

That would indeed provoke a compilation error ....

Successfully merging this pull request may close these issues.

TBuffer* classes should abort in case the 1GB limit is being hit