fix: [Geneva uploader] Optimize upload path by using Bytes to eliminate unnecessary clones (specifically for retry scenario) #470
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Changes
Replaces
Vec<u8>withbytes::Bytesfor compressed telemetry data to eliminate expensive memory clones in the upload path, especially critical for retry scenarios.Problem
The
upload_batch()method was cloning the entire compressed payload (potentially MBs) on every upload attempt:Impact for a 1 MB batch with 3 retry attempts:
Solution
Changed EncodedBatch.data from Vec to bytes::Bytes:
How it works:
Bytes::from(Vec<u8>)does NOT copy - it takes ownership of theVec's allocation and wraps it in a reference-counted container (zero-copy conversion)Bytes::clone()just increments an atomic refcount - no data is copiedWhy Bytes instead of Arc?
While
Arc<Vec<u8>>would also provide cheap cloning,Bytesis the better choice for our HTTP upload use case:reqwestintegration:reqwest::Bodyhasimpl From<Bytes>that wraps the buffer directly without copying (used at uploader.rs:242). WithArc, we'd need topass .as_ref()(giving &[u8]), which causesreqwestto allocate internally when converting to Body.Bytesis the standard type for byte buffers in Rust's async ecosystem (tokio,hyper,reqwest), ensuring optimal integration.Bytes::slice()provides zero-copy views into the buffer, which could be useful for - Optimizing Bond encoding pipeline (bond_encoder.rs,central_blob.rs) if we identify unnecessary intermediate copies during schema/event serialization.Merge requirement checklist
CHANGELOG.mdfiles updated for non-trivial, user-facing changes