Commit 6874ffa
[Variant] Avoid extra allocation in object builder (#7935)
# Which issue does this PR close?
- Closes #7899.
This PR avoids the extra allocation in the object builder and the
subsequent buffer copy.
# Rationale for this change
Avoids the extra allocation in the object builder, as described in the issue.
# What changes are included in this PR?
This removes the internal `buffer` in `ObjectBuilder`. All data
insertion is done directly to the parent buffer wrapped in
`parent_state`.
The following new fields are added to `ObjectBuilder`:
- `object_start_offset`: the start offset of the current object in the
parent buffer.
- `has_been_finished`: whether the current object has been finished; it
is used in the `Drop` implementation.
This patch updates the logic of `new`, `finish`, `parent_state`, and
`drop` accordingly.
In particular, it writes data into the parent buffer directly when
adding a field to the object (i.e., `insert`/`try_insert` is called).
When the object is finalized (`finish` is called), the header and field
ids must be placed in front of the data in the buffer, so the builder
shifts the already written data bytes to make room for them and then
writes the header and field ids.
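
The shift-then-write step can be sketched as follows. This is a minimal illustration on a plain `Vec<u8>`, not the code from this patch: `shift_and_write_header` is a hypothetical helper, and the header here is an opaque byte slice rather than the real Variant object header and field ids.

```rust
/// Hypothetical helper (not part of `parquet-variant`): make room for
/// `header` at offset `start` in `buf` by shifting the bytes already
/// written after `start`, then copy the header into the gap.
fn shift_and_write_header(buf: &mut Vec<u8>, start: usize, header: &[u8]) {
    // Number of value bytes written since the object started.
    let data_len = buf.len() - start;

    // Grow the buffer by the size of the header.
    buf.resize(buf.len() + header.len(), 0);

    // Move the value bytes to the right; `copy_within` handles the
    // overlapping source and destination ranges.
    buf.copy_within(start..start + data_len, start + header.len());

    // Fill the gap with the header bytes.
    buf[start..start + header.len()].copy_from_slice(header);
}
```

The data is moved once inside the parent buffer, instead of being written to a separate buffer and copied over afterwards.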
In `drop`, if the builder was not finalized before being dropped, it
truncates the written bytes to roll back the parent buffer to its previous state.
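
The rollback behaviour can be sketched with a simplified stand-in. `SketchBuilder` is hypothetical and writes straight into a `&mut Vec<u8>`; the real `ObjectBuilder` goes through `parent_state` and is more involved. Only the field names `object_start_offset` and `has_been_finished` are taken from the description above.

```rust
/// Simplified, hypothetical stand-in for the builder described above.
struct SketchBuilder<'a> {
    parent: &'a mut Vec<u8>,
    /// Offset in the parent buffer where this object starts.
    object_start_offset: usize,
    /// Set by `finish`; checked in `drop`.
    has_been_finished: bool,
}

impl<'a> SketchBuilder<'a> {
    fn new(parent: &'a mut Vec<u8>) -> Self {
        let object_start_offset = parent.len();
        Self { parent, object_start_offset, has_been_finished: false }
    }

    /// Write bytes directly into the parent buffer.
    fn insert(&mut self, bytes: &[u8]) {
        self.parent.extend_from_slice(bytes);
    }

    /// Mark the object as complete so `drop` keeps the written bytes.
    fn finish(mut self) {
        self.has_been_finished = true;
    }
}

impl Drop for SketchBuilder<'_> {
    fn drop(&mut self) {
        // An unfinished builder rolls the parent buffer back to where
        // the object started.
        if !self.has_been_finished {
            self.parent.truncate(self.object_start_offset);
        }
    }
}
```

Dropping the builder without calling `finish` leaves the parent buffer exactly as it was before the builder was created, while calling `finish` keeps the inserted bytes.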
# Are these changes tested?
The logic is covered by the existing tests.
# Are there any user-facing changes?
No
---------
Co-authored-by: Andrew Lamb <[email protected]>

1 parent d4f1cfa, commit 6874ffa
1 file changed (under parquet-variant/src): +411, -56 lines