@scovich scovich commented Sep 4, 2025

Merging apache#8166 with upstream main was a bit hairy because of strong logical conflicts with apache#8179.

Hopefully this helps unblock the PR.

sdf-jkl and others added 30 commits August 22, 2025 17:18
…iant kernel (apache#8201)

# Which issue does this PR close?

- Closes apache#8060.

# Rationale for this change

The `cast_to_variant` kernel needs to support the `List` and `LargeList`
types.

# What changes are included in this PR?

Added support for `List`, `LargeList` in `cast_to_variant` kernel

# Are these changes tested?

Yes, added unit tests

# Are there any user-facing changes?

Yes, added changes to the `cast_to_variant` kernel

---------

Co-authored-by: Konstantin.Tarasov <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
# Which issue does this PR close?

- Part of apache#4886

# Rationale for this change

This PR introduces benchmark tests for the `AvroWriter` in the
`arrow-avro` crate. Adding these benchmarks is essential for tracking
the performance of the writer, identifying potential regressions, and
guiding future optimizations.

# What changes are included in this PR?

A new benchmark file, `benches/avro_writer.rs`, is added to the project.
This file contains a suite of benchmarks that measure the performance of
writing `RecordBatch`es to the Avro format.

The benchmarks cover a variety of Arrow data types:
- `Boolean`
- `Int32` and `Int64`
- `Float32` and `Float64`
- `Binary`
- `Timestamp` (Microsecond precision)
- A schema with a mix of the above types

These benchmarks are run with varying numbers of rows (100, 10,000, and
1,000,000) to assess performance across different data scales.

# Are these changes tested?

Yes, this pull request consists entirely of new benchmark tests.
Therefore, no separate tests are needed.

# Are there any user-facing changes?

NA
This method was removed in apache#7824, which introduced an optimized code
path for writing bloom filters on little-endian architectures. The
method was however still used in the big-endian code-path. Due to the
use of `#[cfg(target_endian)]` this went unnoticed in CI.
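
A minimal sketch of why this kind of breakage can slip through, assuming a hypothetical helper `words_to_le_bytes` (the actual bloom filter code is not shown here). Only the item whose `cfg` predicate matches the build target is name-resolved and type-checked, so a call to a removed method inside the big-endian variant would not be reported on a little-endian CI host:

```rust
/// Hypothetical helper (not the parquet implementation): serialize words as
/// little-endian bytes. This variant is compiled only on little-endian targets.
#[cfg(target_endian = "little")]
fn words_to_le_bytes(words: &[u32]) -> Vec<u8> {
    // The in-memory byte order already matches the output byte order.
    words.iter().flat_map(|w| w.to_ne_bytes()).collect()
}

/// Big-endian variant. On a little-endian host this item is removed before
/// type checking, so an error in its body (for example a call to a method
/// that no longer exists) would go unreported there.
#[cfg(target_endian = "big")]
fn words_to_le_bytes(words: &[u32]) -> Vec<u8> {
    // Swap each word explicitly into little-endian byte order.
    words.iter().flat_map(|w| w.to_le_bytes()).collect()
}
```

Building for a big-endian target (or running `cargo check` with such a `--target`) is one way to exercise the second variant.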

Fixes apache#8207
…che#8177)

# Which issue does this PR close?

- Closes apache#8063

# Rationale for this change
Maps are now cast to `Variant::Object`s

# What changes are included in this PR?

# Are these changes tested?
Yes

# Are there any user-facing changes?

---------

Co-authored-by: Andrew Lamb <[email protected]>
…he#8105)

# Which issue does this PR close?

- Closes apache#8091.

# Rationale for this change

Implement `VariantArray::value` for some more shredded variants (e.g.
primitive_conversion/generic_conversion/non_generic_conversion).

# What changes are included in this PR?

- Extract all `macro_rules!` macros to a separate module, `type_conversion.rs`
- Add a macro for the variant value conversion

# Are these changes tested?

Covered by the existing test


# Are there any user-facing changes?

No
…kernel (apache#8196)

# Which issue does this PR close?

- Closes apache#8195.

# Rationale for this change

# What changes are included in this PR?

Implement `DataType::Union` for `cast_to_variant`

# Are these changes tested?

Yes

# Are there any user-facing changes?

New cast type supported

---------

Co-authored-by: Andrew Lamb <[email protected]>
…pache#8206)

# Which issue does this PR close?

- Closes apache#8205

# Rationale for this change

`VariantArrayBuilder` had a very complex choreography with the
`VariantBuilder` API, that required lots of manual drop glue to deal
with ownership transfers between it and the `VariantArrayVariantBuilder`
it delegates the actual work to. Rework the whole thing to use a
(now-reusable) `MetadataBuilder` and `ValueBuilder`, with rollbacks
largely handled by `ParentState` -- just like the other builders in the
parquet-variant crate.

# What changes are included in this PR?

Five changes (curated as five commits that reviewers may want to examine
individually):
1. Make a bunch of parquet-variant builder infrastructure public, so
that `VariantArrayBuilder` can access it from the
parquet-variant-compute crate.
2. Make `MetadataBuilder` reusable. Its `finish` method appends the
bytes of a new serialized metadata dictionary to the underlying buffer
and resets the remaining builder state. The builder is thus ready to
create a brand new metadata dictionary whose serialized bytes will also
be appended to the underlying buffer once finished.
3. Rework `VariantArrayBuilder` to use `MetadataBuilder` and
`ValueBuilder`, coordinated via `ParentState`. This is the main feature
of the PR and also the most complicated/subtle.
4. Delete now-unused code that had been added previously in order to
support the old implementation of `VariantArrayBuilder`.
5. Add missing doc comments for now-public types and methods
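
The reusable `finish` behavior described in item 2 can be sketched roughly as follows. `ReusableDictBuilder`, its `upsert` method, and the toy byte encoding are all assumptions made for illustration; they are not the parquet-variant `MetadataBuilder` API or the variant metadata encoding:

```rust
/// Hypothetical stand-in for a reusable dictionary builder.
struct ReusableDictBuilder {
    /// Field names of the dictionary currently being built.
    names: Vec<String>,
    /// Shared buffer that accumulates every finished dictionary.
    buffer: Vec<u8>,
}

impl ReusableDictBuilder {
    fn new() -> Self {
        Self { names: Vec::new(), buffer: Vec::new() }
    }

    /// Returns the id of `name`, adding it to the dictionary if needed.
    fn upsert(&mut self, name: &str) -> usize {
        if let Some(pos) = self.names.iter().position(|n| n == name) {
            return pos;
        }
        self.names.push(name.to_string());
        self.names.len() - 1
    }

    /// Appends the serialized dictionary to the shared buffer, resets the
    /// per-dictionary state, and returns the offset where the new bytes start.
    fn finish(&mut self) -> usize {
        let start = self.buffer.len();
        // Toy encoding (assumption): a count byte, then length-prefixed names.
        self.buffer.push(self.names.len() as u8);
        for name in self.names.drain(..) {
            self.buffer.push(name.len() as u8);
            self.buffer.extend_from_slice(name.as_bytes());
        }
        start
    }
}
```

After `finish` returns, the same builder can start a fresh dictionary whose ids begin again at zero, while the bytes of earlier dictionaries stay in the buffer.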


# Are these changes tested?

Existing variant array builder tests cover the change.

# Are there any user-facing changes?

A lot of builder-related types and methods from the parquet-variant
crate are now public.
…g fails (apache#8213)

# Which issue does this PR close?

- Closes apache#8212

# Rationale for this change

In the original code, the bitmap was modified before decoding. Even if
decoding fails, the null buffer was modified, leading to bitmap
corruption, eventually causing flush to fail.

# What changes are included in this PR?

This PR fixes the bug where the bitmap was modified before decoding. If
decoding fails, the bitmap is left unmodified and the decode method
exits gracefully without any side effects.
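
As a rough illustration of the ordering issue, here is a sketch with a hypothetical `IntColumn` decoder (an assumption for illustration, not the decoder touched by this PR). The fallible parse runs before any state is updated, so a failure leaves the values and the validity flags in step:

```rust
/// Hypothetical column decoder used only for illustration.
struct IntColumn {
    values: Vec<i64>,
    /// Stand-in for a null bitmap: one flag per row.
    validity: Vec<bool>,
}

impl IntColumn {
    fn new() -> Self {
        Self { values: Vec::new(), validity: Vec::new() }
    }

    /// Decodes one row from `input`. On failure, no state is modified.
    fn decode(&mut self, input: &str) -> Result<(), std::num::ParseIntError> {
        if input == "null" {
            self.values.push(0);
            self.validity.push(false);
            return Ok(());
        }
        // Parse first: `?` returns early before the bitmap or values change.
        let value: i64 = input.parse()?;
        self.values.push(value);
        self.validity.push(true);
        Ok(())
    }
}
```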

# Are these changes tested?

- Added a unit test

# Are there any user-facing changes?

No.
# Which issue does this PR close?

- Closes apache#8152

# Rationale for this change

When manipulating existing variant values (unshredding, removing fields,
etc), the metadata column is already defined and already contains all
necessary field ids. In fact, defining new/different field ids would
require rewriting the bytes of those already-encoded variant values. We
need a way to build variant values that rely on an existing metadata
dictionary.

# What changes are included in this PR?

* `MetadataBuilder` is now a trait, and most methods that work with
metadata builders now take `&mut dyn MetadataBuilder` instead of `&mut
MetadataBuilder`.
* The old `MetadataBuilder` struct is now `BasicMetadataBuilder` that
implements `MetadataBuilder`
* Define a `ReadOnlyMetadataBuilder` that wraps a `VariantMetadata` and
which also implements `MetadataBuilder`
* Update the `try_binary_search_range_by` helper method to be more
general, so we can define an efficient `VariantMetadata::get_entry` that
returns the field id for a given field name.
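
The trait split can be pictured with the simplified sketch below. The names `FieldIds`, `GrowableIds`, `ReadOnlyIds`, and `field_id` are hypothetical, and the read-only variant assumes the wrapped dictionary is sorted so that a binary search is valid; the real trait and its method signatures may differ:

```rust
/// Hypothetical, simplified trait: maps a field name to a field id.
trait FieldIds {
    fn try_upsert(&mut self, name: &str) -> Result<u32, String>;
}

/// Grows its dictionary on demand (in the spirit of a basic builder).
struct GrowableIds {
    names: Vec<String>,
}

impl FieldIds for GrowableIds {
    fn try_upsert(&mut self, name: &str) -> Result<u32, String> {
        if let Some(i) = self.names.iter().position(|n| n == name) {
            return Ok(i as u32);
        }
        self.names.push(name.to_string());
        Ok((self.names.len() - 1) as u32)
    }
}

/// Wraps an existing, sorted dictionary and never adds to it.
struct ReadOnlyIds<'a> {
    sorted_names: &'a [&'a str],
}

impl<'a> FieldIds for ReadOnlyIds<'a> {
    fn try_upsert(&mut self, name: &str) -> Result<u32, String> {
        // Binary search is only valid because the names are assumed sorted.
        self.sorted_names
            .binary_search(&name)
            .map(|i| i as u32)
            .map_err(|_| format!("field `{}` not in existing metadata", name))
    }
}

/// Callers take a trait object, so either implementation can be passed in.
fn field_id(builder: &mut dyn FieldIds, name: &str) -> Result<u32, String> {
    builder.try_upsert(name)
}
```

With this shape, code that rewrites existing variant values can pass the read-only implementation and get an error instead of a silently extended dictionary.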

# Are these changes tested?

Existing tests cover the basic metadata builder. New tests added to
cover the read-only metadata builder.

# Are there any user-facing changes?

The renamed `BasicMetadataBuilder` (breaking), the new `MetadataBuilder`
trait (breaking), and the new `ReadOnlyMetadataBuilder`.
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps
[actions/upload-pages-artifact](https://github.com/actions/upload-pages-artifact)
from 3 to 4.

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…he#8210)

# Which issue does this PR close?

Closes apache#8209

# Rationale for this change

In the `Field` struct definition:
```rust
/// A field within a [`Record`]
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct Field<'a> {
    /// Name of the field within the record
    #[serde(borrow)]
    pub name: &'a str,
    /// Optional documentation for this field
    #[serde(borrow, default)]
    pub doc: Option<&'a str>,
    /// The field's type definition
    #[serde(borrow)]
    pub r#type: Schema<'a>,
    /// Optional default value for this field
    #[serde(borrow, default)]
    pub default: Option<&'a str>,
}
```
`r#type` is of type `Schema`, whereas `default` is of type `Option<&str>`.
A default should be supported for all types (e.g. int, array, map, nested
record), so we should make it more lenient.

More details on reproduction are in the GitHub issue.

# What changes are included in this PR?

Relax the type of `default` on the Avro schema `Field`.

# Are these changes tested?

Added a unit test.

# Are there any user-facing changes?

This affects `pub struct Field` in the `arrow-avro` crate, but the impact
should be minimal because the `default` attribute is not currently used.
…_type` (apache#8216)

# Which issue does this PR close?

None.

# Rationale for this change

I noticed an error in the doc comment about error conditions of
`Field::try_canonical_extension_type`.

# What changes are included in this PR?

Fixed the doc comment.

# Are these changes tested?

No.

# Are there any user-facing changes?

No.
…t` kernel (apache#8215)

# Which issue does this PR close?

- Closes apache#8194.

# Rationale for this change

# What changes are included in this PR?

Implement `duration` the same as `interval`

# Are these changes tested?

Yes

# Are there any user-facing changes?
…pache#8141)

# Which issue does this PR close?

- Closes apache#8217

# Rationale for this change

When working with shredded variants, we need the ability to copy nested
object fields and array elements of one variant to a destination. This
is a cheap byte-wise copy that relies on the fact that the new variant
being built uses the same metadata dictionary as the source variant it
is derived from.

# What changes are included in this PR?

Define a helper macro that encapsulates the logic for variant appends,
now that we have three very similar methods (differing only in their
handling of list/object values and their return type).

Add new methods: `ValueBuilder::append_variant_bytes`, which is called
by new methods `VariantBuilder::append_value_bytes`,
`ListBuilder::append_value_bytes`, and
`ObjectBuilder::[try_]insert_bytes`.

# Are these changes tested?

New unit tests

# Are there any user-facing changes?

The new methods are public.

---------

Co-authored-by: Andrew Lamb <[email protected]>
…e#8214)

# Which issue does this PR close?

- Closes apache#8184

# Rationale for this change


# What changes are included in this PR?

# Are these changes tested?
Yes

# Are there any user-facing changes?
`Object::finish` doesn't return `Result` anymore

---------

Co-authored-by: Andrew Lamb <[email protected]>
Reverts:
- apache#8183 

Because the related issue was closed:
- apache#8181
# Which issue does this PR close?

- Closes apache#8243.

# What changes are included in this PR?

Pin `comfy-table` to a release prior to 7.2.0, which bumped its MSRV to
1.85. Includes a TODO to unpin after arrow bumps its MSRV to 1.85.

(context FWIW: caught in delta_kernel [MSRV
CI](https://github.com/delta-io/delta-kernel-rs/actions/runs/17310376492/job/49143119497))

# Are these changes tested?
Validated the MSRV with cargo-msrv:
```bash
# now passes
cargo msrv --path arrow-cast/ verify --rust-version 1.84 --all-features
```
# Which issue does this PR close?

- Closes apache#8228.


# What changes are included in this PR?

Add `Variant::as_f16`

# Are these changes tested?

Added doc tests

# Are there any user-facing changes?

Added doc for the function

---------

Co-authored-by: Matthijs Brobbel <[email protected]>
Updates the requirements on
[hashbrown](https://github.com/rust-lang/hashbrown) to permit the latest
version.


Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# Which issue does this PR close?

The documentation for `lexsort` says it is stable. However, it is an unstable sort.
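
For callers that need a deterministic order for equal keys, one common workaround with any unstable sort is to break ties on the original row index. The sketch below uses only the Rust standard library; `sorted_indices_with_tiebreak` is a hypothetical helper and does not call arrow's `lexsort`:

```rust
/// Returns row indices ordered by key. Ties are broken on the original
/// index, so the result is deterministic even though the underlying
/// sort is unstable.
fn sorted_indices_with_tiebreak(keys: &[i32]) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..keys.len()).collect();
    // `sort_unstable_by_key` gives no ordering guarantee for equal keys,
    // so the index is made part of the key to remove all ties.
    idx.sort_unstable_by_key(|&i| (keys[i], i));
    idx
}
```

The result matches what a stable sort on the key alone would produce, at the cost of comparing a slightly larger key.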

# Rationale for this change

Fix the documentation.

# What changes are included in this PR?

Fix the documentation.

# Are these changes tested?

No need

# Are there any user-facing changes?

Doc change

---------

Co-authored-by: Matthijs Brobbel <[email protected]>
# Which issue does this PR close?
\-

# Rationale for this change
Some services support gRPC compression. Expose this to the CLI client
for:

- testing
- more efficient data transfer over slow internet connections

# What changes are included in this PR?
CLI argument wiring.

# Are these changes tested?
No automated tests. I think we can assume that the libraries we use do
what they promise to do. But I also verified that this works by
inspecting the traffic using Wireshark.

# Are there any user-facing changes?
Users of the CLI client now have more options.
# Which issue does this PR close?

- Part of apache#4886
- Extends work initiated in apache#8006

# Rationale for this change

This introduces support for Confluent schema registry ID handling in the
arrow-avro crate, adding compatibility with Confluent's wire format.
These improvements enable streaming Apache Kafka, Redpanda, and Pulsar
messages with Avro schemas directly into arrow-rs.
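
For context, Confluent's wire format prefixes each Avro payload with a magic byte `0x00` followed by a 4-byte big-endian schema ID. A minimal sketch of splitting such a frame, where `split_confluent_frame` is a hypothetical helper and not an arrow-avro API (the ID is read as `u32` here for simplicity):

```rust
/// Splits a Confluent-framed message into (schema id, Avro payload).
/// Returns `None` if the message is too short or the magic byte is wrong.
fn split_confluent_frame(msg: &[u8]) -> Option<(u32, &[u8])> {
    // Byte 0 is the magic byte 0x00; bytes 1..5 hold the big-endian schema id.
    if msg.len() < 5 || msg[0] != 0 {
        return None;
    }
    let id = u32::from_be_bytes([msg[1], msg[2], msg[3], msg[4]]);
    Some((id, &msg[5..]))
}
```

The extracted ID is what a schema store would use to look up the writer schema before decoding the remaining bytes.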

# What changes are included in this PR?

- Adds Confluent support
- Adds initial support for SHA256 and MD5 algorithm types. Rabin remains
the default.

# Are these changes tested?

Yes, existing tests are all passing, and tests for ID handling have been
added. Benchmark results show no appreciable changes.

# Are there any user-facing changes?

- Confluent users need to provide the ID fingerprint when using the
`set` method, unlike the `register` method which generates it from the
schema on the fly. Existing API behavior has been maintained.

- SchemaStore TryFrom now accepts a `&HashMap<Fingerprint, AvroSchema>`,
rather than a `&[AvroSchema]`


Huge shout out to @jecsand838 for his collaboration on this!

---------

Co-authored-by: Connor Sanders <[email protected]>
Bumps [actions/setup-python](https://github.com/actions/setup-python)
from 5 to 6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/setup-python/releases">actions/setup-python's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h2>What's Changed</h2>
<h3>Breaking Changes</h3>
<ul>
<li>Upgrade to node 24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1164">actions/setup-python#1164</a></li>
</ul>
<p>Make sure your runner is on version v2.327.1 or later to ensure
compatibility with this release. <a
href="https://github.com/actions/runner/releases/tag/v2.327.1">See
Release Notes</a></p>
<h3>Enhancements:</h3>
<ul>
<li>Add support for <code>pip-version</code> by <a
href="https://github.com/priyagupta108"><code>@​priyagupta108</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1129">actions/setup-python#1129</a></li>
<li>Enhance reading from .python-version by <a
href="https://github.com/krystof-k"><code>@​krystof-k</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/787">actions/setup-python#787</a></li>
<li>Add version parsing from Pipfile by <a
href="https://github.com/aradkdj"><code>@​aradkdj</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1067">actions/setup-python#1067</a></li>
</ul>
<h3>Bug fixes:</h3>
<ul>
<li>Clarify pythonLocation behaviour for PyPy and GraalPy in environment
variables by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1183">actions/setup-python#1183</a></li>
<li>Change missing cache directory error to warning by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1182">actions/setup-python#1182</a></li>
<li>Add Architecture-Specific PATH Management for Python with --user
Flag on Windows by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1122">actions/setup-python#1122</a></li>
<li>Include python version in PyPy python-version output by <a
href="https://github.com/cdce8p"><code>@​cdce8p</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1110">actions/setup-python#1110</a></li>
<li>Update docs: clarification on pip authentication with setup-python
by <a
href="https://github.com/priya-kinthali"><code>@​priya-kinthali</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1156">actions/setup-python#1156</a></li>
</ul>
<h3>Dependency updates:</h3>
<ul>
<li>Upgrade idna from 2.9 to 3.7 in /<strong>tests</strong>/data by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-python/pull/843">actions/setup-python#843</a></li>
<li>Upgrade form-data to fix critical vulnerabilities <a
href="https://redirect.github.com/actions/setup-python/issues/182">#182</a>
&amp; <a
href="https://redirect.github.com/actions/setup-python/issues/183">#183</a>
by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1163">actions/setup-python#1163</a></li>
<li>Upgrade setuptools to 78.1.1 to fix path traversal vulnerability in
PackageIndex.download by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1165">actions/setup-python#1165</a></li>
<li>Upgrade actions/checkout from 4 to 5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-python/pull/1181">actions/setup-python#1181</a></li>
<li>Upgrade <code>@​actions/tool-cache</code> from 2.0.1 to 2.0.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-python/pull/1095">actions/setup-python#1095</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/krystof-k"><code>@​krystof-k</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-python/pull/787">actions/setup-python#787</a></li>
<li><a href="https://github.com/cdce8p"><code>@​cdce8p</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/setup-python/pull/1110">actions/setup-python#1110</a></li>
<li><a href="https://github.com/aradkdj"><code>@​aradkdj</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/setup-python/pull/1067">actions/setup-python#1067</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-python/compare/v5...v6.0.0">https://github.com/actions/setup-python/compare/v5...v6.0.0</a></p>
<h2>v5.6.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Workflow updates related to Ubuntu 20.04 by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1065">actions/setup-python#1065</a></li>
<li>Fix for Candidate Not Iterable Error by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1082">actions/setup-python#1082</a></li>
<li>Upgrade semver and <code>@​types/semver</code> by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1091">actions/setup-python#1091</a></li>
<li>Upgrade prettier from 2.8.8 to 3.5.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1046">actions/setup-python#1046</a></li>
<li>Upgrade ts-jest from 29.1.2 to 29.3.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1081">actions/setup-python#1081</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-python/compare/v5...v5.6.0">https://github.com/actions/setup-python/compare/v5...v5.6.0</a></p>
<h2>v5.5.0</h2>
<h2>What's Changed</h2>
<h3>Enhancements:</h3>
<ul>
<li>Support free threaded Python versions like '3.13t' by <a
href="https://github.com/colesbury"><code>@​colesbury</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/973">actions/setup-python#973</a></li>
<li>Enhance Workflows: Include ubuntu-arm runners, Add e2e Testing for
free threaded and Upgrade <code>@​action/cache</code> from 4.0.0 to
4.0.3 by <a
href="https://github.com/priya-kinthali"><code>@​priya-kinthali</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1056">actions/setup-python#1056</a></li>
<li>Add support for .tool-versions file in setup-python by <a
href="https://github.com/mahabaleshwars"><code>@​mahabaleshwars</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1043">actions/setup-python#1043</a></li>
</ul>
<h3>Bug fixes:</h3>
<ul>
<li>Fix architecture for pypy on Linux ARM64 by <a
href="https://github.com/mayeut"><code>@​mayeut</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1011">actions/setup-python#1011</a>
This update maps arm64 to aarch64 for Linux ARM64 PyPy
installations.</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/actions/setup-python/commit/e797f83bcb11b83ae66e0230d6156d7c80228e7c"><code>e797f83</code></a>
Upgrade to node 24 (<a
href="https://redirect.github.com/actions/setup-python/issues/1164">#1164</a>)</li>
<li><a
href="https://github.com/actions/setup-python/commit/3d1e2d2ca0a067f27da6fec484fce7f5256def85"><code>3d1e2d2</code></a>
Revert &quot;Enhance cache-dependency-path handling to support files
outside the w...</li>
<li><a
href="https://github.com/actions/setup-python/commit/65b071217a8539818fdb8b54561bcbae40380a54"><code>65b0712</code></a>
Clarify pythonLocation behavior for PyPy and GraalPy in environment
variables...</li>
<li><a
href="https://github.com/actions/setup-python/commit/5b668cf7652160527499ee14ceaff4be9306cb88"><code>5b668cf</code></a>
Bump actions/checkout from 4 to 5 (<a
href="https://redirect.github.com/actions/setup-python/issues/1181">#1181</a>)</li>
<li><a
href="https://github.com/actions/setup-python/commit/f62a0e252fe7114e86949abfa6e1e89f85bb38c2"><code>f62a0e2</code></a>
Change missing cache directory error to warning (<a
href="https://redirect.github.com/actions/setup-python/issues/1182">#1182</a>)</li>
<li><a
href="https://github.com/actions/setup-python/commit/9322b3ca74000aeb2c01eb777b646334015ddd72"><code>9322b3c</code></a>
Upgrade setuptools to 78.1.1 to fix path traversal vulnerability in
PackageIn...</li>
<li><a
href="https://github.com/actions/setup-python/commit/fbeb884f69f0ac1c0257302f62aa524c2824b649"><code>fbeb884</code></a>
Bump form-data to fix critical vulnerabilities <a
href="https://redirect.github.com/actions/setup-python/issues/182">#182</a>
&amp; <a
href="https://redirect.github.com/actions/setup-python/issues/183">#183</a>
(<a
href="https://redirect.github.com/actions/setup-python/issues/1163">#1163</a>)</li>
<li><a
href="https://github.com/actions/setup-python/commit/03bb6152f4f691b9d64579a1bd791904a083c452"><code>03bb615</code></a>
Bump idna from 2.9 to 3.7 in /<strong>tests</strong>/data (<a
href="https://redirect.github.com/actions/setup-python/issues/843">#843</a>)</li>
<li><a
href="https://github.com/actions/setup-python/commit/36da51d563b70a972897150555bb025096d65565"><code>36da51d</code></a>
Add version parsing from Pipfile (<a
href="https://redirect.github.com/actions/setup-python/issues/1067">#1067</a>)</li>
<li><a
href="https://github.com/actions/setup-python/commit/3c6f142cc0036d53007e92fa1e327564a4cfb7aa"><code>3c6f142</code></a>
update documentation (<a
href="https://redirect.github.com/actions/setup-python/issues/1156">#1156</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/actions/setup-python/compare/v5...v6">compare
view</a></li>
</ul>
</details>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/setup-node](https://github.com/actions/setup-node) from 4
to 5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/setup-node/releases">actions/setup-node's
releases</a>.</em></p>
<blockquote>
<h2>v5.0.0</h2>
<h2>What's Changed</h2>
<h3>Breaking Changes</h3>
<ul>
<li>Upgrade action to use node24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/setup-node/pull/1325">actions/setup-node#1325</a></li>
</ul>
<p>Make sure your runner is updated to this version or newer to use this
release. v2.327.1 <a
href="https://github.com/actions/runner/releases/tag/v2.327.1">Release
Notes</a></p>
<h3>Dependency Upgrades</h3>
<ul>
<li>Upgrade <code>@​octokit/request-error</code> and
<code>@​actions/github</code> by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1227">actions/setup-node#1227</a></li>
<li>Upgrade uuid from 9.0.1 to 11.1.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1273">actions/setup-node#1273</a></li>
<li>Upgrade undici from 5.28.5 to 5.29.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1295">actions/setup-node#1295</a></li>
<li>Upgrade form-data to bring in fix for critical vulnerability by <a
href="https://github.com/gowridurgad"><code>@​gowridurgad</code></a> in
<a
href="https://redirect.github.com/actions/setup-node/pull/1332">actions/setup-node#1332</a></li>
<li>Upgrade actions/checkout from 4 to 5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1345">actions/setup-node#1345</a></li>
</ul>
<h3>Enhancement:</h3>
<ul>
<li>Enhance caching in setup-node with automatic package manager
detection by <a
href="https://github.com/priya-kinthali"><code>@​priya-kinthali</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/1348">actions/setup-node#1348</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/priya-kinthali"><code>@​priya-kinthali</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-node/pull/1348">actions/setup-node#1348</a></li>
<li><a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-node/pull/1325">actions/setup-node#1325</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-node/compare/v4...v5.0.0">https://github.com/actions/setup-node/compare/v4...v5.0.0</a></p>
<h2>v4.4.0</h2>
<h2>What's Changed</h2>
<h3>Bug fixes:</h3>
<ul>
<li>Make eslint-compact matcher compatible with Stylelint by <a
href="https://github.com/FloEdelmann"><code>@​FloEdelmann</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/98">actions/setup-node#98</a></li>
<li>Add support for indented eslint output by <a
href="https://github.com/fregante"><code>@​fregante</code></a> in <a
href="https://redirect.github.com/actions/setup-node/pull/1245">actions/setup-node#1245</a></li>
</ul>
<h3>Enhancement:</h3>
<ul>
<li>Support private mirrors by <a
href="https://github.com/marco-ippolito"><code>@​marco-ippolito</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/1240">actions/setup-node#1240</a></li>
</ul>
<h3>Dependency update:</h3>
<ul>
<li>Upgrade <code>@​action/cache</code> from 4.0.2 to 4.0.3 by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/1262">actions/setup-node#1262</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/FloEdelmann"><code>@​FloEdelmann</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-node/pull/98">actions/setup-node#98</a></li>
<li><a href="https://github.com/fregante"><code>@​fregante</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-node/pull/1245">actions/setup-node#1245</a></li>
<li><a
href="https://github.com/marco-ippolito"><code>@​marco-ippolito</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-node/pull/1240">actions/setup-node#1240</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-node/compare/v4...v4.4.0">https://github.com/actions/setup-node/compare/v4...v4.4.0</a></p>
<h2>v4.3.0</h2>
<h2>What's Changed</h2>
<h3>Dependency updates</h3>
<ul>
<li>Upgrade <code>@​actions/glob</code> from 0.4.0 to 0.5.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-node/pull/1200">actions/setup-node#1200</a></li>
<li>Upgrade <code>@​action/cache</code> from 4.0.0 to 4.0.2 by <a
href="https://github.com/gowridurgad"><code>@​gowridurgad</code></a> in
<a
href="https://redirect.github.com/actions/setup-node/pull/1251">actions/setup-node#1251</a></li>
<li>Upgrade <code>@​vercel/ncc</code> from 0.38.1 to 0.38.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-node/pull/1203">actions/setup-node#1203</a></li>
<li>Upgrade <code>@​actions/tool-cache</code> from 2.0.1 to 2.0.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-node/pull/1220">actions/setup-node#1220</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/actions/setup-node/commit/a0853c24544627f65ddf259abe73b1d18a591444"><code>a0853c2</code></a>
Bump actions/checkout from 4 to 5 (<a
href="https://redirect.github.com/actions/setup-node/issues/1345">#1345</a>)</li>
<li><a
href="https://github.com/actions/setup-node/commit/b7234cc9fe124f0f4932554b4e5284543083ae7b"><code>b7234cc</code></a>
Upgrade action to use node24 (<a
href="https://redirect.github.com/actions/setup-node/issues/1325">#1325</a>)</li>
<li><a
href="https://github.com/actions/setup-node/commit/d7a11313b581b306c961b506cfc8971208bb03f6"><code>d7a1131</code></a>
Enhance caching in setup-node with automatic package manager detection
(<a
href="https://redirect.github.com/actions/setup-node/issues/1348">#1348</a>)</li>
<li><a
href="https://github.com/actions/setup-node/commit/5e2628c959b9ade56971c0afcebbe5332d44b398"><code>5e2628c</code></a>
Bumps form-data (<a
href="https://redirect.github.com/actions/setup-node/issues/1332">#1332</a>)</li>
<li><a
href="https://github.com/actions/setup-node/commit/65beceff8e91358525397bdce9103d999507ab03"><code>65becef</code></a>
Bump undici from 5.28.5 to 5.29.0 (<a
href="https://redirect.github.com/actions/setup-node/issues/1295">#1295</a>)</li>
<li><a
href="https://github.com/actions/setup-node/commit/7e24a656e1c7a0d6f3eaef8d8e84ae379a5b035b"><code>7e24a65</code></a>
Bump uuid from 9.0.1 to 11.1.0 (<a
href="https://redirect.github.com/actions/setup-node/issues/1273">#1273</a>)</li>
<li><a
href="https://github.com/actions/setup-node/commit/08f58d1471bff7f3a07d167b4ad7df25d5fcfcb6"><code>08f58d1</code></a>
Bump <code>@​octokit/request-error</code> and
<code>@​actions/github</code> (<a
href="https://redirect.github.com/actions/setup-node/issues/1227">#1227</a>)</li>
<li>See full diff in <a
href="https://github.com/actions/setup-node/compare/v4...v5">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/setup-node&package-manager=github_actions&previous-version=4&new-version=5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…he#8179)

# Which issue does this PR close?

- Closes apache#8178

# Are these changes tested?
Yes

# Are there any user-facing changes?
Can use `variant_get` for shredded numeric types

---------

Co-authored-by: Andrew Lamb <[email protected]>
# Which issue does this PR close?

- Part of apache#4886
- Follows up on apache#8047

# Rationale for this change

When reading Avro into Arrow with a projection or a reader schema that
omits some writer fields, we were still decoding those writer‑only
fields item‑by‑item. This is unnecessary work and can dominate CPU time
for large arrays/maps or deeply nested records.

Avro’s binary format explicitly allows fast skipping for arrays/maps by
encoding data in blocks: when the count is negative, the next `long`
gives the byte size of the block, enabling O(1) skipping of that block
without decoding each item. This PR teaches the record reader to
recognize and leverage that, and to avoid constructing decoders for
fields we will skip altogether.
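
The block-skipping rule above can be sketched as follows. This is a minimal, self-contained illustration and not the `arrow-avro` implementation: the helper names `read_long` and `skip_long_array` are hypothetical, and the array items are assumed to be Avro `long`s so that the slow path has something concrete to decode.

```rust
/// Decode one Avro `long` (zigzag + variable-length encoding) starting at `*pos`.
/// Returns `None` on truncated or malformed input.
fn read_long(buf: &[u8], pos: &mut usize) -> Option<i64> {
    let mut value: u64 = 0;
    let mut shift = 0u32;
    loop {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            break;
        }
        shift += 7;
        if shift > 63 {
            return None; // more than 10 bytes: not a valid long
        }
    }
    // Undo the zigzag mapping.
    Some(((value >> 1) as i64) ^ -((value & 1) as i64))
}

/// Skip an Avro array of `long` items (hypothetical helper).
/// Returns the number of items skipped and leaves `*pos` just past the array.
fn skip_long_array(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut skipped = 0u64;
    loop {
        let count = read_long(buf, pos)?;
        if count == 0 {
            return Some(skipped); // a zero count terminates the array
        }
        if count < 0 {
            // Negative count: the next long is the block size in bytes,
            // so the whole block is skipped without decoding any item.
            let size = usize::try_from(read_long(buf, pos)?).ok()?;
            let end = (*pos).checked_add(size)?;
            if end > buf.len() {
                return None;
            }
            *pos = end;
            skipped += count.unsigned_abs();
        } else {
            // Positive count: no size prefix, so each item must be decoded.
            for _ in 0..count {
                read_long(buf, pos)?;
            }
            skipped += count as u64;
        }
    }
}
```

For example, the bytes `03 04 02 04 02 06 00` encode one sized block of two items followed by one unsized block of one item; only the second block requires per-item decoding.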

# What changes are included in this PR?

**Reader / decoding architecture**
- **Skip-aware record decoding**:
  - At construction time, we now precompute per-record **skip decoders**
    for writer fields that the reader will ignore.
  - Introduced a resolved-record path (`RecordResolved`) that carries:
    - `writer_to_reader` mapping for field alignment,
    - a prebuilt list of **skip decoders** for fields not present in the
      reader,
    - the set of active per-field decoders for the projected fields.
- **Codec builder enhancements**: In `arrow-avro/src/codec.rs`, record
  construction now:
  - Builds Arrow `Field`s and their decoders only for fields that are
    read,
  - Builds `skip_decoders` (via `build_skip_decoders`) for fields to
    ignore.
- **Error handling and consistency**: Kept existing strict-mode
  behavior; improved internal branching to avoid inconsistent states
  during partial decodes.

**Tests**
- **Unit tests (in `arrow-avro/src/reader/record.rs`)**
  - Added focused tests that exercise the new skip logic:
    - Skipping writer‑only fields inside **arrays** and **maps** (including
      negative‑count block skipping and mixed multi‑block payloads).
    - Skipping nested structures within records to ensure offsets and
      lengths remain correct for the fields that are read.
  - Ensured nullability and union handling remain correct when adjacent
    fields are skipped.
- **Integration tests (in `arrow-avro/src/reader/mod.rs`)**
  - Added end‑to‑end test using `avro/alltypes_plain.avro` to validate
    that projecting a subset of fields (reader schema omits some writer
    fields) both:
    - Produces the correct Arrow arrays for the selected fields, and
    - Avoids decoding skipped fields (validated indirectly via behavior and
      block boundaries).
  - The test covers compressed and uncompressed variants already present
    in the suite to ensure behavior is consistent across codecs.

# Are these changes tested?

- **New unit tests** cover:
  - Fast skipping for arrays/maps using negative block counts and block
    sizes (per Avro spec).
  - Nested and nullable scenarios to ensure correct offsets, validity
    bitmaps, and flush behavior when adjacent fields are skipped.
- **New integration test** in `reader/mod.rs`:
  - Reads `avro/alltypes_plain.avro` with a reader schema that omits
    several writer fields and asserts the resulting `RecordBatch` matches
    the expected arrays while exercising the skip path.
- Existing promotion, enum, decimal, fixed, and union tests continue to
  pass, ensuring no regressions in unrelated areas.

# Are there any user-facing changes?

N/A since `arrow-avro` is not public yet.
Bumps [actions/labeler](https://github.com/actions/labeler) from 5.0.0
to 6.0.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/labeler/releases">actions/labeler's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Add workflow file for publishing releases to immutable action
package by <a
href="https://github.com/jcambass"><code>@​jcambass</code></a> in <a
href="https://redirect.github.com/actions/labeler/pull/802">actions/labeler#802</a></li>
</ul>
<h3>Breaking Changes</h3>
<ul>
<li>Upgrade Node.js version to 24 in action and dependencies <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/labeler/pull/891">actions/labeler#891</a>
Make sure your runner is on version v2.327.1 or later to ensure
compatibility with this release. <a
href="https://github.com/actions/runner/releases/tag/v2.327.1">Release
Notes</a></li>
</ul>
<h3>Dependency Upgrades</h3>
<ul>
<li>Upgrade eslint-config-prettier from 9.0.0 to 9.1.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/711">actions/labeler#711</a></li>
<li>Upgrade eslint from 8.52.0 to 8.55.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/720">actions/labeler#720</a></li>
<li>Upgrade <code>@​types/jest</code> from 29.5.6 to 29.5.11 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/719">actions/labeler#719</a></li>
<li>Upgrade <code>@​types/js-yaml</code> from 4.0.8 to 4.0.9 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/718">actions/labeler#718</a></li>
<li>Upgrade <code>@​typescript-eslint/parser</code> from 6.9.0 to 6.14.0
by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/717">actions/labeler#717</a></li>
<li>Upgrade prettier from 3.0.3 to 3.1.1 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/726">actions/labeler#726</a></li>
<li>Upgrade eslint from 8.55.0 to 8.56.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/725">actions/labeler#725</a></li>
<li>Upgrade <code>@​typescript-eslint/parser</code> from 6.14.0 to
6.19.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/745">actions/labeler#745</a></li>
<li>Upgrade eslint-plugin-jest from 27.4.3 to 27.6.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/744">actions/labeler#744</a></li>
<li>Upgrade <code>@​typescript-eslint/eslint-plugin</code> from 6.9.0 to
6.20.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/750">actions/labeler#750</a></li>
<li>Upgrade prettier from 3.1.1 to 3.2.5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/752">actions/labeler#752</a></li>
<li>Upgrade undici from 5.26.5 to 5.28.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/757">actions/labeler#757</a></li>
<li>Upgrade braces from 3.0.2 to 3.0.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/789">actions/labeler#789</a></li>
<li>Upgrade minimatch from 9.0.3 to 10.0.1 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/805">actions/labeler#805</a></li>
<li>Upgrade <code>@​actions/core</code> from 1.10.1 to 1.11.1 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/811">actions/labeler#811</a></li>
<li>Upgrade typescript from 5.4.3 to 5.7.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/819">actions/labeler#819</a></li>
<li>Upgrade <code>@​typescript-eslint/parser</code> from 7.3.1 to 8.17.0
by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/824">actions/labeler#824</a></li>
<li>Upgrade prettier from 3.2.5 to 3.4.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/825">actions/labeler#825</a></li>
<li>Upgrade <code>@​types/jest</code> from 29.5.12 to 29.5.14 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/827">actions/labeler#827</a></li>
<li>Upgrade eslint-plugin-jest from 27.9.0 to 28.9.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/832">actions/labeler#832</a></li>
<li>Upgrade ts-jest from 29.1.2 to 29.2.5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/831">actions/labeler#831</a></li>
<li>Upgrade <code>@​vercel/ncc</code> from 0.38.1 to 0.38.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/830">actions/labeler#830</a></li>
<li>Upgrade typescript from 5.7.2 to 5.7.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/835">actions/labeler#835</a></li>
<li>Upgrade eslint-plugin-jest from 28.9.0 to 28.11.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/839">actions/labeler#839</a></li>
<li>Upgrade undici from 5.28.4 to 5.28.5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/842">actions/labeler#842</a></li>
<li>Upgrade <code>@​octokit/request-error</code> from 5.0.1 to 5.1.1 by
<a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/labeler/pull/846">actions/labeler#846</a></li>
</ul>
<h3>Documentation changes</h3>
<ul>
<li>Add note regarding <code>pull_request_target</code> to README.md by
<a href="https://github.com/silverwind"><code>@​silverwind</code></a> in
<a
href="https://redirect.github.com/actions/labeler/pull/669">actions/labeler#669</a></li>
<li>Update readme with additional examples and important note about
<code>pull_request_target</code> event by <a
href="https://github.com/IvanZosimov"><code>@​IvanZosimov</code></a> in
<a
href="https://redirect.github.com/actions/labeler/pull/721">actions/labeler#721</a></li>
<li>Document update - permission section by <a
href="https://github.com/harithavattikuti"><code>@​harithavattikuti</code></a>
in <a
href="https://redirect.github.com/actions/labeler/pull/840">actions/labeler#840</a></li>
<li>Improvement in documentation for pull_request_target event usage in
README by <a
href="https://github.com/suyashgaonkar"><code>@​suyashgaonkar</code></a>
in <a
href="https://redirect.github.com/actions/labeler/pull/871">actions/labeler#871</a></li>
<li>Fix broken links in documentation by <a
href="https://github.com/suyashgaonkar"><code>@​suyashgaonkar</code></a>
in <a
href="https://redirect.github.com/actions/labeler/pull/822">actions/labeler#822</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/silverwind"><code>@​silverwind</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/labeler/pull/669">actions/labeler#669</a></li>
<li><a href="https://github.com/Jcambass"><code>@​Jcambass</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/labeler/pull/802">actions/labeler#802</a></li>
<li><a
href="https://github.com/suyashgaonkar"><code>@​suyashgaonkar</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/labeler/pull/822">actions/labeler#822</a></li>
<li><a
href="https://github.com/HarithaVattikuti"><code>@​HarithaVattikuti</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/labeler/pull/840">actions/labeler#840</a></li>
<li><a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/labeler/pull/891">actions/labeler#891</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/actions/labeler/commit/f1a63e87db0c6baf19c5713083f8d00d789ca184"><code>f1a63e8</code></a>
Update Node.js version to 24 in action and dependencies (<a
href="https://redirect.github.com/actions/labeler/issues/891">#891</a>)</li>
<li><a
href="https://github.com/actions/labeler/commit/b0a1180683c9f17424de4d71c044bea4c7b9bc7c"><code>b0a1180</code></a>
Bump <code>@​octokit/request-error</code> from 5.0.1 to 5.1.1 (<a
href="https://redirect.github.com/actions/labeler/issues/846">#846</a>)</li>
<li><a
href="https://github.com/actions/labeler/commit/110d44140c9195b853f2f24044bbfed8f4968efb"><code>110d441</code></a>
Update README.md (<a
href="https://redirect.github.com/actions/labeler/issues/871">#871</a>)</li>
<li><a
href="https://github.com/actions/labeler/commit/bee50fefe18762fad67754b2f3bfff2c8082ebb8"><code>bee50fe</code></a>
Bump undici from 5.28.4 to 5.28.5 (<a
href="https://redirect.github.com/actions/labeler/issues/842">#842</a>)</li>
<li><a
href="https://github.com/actions/labeler/commit/6463cdb00ee92c05bec55dffc4e1fce250301945"><code>6463cdb</code></a>
Bump eslint-plugin-jest from 28.9.0 to 28.11.0 (<a
href="https://redirect.github.com/actions/labeler/issues/839">#839</a>)</li>
<li><a
href="https://github.com/actions/labeler/commit/c209686724ee12fcc5e6294d1d569b91f86fa691"><code>c209686</code></a>
Bump typescript from 5.7.2 to 5.7.3 (<a
href="https://redirect.github.com/actions/labeler/issues/835">#835</a>)</li>
<li><a
href="https://github.com/actions/labeler/commit/5184940b544b0096088a7b42d1b8a551003d9eb1"><code>5184940</code></a>
Bump <code>@​vercel/ncc</code> from 0.38.1 to 0.38.3 (<a
href="https://redirect.github.com/actions/labeler/issues/830">#830</a>)</li>
<li><a
href="https://github.com/actions/labeler/commit/3629d5568b59204f18786372f6d740d649719488"><code>3629d55</code></a>
Document update - permission section (<a
href="https://redirect.github.com/actions/labeler/issues/840">#840</a>)</li>
<li><a
href="https://github.com/actions/labeler/commit/d24f7f3731b2a06433c0bccc364d560c5329c48f"><code>d24f7f3</code></a>
Bump ts-jest from 29.1.2 to 29.2.5 (<a
href="https://redirect.github.com/actions/labeler/issues/831">#831</a>)</li>
<li><a
href="https://github.com/actions/labeler/commit/425a1f14222185c7500cf43245beafe96356561d"><code>425a1f1</code></a>
Bump eslint-plugin-jest from 27.9.0 to 28.9.0 (<a
href="https://redirect.github.com/actions/labeler/issues/832">#832</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/actions/labeler/compare/v5.0.0...v6.0.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/labeler&package-manager=github_actions&previous-version=5.0.0&new-version=6.0.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Resolves conflicts between PR 8166 (shredding support) and PR 8179 (multi-type support):

- Preserves PR 8179's comprehensive multi-type support for all numeric primitives
- Keeps PR 8166's superior row builder architecture and shredding support
- Integrates both test suites for complete coverage
- Maintains enhanced path parsing from PR 8166

The merge successfully combines:
- Multi-type variant_get support (Int8, Int16, Int32, Int64, UInt8, UInt16, UInt32, UInt64, Float16, Float32, Float64)
- Advanced shredding capabilities with row builder approach
- Comprehensive test coverage from both PRs

scovich commented Sep 4, 2025

Hmm, the diff is a real mess. I'm not sure it would actually make sense to merge this directly into your branch, because it probably wouldn't register as a proper merge commit.

kylebarron and others added 17 commits September 5, 2025 14:37
# Which issue does this PR close?


- Closes apache#7173.

# Rationale for this change

Ability to round-trip timezone information.

# What changes are included in this PR?

Impl `Display` for `Tz`

# Are these changes tested?

A simple test that strings round trip.

# Are there any user-facing changes?

New API
# Which issue does this PR close?

- Closes apache#8273 .

# Rationale for this change

When working with the library using encryption, we have sometimes found
it necessary to modify an existing set of `WriterProperties` on a
per-file basis to set specific encryption properties. More generally,
others may need to use an existing set of `WriterProperties` as a
template and modify the properties. I have implemented this feature by
adding an `into_builder` method, which appears to be the standard
approach in other parts of the library.
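
The template-then-modify pattern described above can be sketched with stand-in types. `Props` and `PropsBuilder` below are hypothetical and are not the parquet crate's `WriterProperties` API; they only illustrate how an `into_builder` method lets a finished value be reopened, adjusted for one file, and rebuilt.

```rust
/// Hypothetical stand-in for an immutable properties object.
#[derive(Debug, Clone, PartialEq)]
struct Props {
    created_by: String,
    max_row_group_size: usize,
    encryption_key: Option<Vec<u8>>,
}

/// Hypothetical builder holding the same fields in mutable form.
#[derive(Debug, Clone)]
struct PropsBuilder {
    created_by: String,
    max_row_group_size: usize,
    encryption_key: Option<Vec<u8>>,
}

impl Props {
    /// Start from default values (the defaults here are made up).
    fn builder() -> PropsBuilder {
        PropsBuilder {
            created_by: "example".to_string(),
            max_row_group_size: 1024,
            encryption_key: None,
        }
    }

    /// Consume the finished properties and reopen them as a builder.
    fn into_builder(self) -> PropsBuilder {
        PropsBuilder {
            created_by: self.created_by,
            max_row_group_size: self.max_row_group_size,
            encryption_key: self.encryption_key,
        }
    }
}

impl PropsBuilder {
    fn with_created_by(mut self, value: impl Into<String>) -> Self {
        self.created_by = value.into();
        self
    }

    fn with_encryption_key(mut self, key: Vec<u8>) -> Self {
        self.encryption_key = Some(key);
        self
    }

    fn build(self) -> Props {
        Props {
            created_by: self.created_by,
            max_row_group_size: self.max_row_group_size,
            encryption_key: self.encryption_key,
        }
    }
}
```

A caller would keep one shared template and derive per-file variants from it, for example `template.clone().into_builder().with_encryption_key(key).build()`, leaving the template itself unchanged.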


# Are these changes tested?

Yes, `test_writer_properties_builder` has been updated to add a
round-trip test for `into_builder`.

# Are there any user-facing changes?

Yes. `WriterProperties` now has a new `into_builder` method.

---------

Co-authored-by: Andrew Lamb <[email protected]>
# Which issue does this PR close?

- Part of apache#5854.

# Rationale for this change
Backport changes to allow apples-to-apples comparison of thrift decoding

# What changes are included in this PR?

Adds a page header benchmark and updates bench names to match those in
feature branch.

# Are these changes tested?

No tests needed...only changes to benchmark

# Are there any user-facing changes?

No
Bumps [actions/github-script](https://github.com/actions/github-script)
from 7 to 8.


Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/labeler](https://github.com/actions/labeler) from 6.0.0
to 6.0.1.


Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
# Which issue does this PR close?


- Closes apache#8261.

# Rationale for this change

Provide the same API in the sync and async writers.

# What changes are included in this PR?

There is no need to duplicate the description in the issue here but it
is sometimes worth providing a summary of the individual changes in this
PR.

# Are these changes tested?

Add test_async_arrow_group_writer

# Are there any user-facing changes?

Yes, adds two public functions, `get_column_writers` and
`append_row_group`, to `AsyncArrowWriter`.
# Which issue does this PR close?

- Closes apache#8283.

# Rationale for this change

Add the `Variant::as_u*` functions

# Are these changes tested?

Added doc tests

# Are there any user-facing changes?

No
# Which issue does this PR close?

- Closes apache#8234.

# Rationale for this change

# What changes are included in this PR?

- Grouping related data types together (e.g., numeric types, temporal
types).
- Extracting large code snippets from match branches into helper
functions.
- Reordering tests to align with the data type order.

# Are these changes tested?

Covered by existing tests

# Are there any user-facing changes?

N/A
)

# Which issue does this PR close?

- Part of apache#4886
- Follows up on apache#8047

# Rationale for this change

Avro `enum` values are **encoded by index** but are **semantically
identified by symbol name**. During schema evolution it is legal for the
writer and reader to use different enum symbol *orders* so long as the
**symbol set is compatible**. The Avro specification requires that, when
resolving a writer enum against a reader enum, the value be mapped **by
symbol name**, not by the writer’s numeric index. If the writer’s symbol
is not present in the reader’s enum and the reader defines a default,
the default is used; otherwise it is an error.

# What changes are included in this PR?

**Core changes**
- Implement **writer to reader enum symbol remapping**:
  - Build a fast lookup table at schema resolution time from **writer enum
    index to reader enum index** using symbol **names**.
  - Apply this mapping during decode so the produced Arrow dictionary keys
    always reference the **reader’s** symbol order.
  - If a writer symbol is not found in the reader enum, surface a clear
    error.
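
The name-based remapping can be sketched as below. This is an illustration under stated assumptions, not the code in `reader/record.rs`: the function name `build_enum_mapping` and its signature are hypothetical, and the default-symbol fallback follows the Avro rule quoted in the rationale.

```rust
use std::collections::HashMap;

/// Build a writer-index -> reader-index table by matching symbol names
/// (hypothetical helper). `reader_default` is the reader enum's optional
/// default symbol, used when a writer symbol is unknown to the reader.
fn build_enum_mapping(
    writer: &[&str],
    reader: &[&str],
    reader_default: Option<&str>,
) -> Result<Vec<usize>, String> {
    // Index the reader symbols once so each writer lookup is O(1).
    let by_name: HashMap<&str, usize> = reader
        .iter()
        .enumerate()
        .map(|(index, symbol)| (*symbol, index))
        .collect();

    let default_idx = match reader_default {
        Some(name) => Some(
            by_name
                .get(name)
                .copied()
                .ok_or_else(|| format!("default symbol {name:?} is not in the reader enum"))?,
        ),
        None => None,
    };

    writer
        .iter()
        .map(|symbol| {
            by_name
                .get(symbol)
                .copied()
                .or(default_idx)
                .ok_or_else(|| format!("writer symbol {symbol:?} has no match in the reader enum"))
        })
        .collect()
}
```

At decode time, a writer index `i` read from the stream would be replaced by `mapping[i]` before it is appended as a dictionary key, so the keys follow the reader's symbol order.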

# Are these changes tested?

Yes. This PR adds comprehensive **unit tests** for enum mapping in
`reader/record.rs` and a **real‑file integration test** in
`reader/mod.rs` using `avro/simple_enum.avro`.

# Are there any user-facing changes?

N/A due to `arrow-avro` not being public yet.
# Which issue does this PR close?
\-

# Rationale for this change
This is apache#4875 now that the upstream changes are available.

Allows analysis of TLS traffic with an external tool like Wireshark.

See https://wiki.wireshark.org/TLS#using-the-pre-master-secret

# What changes are included in this PR?
New flag that opts into the standard `SSLKEYLOGFILE` handling that
other libraries and browsers support.

# Are these changes tested?
No automated test, but I did validate that setting the flag AND the env
variable emits a log file that Wireshark successfully uses to decrypt
the traffic.

# Are there any user-facing changes?
Mostly none for normal users, but might be helpful for developers.
# Rationale for this change

Update the docstring of the `write()` function in struct `Writer` to
reflect that it writes only one `RecordBatch` at a time, as opposed to a
vector of record batches.

# What changes are included in this PR?

Just the comment doc string as above

# Are these changes tested?

yes

# Are there any user-facing changes?

No

---------

Co-authored-by: Andrew Lamb <[email protected]>
Co-authored-by: Matthijs Brobbel <[email protected]>
# Which issue does this PR close?

- Closes apache#8294.

# Rationale for this change

The .NET implementation is extracted to apache/arrow-dotnet from
apache/arrow. apache/arrow will remove `csharp/` eventually. So we
should use apache/arrow-dotnet for integration test.

# What changes are included in this PR?

* Set `ARCHERY_INTEGRATION_WITH_DOTNET=1` to use the .NET implementation
* Checkout apache/arrow-dotnet

# Are these changes tested?

Yes.

# Are there any user-facing changes?

No.
…pache#8257)

# Which issue does this PR close?

- Closes apache#8256 .

# Rationale for this change

Do not compress a v2 data page when compression does not help, i.e. when
the compressed size is greater than or equal to the uncompressed size.

# What changes are included in this PR?

Discard the compressed output when it is not smaller than the
uncompressed data.
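
The decision can be sketched as follows. This is a hypothetical illustration rather than the parquet writer's code: `choose_page_payload` and `toy_rle` are made-up names, and the toy run-length encoder only stands in for a real codec. The returned flag corresponds to the idea of recording whether the page body is stored compressed.

```rust
/// Toy run-length encoder standing in for a real compression codec.
/// Emits (run_length, byte) pairs.
fn toy_rle(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut iter = input.iter().copied().peekable();
    while let Some(byte) = iter.next() {
        let mut run = 1u8;
        while run < u8::MAX && iter.peek() == Some(&byte) {
            iter.next();
            run += 1;
        }
        out.push(run);
        out.push(byte);
    }
    out
}

/// Return the bytes to store for a page plus an `is_compressed` flag
/// (hypothetical helper). The compressed form is kept only when it is
/// strictly smaller than the input.
fn choose_page_payload<F>(uncompressed: &[u8], compress: F) -> (Vec<u8>, bool)
where
    F: Fn(&[u8]) -> Vec<u8>,
{
    let compressed = compress(uncompressed);
    if compressed.len() >= uncompressed.len() {
        // Compression did not help: store the original bytes.
        (uncompressed.to_vec(), false)
    } else {
        (compressed, true)
    }
}
```

Highly repetitive input keeps the compressed form, while input that the codec would expand falls back to the raw bytes.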

# Are these changes tested?

Covered by existing tests

# Are there any user-facing changes?

No
# Which issue does this PR close?

- Part of apache#4886

# Rationale for this change

This refactor streamlines the `arrow-avro` writer by introducing a
single, schema‑driven `RecordEncoder` that plans writes up front and
encodes rows using consistent, explicit rules for nullability and type
dispatch. It reduces duplication in nested/struct/list handling, makes
the order of Avro union branches (null‑first vs null‑second) an explicit
choice, and aligns header schema generation with value encoding.

This should improve correctness (especially for nested optionals), make
behavior easier to reason about, and pave the way for future
optimizations.

# What changes are included in this PR?

**High‑level:**

* Introduces a unified, schema‑driven `RecordEncoder` with a builder
that walks the Avro record in Avro order and maps each field to its
Arrow column, producing a reusable write plan. The encoder covers
scalars and nested types (struct, (large) lists, maps,
strings/binaries).
* Applies a single model of **nullability** throughout encoding,
including nested sites (list items, fixed‑size list items, map values),
and uses explicit union‑branch indices according to the chosen order.

**API and implementation details:**

* **Writer / encoder refactor**

* Replaces the previous per‑column/child encoding paths with a
**`FieldPlan`** tree (variants for `Scalar`, `Struct { … }`, and `List {
… }`) and per‑site `nullability` carried from the Avro schema.
* Adds encoder variants for `LargeBinary`, `Utf8`, `Utf8Large`, `List`,
`LargeList`, and `Struct`.
* Encodes union branch indices with `write_optional_index` (writes
`0x00/0x02` according to Null‑First/Null‑Second), replacing the old
branch write.

* **Schema generation & metadata**

* Moves the **`Nullability`** enum to `schema.rs` and threads it through
schema generation and writer logic.
* Adds `AvroSchema::from_arrow_with_options(schema,
Option<Nullability>)` to either reuse embedded Avro JSON or build new
Avro JSON that **honors the requested null‑union order at all nullable
sites**.
* Adds `extend_with_passthrough_metadata` so Arrow schema metadata is
copied into Avro JSON while skipping Avro‑reserved and internal Arrow
keys.
* Introduces helpers like `wrap_nullable` and
`arrow_field_to_avro_with_order` to apply ordering consistently for
arrays, fixed‑size lists, maps, structs, and unions.
 
* **Format and glue**

  * Simplifies `writer/format.rs` by removing the `EncoderOptions`
    plumbing from the OCF format; `write_long` remains exported for header
    writing.
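
For reviewers unfamiliar with the wire format, here is a minimal sketch of how a union branch index ends up as `0x00` or `0x02`. Avro encodes the branch index as a zigzag varint `long`, so index 0 becomes `0x00` and index 1 becomes `0x02`. The names `write_long`, `write_optional_index`, and `Nullability` come from this PR, but the signatures and bodies below are assumptions for illustration, not the actual `arrow-avro` code.

```rust
/// Position of the `null` branch in a two-branch Avro union.
/// Illustrative stand-in for the PR's `Nullability` enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Nullability {
    NullFirst,
    NullSecond,
}

/// Avro `long` encoding: zigzag-map the value, then emit it as a
/// base-128 varint with the least-significant 7-bit group first.
/// (Assumed signature; the real helper may write to an `impl Write`.)
fn write_long(out: &mut Vec<u8>, value: i64) {
    let mut z = ((value << 1) ^ (value >> 63)) as u64;
    while z >= 0x80 {
        // Low 7 bits with the continuation bit set.
        out.push(((z & 0x7F) as u8) | 0x80);
        z >>= 7;
    }
    out.push(z as u8);
}

/// Write the union branch index for one nullable value.
/// With null first, null is branch 0 and the value is branch 1;
/// with null second, the two are swapped.
fn write_optional_index(out: &mut Vec<u8>, is_null: bool, order: Nullability) {
    let branch = match (order, is_null) {
        (Nullability::NullFirst, true) | (Nullability::NullSecond, false) => 0,
        _ => 1,
    };
    write_long(out, branch);
}
```

Under these assumptions a null value under `NullFirst` and a present value under `NullSecond` both produce the single byte `0x00`; the other two combinations produce `0x02`.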

# Are these changes tested?

Yes.

* Adds focused unit tests in `writer/encoder.rs` that verify scalar and
string/binary encodings (e.g., Binary/LargeBinary, Utf8/LargeUtf8) and
validate length/branch encoding primitives used by the writer.
* Adds round‑trip integration tests in `writer/mod.rs` that validate List
and Struct decoding.
* Adjusts existing schema tests (e.g., decimal metadata expectations) to
align with the new schema/metadata handling.

# Are there any user-facing changes?

N/A because arrow-avro is not public yet.

---------

Co-authored-by: Ryan Johnson <[email protected]>
Co-authored-by: Matthijs Brobbel <[email protected]>
# Which issue does this PR close?

- Part of apache#4886

# Rationale for this change

Apache Avro’s `decimal` logical type annotates either `bytes` or `fixed`
and carries `precision` and `scale`. Implementations should reject
invalid combinations such as `scale > precision`, and the underlying
bytes are the two’s‑complement big‑endian representation of the unscaled
integer. On the Arrow side, Rust now exposes first‑class `Decimal32`,
`Decimal64`, `Decimal128`, and `Decimal256` data types with documented
maximum precisions (9, 18, 38, 76 respectively). Until now, `arrow-avro`
decoded all Avro decimals to 128/256‑bit Arrow decimals, even when a
narrower type would suffice.

# What changes are included in this PR?

**`arrow-avro/src/codec.rs`**

* Map `Codec::Decimal(precision, scale, _size)` to Arrow’s
`Decimal32`/`64`/`128`/`256` **by precision**, preferring the narrowest
type (≤9→32, ≤18→64, ≤38→128, otherwise 256).
* Strengthen decimal attribute parsing:
  * Error if `scale > precision`.
  * Error if `precision` exceeds Arrow’s maximum (Decimal256).
* If Avro uses `fixed`, check that declared `precision` fits the byte
width (≤4→max 9, ≤8→18, ≤16→38, ≤32→76).
* Update docstring of `Codec::Decimal` to mention `Decimal32`/`64`. 
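
The validation and width rules listed above can be sketched roughly as follows. This is a standalone illustration, not the code in `codec.rs`: `DecimalWidth`, `max_precision_for_fixed`, and `choose_decimal_width` are hypothetical names, and errors are plain strings rather than the crate's error type.

```rust
/// Hypothetical stand-in for the four Arrow decimal widths.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DecimalWidth {
    D32,
    D64,
    D128,
    D256,
}

/// Largest decimal precision that fits a `fixed` of `size` bytes,
/// using the thresholds from the description (<=4 -> 9, <=8 -> 18,
/// <=16 -> 38, <=32 -> 76). Returns `None` for unsupported sizes.
fn max_precision_for_fixed(size: usize) -> Option<usize> {
    match size {
        1..=4 => Some(9),
        5..=8 => Some(18),
        9..=16 => Some(38),
        17..=32 => Some(76),
        _ => None,
    }
}

/// Validate the decimal attributes and pick the narrowest width by precision.
/// `fixed_size` is `Some(n)` for `fixed[n]` and `None` for `bytes`.
fn choose_decimal_width(
    precision: usize,
    scale: usize,
    fixed_size: Option<usize>,
) -> Result<DecimalWidth, String> {
    if scale > precision {
        return Err(format!("scale {scale} exceeds precision {precision}"));
    }
    if precision > 76 {
        return Err(format!("precision {precision} exceeds Decimal256 maximum of 76"));
    }
    if let Some(size) = fixed_size {
        match max_precision_for_fixed(size) {
            Some(max) if precision <= max => {}
            _ => return Err(format!("precision {precision} does not fit fixed[{size}]")),
        }
    }
    Ok(match precision {
        0..=9 => DecimalWidth::D32,
        10..=18 => DecimalWidth::D64,
        19..=38 => DecimalWidth::D128,
        _ => DecimalWidth::D256,
    })
}
```

For example, under these assumptions `choose_decimal_width(10, 2, None)` selects the 64‑bit width, while `choose_decimal_width(10, 2, Some(4))` is rejected because a 4‑byte `fixed` holds at most 9 digits.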

**`arrow-avro/src/reader/record.rs`**

* Add `Decoder::Decimal32` and `Decoder::Decimal64` variants with
corresponding builders (`Decimal32Builder`, `Decimal64Builder`).
* Builder selection:

  * If Avro uses **fixed**: choose by size (≤4→Decimal32, ≤8→Decimal64,
    ≤16→Decimal128, ≤32→Decimal256).
  * If Avro uses **bytes**: choose by declared precision (≤9/≤18/≤38/≤76).
* Implement decode paths that sign‑extend Avro’s two’s‑complement
payload to 4/8 bytes and append values to the new builders; update
`append_null`/`flush` for 32/64‑bit decimals.
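
A minimal sketch of the sign‑extension step, assuming the payload is the big‑endian two's‑complement form described in the rationale. The helper names (`sign_extend_be`, `decode_decimal32`, `decode_decimal64`) are hypothetical, and treating an empty payload as zero is an assumption of this sketch rather than documented decoder behavior.

```rust
/// Sign-extend a big-endian two's-complement payload to exactly `N` bytes.
/// Payloads longer than `N` are rejected rather than truncated.
fn sign_extend_be<const N: usize>(raw: &[u8]) -> Result<[u8; N], String> {
    if raw.len() > N {
        return Err(format!("decimal payload of {} bytes exceeds {N} bytes", raw.len()));
    }
    // The top bit of the first byte is the sign; negative values are
    // padded with 0xFF, non-negative values with 0x00.
    // Assumption: an empty payload is treated as zero.
    let fill = match raw.first() {
        Some(&b) if b & 0x80 != 0 => 0xFF,
        _ => 0x00,
    };
    let mut out = [fill; N];
    out[N - raw.len()..].copy_from_slice(raw);
    Ok(out)
}

/// Decode an unscaled 32-bit decimal value from an Avro payload.
fn decode_decimal32(raw: &[u8]) -> Result<i32, String> {
    sign_extend_be::<4>(raw).map(i32::from_be_bytes)
}

/// Decode an unscaled 64-bit decimal value from an Avro payload.
fn decode_decimal64(raw: &[u8]) -> Result<i64, String> {
    sign_extend_be::<8>(raw).map(i64::from_be_bytes)
}
```

The decoded integer is the unscaled value; the scale from the schema is applied by the Arrow decimal type, so a payload of `[0x04, 0xD2]` (1234) with scale 2 represents 12.34.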

**`arrow-avro/src/reader/mod.rs` (tests)**

* Expand `test_decimal` to assert that:

  * bytes‑backed decimals with precision 4 map to `Decimal32`, and those
    with precision 10 map to `Decimal64`;
  * legacy fixed\[8] decimals map to `Decimal64`;
  * fixed\[16] decimals map to `Decimal128`.
* Add a nulls path test for bytes‑backed `Decimal32`.

# Are these changes tested?

Yes. Unit tests under `arrow-avro/src/reader/mod.rs` construct expected
`Decimal32Array`/`Decimal64Array`/`Decimal128Array` with
`with_precision_and_scale`, and compare against batches decoded from
Avro files (including legacy fixed and bytes‑backed cases). The tests
also exercise small batch sizes to cover buffering paths; a new Avro
data file is added for higher‑width decimals.

New Avro test file details:
- test/data/int256_decimal.avro # bytes logicalType:
decimal(precision=76, scale=10)
- test/data/fixed256_decimal.avro # fixed[32] logicalType:
decimal(precision=76, scale=10)
- test/data/fixed_length_decimal_legacy_32.avro # fixed[4] logicalType:
decimal(precision=9, scale=2)
- test/data/int128_decimal.avro # bytes logicalType:
decimal(precision=38, scale=2)

These new Avro test files were created using this script:
https://gist.github.com/jecsand838/3890349bdb33082a3e8fdcae3257eef7

There is also an arrow-testing PR for these new files:
apache/arrow-testing#112

# Are there any user-facing changes?

N/A due to `arrow-avro` not being public.