Skip to content

Conversation

@lriggs
Copy link
Contributor

@lriggs lriggs commented Nov 3, 2025

Rationale for this change

I believe there is a bug in LLVM 20 where a memory leak can occur on certain architectures.

My company's product uses Gandiva and I recently ran into a bug involving Gandiva, the ORC v2/LLJIT compiler and LLVM versions. I was not able to write a unit test to reproduce the bug outside of our code, though I believe there is a high potential for it to reproduce. It is a memory leak, eventually leading to a crash after running several thousand queries. It only occurred on x86 architectures.

What changes are included in this PR?

Adding a VCPKG overlay for LLVM 21 (since its not bundled into an official release yet).

The Mac CI builds (in arrow-java for example) use Brew which does not have a 'llvm@21' named package. Installing 'llvm@21' currently resolves to 'llvm' which only provides an llvm library which includes a runtime dependency on Z3. The @ versioned llvms in brew do not do this. This PR will update the mac builds to use vcpkg to install just LLVM in order to fix that and to also ensure parity between the Linux and Mac LLVM versions.

Is removing llvm from cpp/Brewfile a problem in this case? Maybe for local mac builds? I'm not sure what to leave here.

Are these changes tested?

TBD

Are there any user-facing changes?

No.

@github-actions
Copy link

github-actions bot commented Nov 3, 2025

Thanks for opening a pull request!

If this is not a minor PR. Could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

See also:

Copy link
Member

@raulcd raulcd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I understand lots of this setup is required just to test with LLVM 21, right?

We do have an extra CI test that runs LLVM 21.1.0 (or latest on conda) with the conda-cpp docker compose service. It can be run locally with archery or directly with docker compose:

$ UBUNTU=22.04 && archery docker run conda-cpp

We did a fix for LLVM 21.1.0 on Arrow 22.0.0 which was triggers by that CI job failing.
#47469

From what I understand you are saying there is a bug with LLVM 21. Can we reproduce the problem adding the test and validating on that docker image?

Your current proposed change seems to require a lot of patches / vcpkg overlays which to be honest add a little bit of maintenance and seems tangential to Arrow.

If we could add the test, validate the failure on the existing job and apply the patch without the rest that would be awesome.

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Nov 4, 2025
@raulcd
Copy link
Member

raulcd commented Nov 4, 2025

@github-actions crossbow submit test-conda-cpp

@github-actions
Copy link

github-actions bot commented Nov 4, 2025

Revision: f950e4d

Submitted crossbow builds: ursacomputing/crossbow @ actions-37f00f5d12

Task Status
test-conda-cpp GitHub Actions

@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Nov 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants