Skip to content

Conversation

@vadorovsky
Copy link
Member

@vadorovsky vadorovsky commented Oct 22, 2025

To allow easy builds of LLVM locally, move the logic of fetching the sources and executing CMake from the workflow YAML to an xtask command.


This change is Reviewable

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tamird reviewed 4 of 5 files at r1.
Reviewable status: 3 of 6 files reviewed, 5 unresolved discussions


xtask/src/main.rs line 128 at r1 (raw file):

fn download_llvm_sources() -> Result<Vec<u8>> {
    /// Branch of [Aya's LLVM fork][aya-llvm] to be used. It should correspond

what is this link? anyway see the comment in llvm.yml, this breaks the connection between cache key and actual version used.


Cargo.toml line 85 at r1 (raw file):

# dev deps
flate2 = { version = "1.1.4", default-features = false, features = [
    "rust_backend"

features should be in xtask, not here -- right? same for reqwest


.github/workflows/llvm.yml line 55 at r1 (raw file):

        if: steps.cache-llvm.outputs.cache-hit != 'true'
        run: |
          set -euxo pipefail

this is just one command now, so you can make it single-line and remove this


.github/workflows/llvm.yml line 56 at r1 (raw file):

        run: |
          set -euxo pipefail
          cargo xtask build-llvm --install-dir \

s/install-dir/install-prefix/ to match the cmake terminology


.github/workflows/llvm.yml line 52 at r1 (raw file):

          repository: aya-rs/llvm-project
          ref: ${{ steps.ls-remote.outputs.sha }}
          path: llvm-project

are you sure you don't want to leave this here? you've broken the causal link between the cache key and the llvm sha

Code quote:

      - name: Checkout LLVM Source
        if: steps.cache-llvm.outputs.cache-hit != 'true'
        uses: actions/checkout@v5
        with:
          repository: aya-rs/llvm-project
          ref: ${{ steps.ls-remote.outputs.sha }}
          path: llvm-project

@vadorovsky vadorovsky force-pushed the xtask-llvm branch 3 times, most recently from 4ba8caa to 8ad492f Compare October 22, 2025 16:51
Copy link
Member Author

@vadorovsky vadorovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 6 files reviewed, 5 unresolved discussions (waiting on @tamird)


Cargo.toml line 85 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

features should be in xtask, not here -- right? same for reqwest

Done.


.github/workflows/llvm.yml line 52 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

are you sure you don't want to leave this here? you've broken the causal link between the cache key and the llvm sha

Good call. After giving it a thought I decided to split this thing into two subcommands:

  • cargo xtask llvm download-sources that fetches the git repo (for the convenience for quick local testing or for people to build for exotic arches)
  • cargo xtask llvm build that requires the source

And we can use just the latter in the CI. How does it sound?


.github/workflows/llvm.yml line 55 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

this is just one command now, so you can make it single-line and remove this

I removed this line, but I still want to keep the thing multi-line, so I can split the long command.


.github/workflows/llvm.yml line 56 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

s/install-dir/install-prefix/ to match the cmake terminology

Done.


xtask/src/main.rs line 128 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

what is this link? anyway see the comment in llvm.yml, this breaks the connection between cache key and actual version used.

Removed in favor of a default value in the download-sources command.

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tamird reviewed 4 of 5 files at r6.
Reviewable status: 3 of 6 files reviewed, 5 unresolved discussions (waiting on @vadorovsky)


.github/workflows/llvm.yml line 52 at r1 (raw file):

Previously, vadorovsky (Michal R) wrote…

Good call. After giving it a thought I decided to split this thing into two subcommands:

  • cargo xtask llvm download-sources that fetches the git repo (for the convenience for quick local testing or for people to build for exotic arches)
  • cargo xtask llvm build that requires the source

And we can use just the latter in the CI. How does it sound?

I think that's fine in principle but I am struggling to understand why we want this thing to manage the downloads for us. There are two ways to download llvm:

  • git shallow clone
  • download a tarball

It seems strictly worse to put these into rust because it forces us to compile TLS into xtask, and also precludes us from doing ETag-based caching. I'd just punt this for now, unless there is a particular use case?


.github/workflows/llvm.yml line 55 at r1 (raw file):

Previously, vadorovsky (Michal R) wrote…

I removed this line, but I still want to keep the thing multi-line, so I can split the long command.

In that case you might need set -euxo pipefail or else i'm not sure this will exit non-zero on failure


xtask/src/main.rs line 153 at r6 (raw file):

}

fn unpack_llvm_sources(src_dir: &Path, data: Vec<u8>) -> Result<()> {

can we avoid single-use functions? they force arbitrary interfaces that sometimes pessimize (e.g. here you force vector which isn't obviously optimal)


xtask/src/main.rs line 157 at r6 (raw file):

    let mut archive = Archive::new(gz_decoder);

    let pb = progress_bar_spinner("{spinner:.green} [{elapsed_precise}] unpacking...");

what is elapsed_precise?


xtask/src/main.rs line 186 at r6 (raw file):

    let client = reqwest::blocking::Client::new();

    let url = format!("https://github.com/{git_repo}/archive/refs/heads/{git_ref}.tar.gz");

hm. can you help me understand why this is preferable to using git to fetch this (using a shallow clone)?

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tamird reviewed 1 of 5 files at r6, 1 of 1 files at r7, 1 of 1 files at r8, 1 of 1 files at r9, 1 of 1 files at r10, all commit messages.
Reviewable status: all files reviewed, 16 unresolved discussions (waiting on @vadorovsky)


README.md line 43 at r9 (raw file):
What is this paragraph saying? Maybe you want to say

You may need to build LLVM from source if a package is not offered by your distribution.

FWIW this writeup is very linux-centric, but building on macOS is totally normal and I do it all the time :)

Code quote:

Different operating systems and their distributions might or might not provide
LLVM packages in recent version. If there is no package for your system, you
might want to try other methods.

xtask/src/main.rs line 177 at r6 (raw file):

}

fn download_llvm_sources(options: DownloadLlvmSources) -> Result<()> {

note to self: have not reviewed this because i don't think we should do it here


xtask/src/main.rs line 216 at r6 (raw file):

}

fn execute_command(cmd: &mut Command) -> Result<()> {

hmm. why do this rather than just let the command inherit stdout and stderr?


xtask/src/main.rs line 255 at r6 (raw file):

/// This whole dance would be simpler if CMake supported
/// `CMAKE_INSTALL_MODE=MOVE`.
fn move_absolute_symlink_targets(install_prefix: &Path) -> Result<()> {

can we inline this below? it's tightly coupled with below, and even the variable name install_prefix suggests it's not very general


xtask/src/main.rs line 259 at r6 (raw file):

        .follow_links(false)
        .into_iter()
        .filter_map(Result::ok)

why are we skipping errors?


xtask/src/main.rs line 260 at r6 (raw file):

        .into_iter()
        .filter_map(Result::ok)
        .filter(|e| e.file_type().is_symlink())

just do a continue below; half combinator and half loop means it's maximally hard to read


xtask/src/main.rs line 266 at r6 (raw file):

            .with_context(|| format!("failed to read the link {}", link_path.display()))?;
        if target.is_absolute() {
            // Move the file the link points to into the location of the symlink

i think you can remove this


xtask/src/main.rs line 286 at r6 (raw file):

    } = options;

    let build_dir = src_dir.join("build");

previously we were doing an out-of-source build, now we're putting the build inside the source tree. why?


xtask/src/main.rs line 289 at r6 (raw file):

    let mut install_arg = OsString::from("-DCMAKE_INSTALL_PREFIX=");
    install_arg.push(install_prefix.as_os_str());

is this guaranteed to be absolute? does it need to be? if not, what is it relative to?


README.md line 53 at r9 (raw file):

#### Building LLVM from source

LLVM can be built from source using the `llvm` xtask, included in bpf-linker

"using the `xtask llvm`` subcommand"


README.md line 70 at r9 (raw file):

After that, bpf-linker can be build with the LLVM_SYS_211_PREFIX environment

s/build/built/

Copy link
Member Author

@vadorovsky vadorovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 6 files reviewed, 16 unresolved discussions (waiting on @tamird)


README.md line 43 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

What is this paragraph saying? Maybe you want to say

You may need to build LLVM from source if a package is not offered by your distribution.

FWIW this writeup is very linux-centric, but building on macOS is totally normal and I do it all the time :)

I didn't mean it to be Linux-centric, hence I started with "Different operating systems". :) How about this wording?

Code snippet:

You may need to build LLVM from source if a recent version is not available
through any package manager that supports your platform.

README.md line 53 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

"using the `xtask llvm`` subcommand"

Done, with the new subcommand name


README.md line 70 at r9 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

s/build/built/

Done.


.github/workflows/llvm.yml line 52 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

I think that's fine in principle but I am struggling to understand why we want this thing to manage the downloads for us. There are two ways to download llvm:

  • git shallow clone
  • download a tarball

It seems strictly worse to put these into rust because it forces us to compile TLS into xtask, and also precludes us from doing ETag-based caching. I'd just punt this for now, unless there is a particular use case?

Alright, let's remove it. I documented how to clone our LLVM repo in README.


.github/workflows/llvm.yml line 55 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

In that case you might need set -euxo pipefail or else i'm not sure this will exit non-zero on failure

Done.


xtask/src/main.rs line 153 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can we avoid single-use functions? they force arbitrary interfaces that sometimes pessimize (e.g. here you force vector which isn't obviously optimal)

Done.

Usually I prefer having a single function for single task, but given that I removed the download functionality now, the only separate function left was the one for moving symlinks.


xtask/src/main.rs line 157 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

what is elapsed_precise?

A template keyword from indicatif. I removed this code now though.


xtask/src/main.rs line 177 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

note to self: have not reviewed this because i don't think we should do it here

Done, the code is removed


xtask/src/main.rs line 186 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

hm. can you help me understand why this is preferable to using git to fetch this (using a shallow clone)?

Removed


xtask/src/main.rs line 216 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

hmm. why do this rather than just let the command inherit stdout and stderr?

That's what we do in aya-build, but given that is xtask is simple and it's only job is to run CMake, we don't need any custom logging. Removed.


xtask/src/main.rs line 255 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can we inline this below? it's tightly coupled with below, and even the variable name install_prefix suggests it's not very general

Done


xtask/src/main.rs line 259 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

why are we skipping errors?

Done, I'm handling the error now


xtask/src/main.rs line 260 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

just do a continue below; half combinator and half loop means it's maximally hard to read

Done


xtask/src/main.rs line 266 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

i think you can remove this

You mean the target.is_absolute() part? It's there for a reason. Removing that part triggers this error:

Error: failed to move the target file libRemarks.so.22.0git to the location of the symlink llvm-install/lib/libRemarks.so

Caused by:
    No such file or directory (os error 2)

And that's because some binaries and library files have relative symlinks to each other. E.g. libRemarks.so.22.0git symlinks to libRemarks.so relatively.

So I think the right thing to do is to move absolute symlinks (the actual binaries/artifacts) and leave the relative symlinks alone.


xtask/src/main.rs line 286 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

previously we were doing an out-of-source build, now we're putting the build inside the source tree. why?

No particular reason. I added an argument for build dir now. Let me know if you have any other ideas.


xtask/src/main.rs line 289 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

is this guaranteed to be absolute? does it need to be? if not, what is it relative to?

It's relative to the current directory you are in - which usually would be the top dir of bpf-linker repo, where you execute cargo xtask. I think that's fine, when I type cargo xtask build-llvm --install-dir llvm-install, my expectation would be llvm-install dir being crated in the current dir, and that is happening now.

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tamird reviewed 6 of 6 files at r11, 1 of 1 files at r12, 1 of 1 files at r13, 1 of 1 files at r14, 1 of 1 files at r15, all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @vadorovsky)


xtask/src/main.rs line 216 at r6 (raw file):

Previously, vadorovsky (Michal R) wrote…

That's what we do in aya-build, but given that is xtask is simple and it's only job is to run CMake, we don't need any custom logging. Removed.

Ack, I think in aya-build we do this because we are spawning the command, which we do not do here.


xtask/src/main.rs line 266 at r6 (raw file):

Previously, vadorovsky (Michal R) wrote…

You mean the target.is_absolute() part? It's there for a reason. Removing that part triggers this error:

Error: failed to move the target file libRemarks.so.22.0git to the location of the symlink llvm-install/lib/libRemarks.so

Caused by:
    No such file or directory (os error 2)

And that's because some binaries and library files have relative symlinks to each other. E.g. libRemarks.so.22.0git symlinks to libRemarks.so relatively.

So I think the right thing to do is to move absolute symlinks (the actual binaries/artifacts) and leave the relative symlinks alone.

I meant the comment.


README.md line 55 at r14 (raw file):

in bpf-linker sources.

First, the LLVM sources offered in [our fork][llvm-fork] need to be cloned,

s/need/needs/


README.md line 60 at r14 (raw file):

```sh
git clone -b rustc/21.1-2025-08-01 https://github.com/aya-rs/llvm-project ~/llvm-project

eh, why suggest ~?


xtask/src/main.rs line 113 at r11 (raw file):

    } = options;

    // let build_dir = src_dir.join("build");

remove this comment?


xtask/src/main.rs line 173 at r11 (raw file):

    for entry in WalkDir::new(&install_prefix)
        .follow_links(false)
        .into_iter()

just curious: do you need this into_iter call?


xtask/src/main.rs line 189 at r11 (raw file):

            .with_context(|| format!("failed to read the link {}", link_path.display()))?;
        if target.is_absolute() {
            // Move the file the link points to into the location of the symlink

as above: i think you can remove the comment

To allow easy builds of LLVM locally, move the logic of fetching the
sources and executing CMake from the workflow YAML to an xtask command.
* Make the main `BPF Linker` the only H1 header.
* Keep all the other sections as H2.
* Adjust the levers of subsections to match the new layout.
* Avoid duplicating the `Usage` title, name the latter section
  `CLI syntax`.
Without rustc-llvm-proxy, providing a feature determining the LLVM
version is mandatory.
Copy link
Member Author

@vadorovsky vadorovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 3 of 6 files reviewed, 6 unresolved discussions (waiting on @tamird)


README.md line 55 at r14 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

s/need/needs/

"Sources" are plural.

Screenshot_2025-10-23_16-01-03.png


README.md line 60 at r14 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

eh, why suggest ~?

Good call, I also added ./ everywhere for clarity.


xtask/src/main.rs line 266 at r6 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

I meant the comment.

Done


xtask/src/main.rs line 113 at r11 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

remove this comment?

Done


xtask/src/main.rs line 173 at r11 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

just curious: do you need this into_iter call?

I don't, removing it works fine.


xtask/src/main.rs line 189 at r11 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

as above: i think you can remove the comment

Done

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tamird reviewed 3 of 3 files at r16, 1 of 1 files at r17, 1 of 1 files at r18, 1 of 1 files at r19, 1 of 1 files at r20, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @vadorovsky)


README.md line 55 at r14 (raw file):

Previously, vadorovsky (Michal R) wrote…

"Sources" are plural.

Screenshot_2025-10-23_16-01-03.png

Ah, I misread it as "source"

@vadorovsky vadorovsky marked this pull request as ready for review October 23, 2025 14:22
@vadorovsky vadorovsky merged commit 35cfe2b into aya-rs:main Oct 23, 2025
9 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants