Skip to content

Conversation

galargh
Copy link
Contributor

@galargh galargh commented Jun 28, 2025

Related to filecoin-project/rust-fil-proofs#1775

Similar to filecoin-project/rust-fil-proofs#1785

This PR enables the job that requires running on a machine with a GPU. It will run on a g6e.2xlarge runner.

@galargh
Copy link
Contributor Author

galargh commented Jun 28, 2025

I see that not only the job that I'm migrating to self-hosted runners, but also Clippy and Test in release mode on MacOS have started failing now with the same error:

error[E0282]: type annotations needed
   --> src/seal.rs:339:5
    |
339 |     filecoin_proofs_v1::clear_cache(cache_path)
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot infer type of the type parameter `Tree` declared on the function `clear_cache`
    |
help: consider specifying the generic argument
    |
339 |     filecoin_proofs_v1::clear_cache::<Tree>(cache_path)
    |                                    ++++++++

error[E0282]: type annotations needed
   --> src/seal.rs:426:5
    |
426 |     filecoin_proofs_v1::clear_synthetic_proofs(cache_path)
    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot infer type of the type parameter `Tree` declared on the function `clear_synthetic_proofs`
    |
help: consider specifying the generic argument
    |
426 |     filecoin_proofs_v1::clear_synthetic_proofs::<Tree>(cache_path)
    |                                               ++++++++

Last month they were still passing on the default branch: https://github.com/filecoin-project/rust-filecoin-proofs-api/actions/runs/15213629917

@vmx Do you have an idea why that might be?

cc @BigLep

@vmx
Copy link
Contributor

vmx commented Jun 30, 2025

Things should work, but the current master patches the rust-fil-proofs crates to point to the master branch:

[patch.crates-io]
filecoin-proofs = { git = "https://github.com/filecoin-project/rust-fil-proofs" }
fr32 = { git = "https://github.com/filecoin-project/rust-fil-proofs" }
filecoin-hashers = { git = "https://github.com/filecoin-project/rust-fil-proofs" }
storage-proofs-core = { git = "https://github.com/filecoin-project/rust-fil-proofs" }
.

As there was a release of rust-fil-proofs, this repo should be updated to use the released version. So I suggest that the current maintainers update to those versions and the we'll see if things still fail.

@galargh
Copy link
Contributor Author

galargh commented Jul 7, 2025

I added a release commit here - c5246a9 - and the workflow now passes.

Now, the question is, what the release process is that we should follow? I see that the previous tags were created manually - https://github.com/filecoin-project/rust-filecoin-proofs-api/tags

@galargh galargh marked this pull request as ready for review July 7, 2025 13:42
@galargh galargh requested review from Copilot, vmx and BigLep July 7, 2025 13:42
Copilot

This comment was marked as outdated.

@BigLep
Copy link
Member

BigLep commented Jul 7, 2025

Thanks for the updates @galargh.

@galargh : Would it maybe make sense to break this into two PRs (one for the release, and one for the CI adjustment)?

@vmx : are there any steps we should follow for making releases (e.g., any cargo release commands like with rust-fil-proofs)?

@BigLep BigLep requested a review from rvagg July 7, 2025 16:56
@BigLep BigLep added this to FilOz Jul 7, 2025
@github-project-automation github-project-automation bot moved this to 📌 Triage in FilOz Jul 7, 2025
@BigLep BigLep moved this from 📌 Triage to ⌨️ In Progress in FilOz Jul 7, 2025
@rvagg
Copy link
Member

rvagg commented Jul 8, 2025

The release commit in here had me confused, but I think we're just trying to do two separate things at once? I don't think I mind as long as it's not squash merged, but it would have been clearer for reviewing if they were separate PRs.

My only question is about the nvidia drivers--are they already installed on the standard GitHub machines but not on the current AMI that we have access to?

@vmx
Copy link
Contributor

vmx commented Jul 8, 2025

I would do the release separately. It's done similarly as for rust-fil-proofs. It's even simpler. It's a matter of cargo release major. It would do the version bump in the Cargo.toml, push that tags etc.

@galargh
Copy link
Contributor Author

galargh commented Jul 15, 2025

The release commit in here had me confused, but I think we're just trying to do two separate things at once? I don't think I mind as long as it's not squash merged, but it would have been clearer for reviewing if they were separate PRs.

Yes, that was the plan all along. I just wanted to see whether the release fixes the build on self-hosted runners. Here's the separate PR that I just created: #109

My only question is about the nvidia drivers--are they already installed on the standard GitHub machines but not on the current AMI that we have access to?

The hosted runners do not have GPUs. On self-hosted ones, we do have dedicated GPUs - that's what we install the drivers for.

@BigLep BigLep moved this from ⌨️ In Progress to 🔎 Awaiting Review in FilOz Jul 15, 2025
@BigLep
Copy link
Member

BigLep commented Jul 22, 2025

@galargh : It looks like the content of #109 is still in this PR. Once it's removed here, I tink we can approve and get this merged.

@BigLep BigLep moved this from 🔎 Awaiting Review to ⌨️ In Progress in FilOz Jul 22, 2025
@galargh galargh marked this pull request as draft July 23, 2025 13:57
@galargh
Copy link
Contributor Author

galargh commented Jul 23, 2025

@galargh : It looks like the content of #109 is still in this PR. Once it's removed here, I tink we can approve and get this merged.

I moved this one back to draft for now as it's effectively blocked until we get the release out.

@galargh galargh marked this pull request as ready for review August 10, 2025 16:10
@BigLep BigLep requested a review from Copilot August 10, 2025 22:33
@BigLep BigLep moved this from ⌨️ In Progress to 🔎 Awaiting Review in FilOz Aug 10, 2025
Copilot

This comment was marked as outdated.

Comment on lines +72 to +74
- uses: dtolnay/rust-toolchain@21dc36fb71dd22e3317045c0c31a3f4249868b17
with:
toolchain: 1.83
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this kind of sucks that we can't just use the rust-toolchain file for versioning, but note from https://github.com/dtolnay/rust-toolchain?tab=readme-ov-file#inputs about versioning:

Rustup toolchain specifier e.g. stable, nightly, 1.42.0, nightly-2022-01-01. Important: the default is to match the @Rev as described above. When passing an explicit toolchain as an input instead of @Rev, you'll want to use "dtolnay/rust-toolchain@master" as the revision of the action.

i.e. it wants you to use dtolnay/[email protected] instead.

(I also notice other poeple are annoyed by this gap).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd stick with using a pinned sha of rust-toolchain despite the suggestion from the action authors as this is a more secure alternative.

I do like the idea of using the version from the toml config file! I'll check it out and update where applicable if it looks solid.

@BigLep BigLep requested review from Copilot and rvagg August 19, 2025 05:03
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR enables GPU testing on self-hosted AWS runners by migrating the test job from GitHub-hosted Ubuntu runners to self-hosted GPU-enabled machines. It also includes some minor cleanup and formatting improvements.

  • Migrates test job to run on self-hosted runners with GPU support (g6e.2xlarge)
  • Adds CUDA driver installation steps for the self-hosted environment
  • Updates package installation commands and adds Rust toolchain setup for self-hosted runners

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@github-project-automation github-project-automation bot moved this from 🔎 Awaiting Review to ✔️ Approved by reviewer in FilOz Aug 20, 2025
@galargh galargh merged commit 02fa028 into master Aug 20, 2025
4 checks passed
@github-project-automation github-project-automation bot moved this from ✔️ Approved by reviewer to 🎉 Done in FilOz Aug 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🎉 Done
Development

Successfully merging this pull request may close these issues.

4 participants