Conversation

@DJMcNab
Member

@DJMcNab DJMcNab commented Oct 6, 2025

Fixes #74

A couple of notes:

  1. To keep this simple, I removed simd_dispatch in this PR. As such, I don't want to land this before v0.3.0 is released (see #simd > v0.3.0)
  2. The cfg variants in here are gnarly; I believe that's unavoidable to express what we want. We do need to be clear that we are doing runtime dispatch properly
  3. The automated testing story of this PR is an open question. I have a manual test in the new check_targets.sh script at the repository root (run it with sh ./check_targets.sh). I want to defer running this in CI; I suspect it would be very expensive.

This does not address #92. It does, however, add a force_support_fallback feature, so the fallback functionality is not removed entirely.

@DJMcNab
Member Author

DJMcNab commented Oct 6, 2025

Still outstanding is the changelog. I'm planning to leave this until after we get v0.3.0 out the door.

@DJMcNab DJMcNab requested a review from Ralith October 14, 2025 17:01
@ajakubowicz-canva ajakubowicz-canva self-requested a review October 15, 2025 22:58
Collaborator

@ajakubowicz-canva ajakubowicz-canva left a comment

This looks fantastic - and reading through this, I really like it.

As a side note, this is extremely hard to test haha. I ran your script and also poked around the compiled result but I cannot statically determine if all code paths are behaving as expected. However, I think that this is overall a great improvement.

Thank you!

Edit: I am least confident about the x86 changes. Those might be good candidates for a second look from someone else.

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[cfg(all(
any(target_arch = "x86", target_arch = "x86_64"),
not(all(target_feature = "avx2", target_feature = "fma"))
Collaborator

Just to clarify - this line ensures that if avx2 and fma are not both detected we want to fall through to the Avx2 case?

Member Author

This says "if our target doesn't statically support [both avx2 and fma], then we might be on a machine which only has SSE4.2 support. As such, we need to check if the currently detected Level is SSE4.2."

That is, we only need to support the SSE4.2 level at all if the target doesn't statically guarantee the better level.

This is more closely documented on Level - I'll add a comment pointing there.
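
The gating described in this reply can be sketched roughly as follows. This is a simplified, hypothetical model rather than the crate's actual code: the enum is reduced to two x86 variants, and `select_level`, `name`, and the two boolean parameters (standing in for runtime feature detection) are assumptions made for illustration.

```rust
/// Hypothetical, simplified model of the level enum discussed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Level {
    /// Only exists when the compilation target does not already guarantee
    /// AVX2 and FMA; otherwise this level could never be the best choice.
    #[cfg(not(all(target_feature = "avx2", target_feature = "fma")))]
    Sse4_2,
    Avx2,
}

impl Level {
    /// Name of the instruction set this level targets.
    pub fn name(self) -> &'static str {
        match self {
            // The arm is gated with the same `cfg` as the variant it matches.
            #[cfg(not(all(target_feature = "avx2", target_feature = "fma")))]
            Level::Sse4_2 => "sse4.2",
            Level::Avx2 => "avx2",
        }
    }
}

/// Picks a level from (hypothetical) runtime detection results.
pub fn select_level(runtime_avx2_fma: bool, runtime_sse4_2: bool) -> Option<Level> {
    // If the target statically guarantees AVX2 + FMA, the runtime result is irrelevant.
    if cfg!(all(target_feature = "avx2", target_feature = "fma")) || runtime_avx2_fma {
        return Some(Level::Avx2);
    }
    // This block is compiled out entirely on targets that guarantee AVX2 + FMA.
    #[cfg(not(all(target_feature = "avx2", target_feature = "fma")))]
    {
        if runtime_sse4_2 {
            return Some(Level::Sse4_2);
        }
    }
    // Avoids an unused-parameter warning when the block above is compiled out.
    let _ = runtime_sse4_2;
    None
}
```

Building the sketch with `-C target-feature=+avx2,+fma` removes the `Sse4_2` variant and every arm that mentions it, which is the effect the `cfg` in the diff is after.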

Comment on lines +131 to +134
// This macro turns whether the `force_support_fallback` feature is enabled into a boolean literal
// in `dispatch`, which allows it to be used correctly cross-crate.
// This trickery is required because macros are expanded in the context of the calling crate, including for
// evaluating `cfg`s.
Collaborator

This pattern is so nice!

Member Author

Yeah - I thought I was going to have to make the macro be truly awful for it to work, but the "inversion of control" paradigm luckily worked perfectly here.
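
For readers unfamiliar with the pattern, here is a minimal single-file sketch of the "inversion of control" idea, under stated assumptions. The macro names `with_fallback_flag`, `dispatch_inner`, and `dispatch_sketch`, and the function `compiled_arms`, are hypothetical and are not the names used in this PR. In a real two-crate setup the macros would be exported from the library with `#[macro_export]`. The key point is that the `cfg` sits on the macro definitions, so it is resolved when the defining crate is compiled, and the calling crate only ever sees a `true` or `false` token.

```rust
// Defined twice; exactly one definition survives, depending on the features
// of the crate that *defines* the macro.
#[cfg(feature = "force_support_fallback")]
macro_rules! with_fallback_flag {
    ($callback:ident!($($args:tt)*)) => { $callback!(true, $($args)*) };
}
#[cfg(not(feature = "force_support_fallback"))]
macro_rules! with_fallback_flag {
    ($callback:ident!($($args:tt)*)) => { $callback!(false, $($args)*) };
}

// Receives the flag as a literal token and selects an arm by pattern matching,
// so no `cfg` has to be evaluated at the call site.
macro_rules! dispatch_inner {
    // Feature enabled in the defining crate: keep the fallback arm.
    (true, $simd:expr, $fallback:expr) => { [Some($simd), Some($fallback)] };
    // Feature disabled: the fallback arm is dropped.
    (false, $simd:expr, $fallback:expr) => { [Some($simd), None] };
}

// User-facing entry point: hands control to `with_fallback_flag!`, which
// calls back into `dispatch_inner!` with the flag prepended.
macro_rules! dispatch_sketch {
    ($simd:expr, $fallback:expr) => {
        with_fallback_flag!(dispatch_inner!($simd, $fallback))
    };
}

/// Reports which arms survived expansion in this build.
pub fn compiled_arms() -> [Option<&'static str>; 2] {
    dispatch_sketch!("simd", "fallback")
}
```

When compiled with plain `rustc`, which passes no `--cfg feature=...`, the second definition of `with_fallback_flag!` is the one that survives, so only the SIMD arm is kept. The compiler may warn about an unexpected `cfg` name in that setup, since no Cargo manifest declares the feature.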

/// Note that this is unaffected by the `force-support-fallback` feature.
/// Instead, you should use [`Level::fallback`] if you require the fallback level.
pub const fn baseline() -> Self {
// TODO: How do we possibly test that this method works in all cases?
Collaborator

@ajakubowicz-canva ajakubowicz-canva Oct 16, 2025

Agreed! But from carefully reading through this PR the various cfg cases look correct to me.

Member Author

I've added a pointer to check_targets.sh as well.

Comment on lines 336 to 338
// TODO: Level::Avx2(avx2) => Some(...)
#[cfg(not(all(target_feature = "avx2", target_feature = "fma")))]
Level::Sse4_2(sse42) => Some(sse42),
Collaborator

Can you elaborate on the TODO? (just in the comment)
Can it be deleted?

Member Author

I've fixed the issue. I didn't want to write a justification here whilst #108 was still in flight, but that's now addressed.

@DJMcNab
Member Author

DJMcNab commented Oct 16, 2025

I'm going to merge this now; I've pushed a few minor changes in 69ee7f0 (plus a fixup in adcb7f1), but I don't think those should block forward progress here.

@DJMcNab DJMcNab added this pull request to the merge queue Oct 16, 2025
Merged via the queue into linebender:main with commit a07870f Oct 16, 2025
18 checks passed
@DJMcNab DJMcNab deleted the baseline_level branch October 16, 2025 08:56
@ajakubowicz-canva
Collaborator

LGTM! Great stuff!

Development

Successfully merging this pull request may close these issues.

Avoid compiling Level::Fallback if the compilation target requires our target features

3 participants