Silic0nS0ldier
Contributor

@Silic0nS0ldier Silic0nS0ldier commented Sep 29, 2025

The flag toolchain_resolution_debug accepts regex (technically a superset supporting -, + and , in specific contexts) which the missing toolchains message does not handle.

This is a problem for toolchain types with labels like @@rules_shell+//shell:toolchain_type, because + (match 1 or more of the preceding token) cannot be used directly. Such labels are highly likely to be encountered, as + is in just about every canonical repository name.
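To illustrate the failure mode, here is a minimal standalone sketch using only java.util.regex. The class and helper names are hypothetical, and the use of find() is an assumption made for the demonstration; how Bazel applies the pattern internally is not shown here.

```java
import java.util.regex.Pattern;

final class UnquotedLabelDemo {
    // Returns true if `regex`, compiled as a Java pattern, is found anywhere in `text`.
    static boolean foundIn(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String label = "@@rules_shell+//shell:toolchain_type";

        // As a regex, "l+" means "one or more 'l'", so the pattern expects
        // "/" right after "shell" and never matches the literal '+'.
        System.out.println("unquoted: " + foundIn(label, label)); // prints "unquoted: false"

        // Pattern.quote turns the whole label into a literal sequence.
        System.out.println("quoted:   " + foundIn(Pattern.quote(label), label)); // prints "quoted:   true"
    }
}
```

The label does not even match itself when passed through unquoted, which is why the suggested flag value has to be escaped.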

To solve this problem, each label string is run through Pattern::quote, which wraps the value in \Q and \E to mark it as a literal sequence (a construct supported by Perl-style regex engines such as the one in the Java standard library).

e.g.

No matching toolchains found for types:
  @@rules_shell+//shell:toolchain_type
To debug, rerun with --toolchain_resolution_debug='\Q@@rules_shell+//shell:toolchain_type\E'
For more information on platforms or toolchains see https://bazel.build/concepts/platforms-intro.

Note that Pattern::quote always wraps the input, even if the input consists entirely of literal characters (fun fact: \ is a valid literal except in JS and PHP regex flavors). To avoid unnecessary quoting, a naive check for the characters \{[(^$.?+*| determines whether quoting is necessary.
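A small sketch of the "always wraps" behaviour, assuming nothing beyond the standard library. The class name is hypothetical, and the second label is only an illustrative value with no regex metacharacters.

```java
import java.util.regex.Pattern;

final class AlwaysQuotedDemo {
    public static void main(String[] args) {
        // A label containing '+', which genuinely needs quoting.
        System.out.println(Pattern.quote("@@rules_shell+//shell:toolchain_type"));
        // prints \Q@@rules_shell+//shell:toolchain_type\E

        // A label with no special characters is wrapped all the same.
        System.out.println(Pattern.quote("@@bazel_tools//tools/jdk:toolchain_type"));
        // prints \Q@@bazel_tools//tools/jdk:toolchain_type\E
    }
}
```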

@github-actions github-actions bot added the awaiting-review PR is awaiting review from an assigned reviewer label Sep 29, 2025
@Silic0nS0ldier
Contributor Author

--toolchain_resolution_debug should for most be a pretty niche flag but if optimising for UX is desired I can put together a basic "is Q-E needed" check. I've chosen not to in the original implementation as it will be more code to maintain (and knowing what regex is like, I'd probably overlook something and allow an edge case through).

@fmeum fmeum requested a review from gregestren September 29, 2025 12:48
@gregestren gregestren added the team-Configurability platforms, toolchains, cquery, select(), config transitions label Sep 29, 2025
@gregestren
Contributor

--toolchain_resolution_debug should for most be a pretty niche flag but if optimising for UX is desired

I'll respond properly to this PR when I get a moment later today. But I just want to express I'm strongly in favor of optimizing the UX. This flag spits out a ton of information that's both a) extremely useful when you need it and b) extremely dense and hard to follow.

Ideally anyone should be able to use it to diagnose their toolchain registration setup. This flag is still very much the realm of power users now. But the info it provides is broadly useful to everyone.

@gregestren
Contributor

+ is in just about every canonical repository name.

I'll plead ignorance but why is this?

Note that Pattern::quote always wraps the input, even if the input entirely consists of literal characters

Are you saying the suggestion message will always add \Q \E even for simple labels?

I can put together a basic "is Q-E needed" check.

Is that referencing my above question?

@Silic0nS0ldier
Contributor Author

Silic0nS0ldier commented Sep 30, 2025

+ is in just about every canonical repository name.

I'll plead ignorance but why is this?

For the benefit of all potential readers I'll include some background.

Bzlmod (opt-in since Bazel 6, default in Bazel 7) introduced the concept of "canonical" and "apparent" repository names in order to avoid collisions across module boundaries. e.g.

@@rules_foo+foo_ext+foo_repo
  ^         ^       ^ given name i.e. `name = "foo_repo"`
  ^         ^ module extension repository was defined within
  ^ name of module containing module extension i.e. `name = "rules_foo"`

Other repositories that depend on this repository can reference it using the apparent repository name @foo_repo. The canonical form can also be used, but as an implementation detail it should never appear in committed sources (it bypasses the implicit dependency guards afforded by apparent repository name mapping, and it can change between releases, as with the move from ~ to + as the delimiter).

In the context of this PR, most toolchains users will be exposed to such labels via third party modules. e.g.

@rules_shell//shell:toolchain_type
@rules_nodejs//nodejs:toolchain_type
@bazel_tools//tools/jdk:toolchain_type

--toolchain_resolution_debug's regex matching deals exclusively with canonicalised labels, so the advice in the default missing toolchain error must print the canonical form (the missing toolchain could also come from a transitive dependency). Toolchains in the main repository may be an exception, going off Bazel's own tests, although I have a hunch that may just be a defective test.

That is why + comes up so often (the "just about every" is down to me being unsure about the main module). Special regex characters can also come from user input as labels accept characters which have special meaning for regex.

Note that Pattern::quote always wraps the input, even if the input entirely consists of literal characters

Are you saying the suggestion message will always add \Q \E even for simple labels?

I can put together a basic "is Q-E needed" check.

Is that referencing my above question?

For (1) yes, even for simple labels. For (2), yes it does reference your above question.

From a quick look at https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/util/regex/Pattern.html, checking for the presence of the following characters should suffice for an "is this necessary" check: \{[(^$.?+*| (there are technically more, but I'm fairly sure they only have special meaning within {[()
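As a rough sketch of what such a check could look like (not the code that was eventually merged): the class, constant, and method names below are hypothetical, and the character set is simply the one listed above.

```java
import java.util.regex.Pattern;

final class DebugRegexSuggestion {
    // Characters proposed above as triggers for quoting (assumed sufficient, not exhaustive).
    private static final String REGEX_SPECIAL_CHARS = "\\{[(^$.?+*|";

    // True if the label contains any character from the set above.
    static boolean needsQuoting(String label) {
        return label.chars().anyMatch(c -> REGEX_SPECIAL_CHARS.indexOf(c) >= 0);
    }

    // Wraps the label in \Q...\E only when the naive check says it is needed.
    static String toDebugRegex(String label) {
        return needsQuoting(label) ? Pattern.quote(label) : label;
    }

    public static void main(String[] args) {
        System.out.println(toDebugRegex("@@bazel_tools//tools/jdk:toolchain_type"));
        // prints @@bazel_tools//tools/jdk:toolchain_type
        System.out.println(toDebugRegex("@@rules_shell+//shell:toolchain_type"));
        // prints \Q@@rules_shell+//shell:toolchain_type\E
    }
}
```

Erring towards quoting keeps the check safe: a false positive only costs a slightly noisier suggestion, whereas a false negative would produce a flag value that fails to match.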

@gregestren
Contributor

+ is in just about every canonical repository name.

I'll plead ignorance but why is this?

For the benefit of all potential readers I'll include some background.

Bzlmod (opt-in since Bazel 6, default in Bazel 7) introduced the concept of "canonical" and "apparent" repository names in order to avoid collisions across module boundaries. e.g.

@@rules_foo+foo_ext+foo_repo
  ^         ^       ^ given name i.e. `name = "foo_repo"`
  ^         ^ module extension repository was defined within
  ^ name of module containing module extension i.e. `name = "rules_foo"`

Oh, I see. Extensions add the +: https://bazel.build/external/extension#repository_names_and_visibility.

Repos that aren't generated from extensions wouldn't generally have +, then? Isn't that also a common case?

Not trying to be nitpicky. I just want to make sure I understand the space right.

From a quick look at https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/util/regex/Pattern.html checking for the presence of the following characters should suffice for an "is this necessary" check. \{[(^$.?+*| (there is technically more but I'm fairly sure they only have special meaning within {[()

I support adding this check, and only suggesting that wrapping when needed.

IMO that offers a gentler interface to this flag for simple cases.

@Wyverald
Member

Repos that aren't generated from extensions wouldn't generally have +, then?

They do. https://bazel.build/external/module#repository_names_and_strict_deps

@Silic0nS0ldier Silic0nS0ldier force-pushed the quote-toolchain-resolution-debug-suggestion branch from 132edc7 to 309ad8e Compare October 2, 2025 11:35
@Silic0nS0ldier Silic0nS0ldier force-pushed the quote-toolchain-resolution-debug-suggestion branch from 309ad8e to 4cf7f09 Compare October 2, 2025 11:37
@Silic0nS0ldier
Copy link
Contributor Author

Silic0nS0ldier commented Oct 2, 2025

Conditional regex logic has been added, along with a test case that specifically targets the exception (existing test cases focus on resolution, overkill for this change).

@gregestren gregestren added awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally and removed awaiting-review PR is awaiting review from an assigned reviewer labels Oct 2, 2025
@copybara-service copybara-service bot closed this in 05c59ca Oct 3, 2025
@github-actions github-actions bot removed the awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally label Oct 3, 2025