-
-
Notifications
You must be signed in to change notification settings - Fork 807
feat(yaml_parser): restructure yaml property #8570
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
WalkthroughThis PR refactors the YAML parser to introduce a unified PropertyList mechanism for handling anchor and tag properties across both block and flow parsing paths. It consolidates property handling into a single list type (replacing separate anchor-first and tag-first variants), updates block-in-block node structures to explicitly carry properties alongside content, and modifies flow node parsing to accept and propagate property context through the parsing pipeline. Grammar and syntax kind definitions are updated to reflect these structural changes. Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (46)
crates/biome_yaml_factory/src/generated/node_factory.rsis excluded by!**/generated/**,!**/generated/**and included by**crates/biome_yaml_factory/src/generated/syntax_factory.rsis excluded by!**/generated/**,!**/generated/**and included by**crates/biome_yaml_parser/tests/yaml_test_suite/err/block/extra_characters_in_header.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/err/block/invalid_headers.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_followed_plain.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_key_contains_multiple_values.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_missing_colon.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/err/block/multi_line_flow_collection_broken_out_of_parent_block.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/err/block/multi_line_plain_token_broken_out_parent_flow.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/err/block/plain_followed_mapping.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/err/block/plain_separated_by_comments.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/err/block/sequence_follow_by_map.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/empty_mapping_entry.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/mapping_missing_closing_bracket.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/missing_closing_double_quoted.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/missing_closing_single_quoted.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/plain_start_with_indicator.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/sequence_missing_closing_bracket.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/explicit_block_mapping.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/explicit_block_mapping_with_compact_mapping.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/explicit_block_mapping_with_empty_value.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/explicit_block_mapping_with_empty_value_and_mapping_as_key.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/implicit_block_mapping_empty_value.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/nested_implicit_block_mapping.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/simple_block_mapping.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/simple_block_mapping_separated_by_trivia.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/scalar/folded_scalar.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/scalar/literal_scalar.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/map_nested_in_sequence.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/nested_block_sequence.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/sequence_nested_in_map.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/sequence_with_trailing_trivia.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/simple_block_sequence.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/document/separated_by_doc_end.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/double_quote.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/explicit_mapping.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/flow_with_trailing_trivia.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/nested_mapping.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/simple_mapping.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/simple_plain.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/simple_sequence.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/single_quote.yaml.snapis excluded by!**/*.snapand included by**crates/biome_yaml_syntax/src/generated/kind.rsis excluded by!**/generated/**,!**/generated/**and included by**crates/biome_yaml_syntax/src/generated/macros.rsis excluded by!**/generated/**,!**/generated/**and included by**crates/biome_yaml_syntax/src/generated/nodes.rsis excluded by!**/generated/**,!**/generated/**and included by**crates/biome_yaml_syntax/src/generated/nodes_mut.rsis excluded by!**/generated/**,!**/generated/**and included by**
📒 Files selected for processing (4)
crates/biome_yaml_parser/src/parser/block.rscrates/biome_yaml_parser/src/parser/flow.rsxtask/codegen/src/yaml_kinds_src.rsxtask/codegen/yaml.ungram
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (CONTRIBUTING.md)
**/*.rs: Use inline rustdoc documentation for rules, assists, and their options
Use thedbg!()macro for debugging output in Rust tests and code
Use doc tests (doctest) format with code blocks in rustdoc comments; ensure assertions pass in tests
Files:
xtask/codegen/src/yaml_kinds_src.rscrates/biome_yaml_parser/src/parser/flow.rscrates/biome_yaml_parser/src/parser/block.rs
🧠 Learnings (9)
📚 Learning: 2025-11-24T18:06:03.545Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:06:03.545Z
Learning: Applies to crates/biome_parser/**/src/**/*.rs : Use `ParseSeparatedList` and `ParseNodeList` for parsing lists with error recovery to avoid infinite loops
Applied to files:
crates/biome_yaml_parser/src/parser/flow.rscrates/biome_yaml_parser/src/parser/block.rs
📚 Learning: 2025-11-24T18:06:03.545Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:06:03.545Z
Learning: Applies to crates/biome_parser/**/src/**/*.rs : Implement error recovery in list parsing using `or_recover()` to wrap unparseable tokens in a `BOGUS_*` node and consume tokens until a recovery token is found
Applied to files:
crates/biome_yaml_parser/src/parser/flow.rscrates/biome_yaml_parser/src/parser/block.rs
📚 Learning: 2025-11-24T18:06:03.545Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:06:03.545Z
Learning: Applies to crates/biome_parser/**/src/**/*.rs : Use `p.eat(token)` for optional tokens, `p.expect(token)` for required tokens, `parse_rule(p).ok(p)` for optional nodes, and `parse_rule(p).or_add_diagnostic(p, error)` for required nodes
Applied to files:
crates/biome_yaml_parser/src/parser/flow.rscrates/biome_yaml_parser/src/parser/block.rs
📚 Learning: 2025-11-24T18:06:03.545Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:06:03.545Z
Learning: Applies to crates/biome_parser/**/src/**/*.rs : Parse rules must return `ParsedSyntax::Absent` if the rule can't predict by the next token(s) if they form the expected node, and must not progress the parser in this case
Applied to files:
crates/biome_yaml_parser/src/parser/flow.rscrates/biome_yaml_parser/src/parser/block.rs
📚 Learning: 2025-11-24T18:06:03.545Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:06:03.545Z
Learning: Applies to crates/biome_parser/**/src/**/*.rs : Use `ConditionalParsedSyntax` for syntax that is only valid in specific contexts (e.g., strict mode, file types, language versions) and call `or_invalid_to_bogus()` to convert to a bogus node if not supported
Applied to files:
crates/biome_yaml_parser/src/parser/flow.rscrates/biome_yaml_parser/src/parser/block.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/**/*.rs : Use `full_options` code block property for complete biome.json configuration snippets in documentation
Applied to files:
crates/biome_yaml_parser/src/parser/block.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/**/*.rs : Use `options` code block property for rule-specific configuration snippets in documentation
Applied to files:
crates/biome_yaml_parser/src/parser/block.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/**/*.rs : Rule documentation code blocks should be ordered as language, expect_diagnostic, options/full_options/use_options, ignore, file
Applied to files:
crates/biome_yaml_parser/src/parser/block.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/**/*.rs : Use `use_options` code block property for code examples that follow an options configuration in documentation
Applied to files:
crates/biome_yaml_parser/src/parser/block.rs
🧬 Code graph analysis (1)
crates/biome_yaml_parser/src/parser/flow.rs (1)
crates/biome_yaml_parser/src/parser/parse_error.rs (4)
expected_flow_mapping_closing_quote(30-32)expected_flow_mapping_entry(26-28)expected_flow_sequence_closing_bracket(34-36)expected_flow_sequence_entry(22-24)
🔇 Additional comments (9)
xtask/codegen/src/yaml_kinds_src.rs (1)
62-62: LGTM!The new node kinds
YAML_BLOCK_IN_BLOCK_NODEandYAML_PROPERTY_LISTare well-placed and align with the grammar restructuring outlined in the PR objectives.Also applies to: 85-85
xtask/codegen/yaml.ungram (2)
158-167: Clear structural improvement.The new
YamlBlockInBlockNodewith explicitpropertiesandcontentfields, plus the explanatory comment about preventing ambiguities with mapping keys' properties, makes the design intent clear.
284-286: Property list simplification looks good.The unbounded
AnyYamlProperty*is a sensible choice for parser simplicity. As noted in the PR description, enforcing uniqueness/cardinality constraints via a lint rule is a reasonable approach.crates/biome_yaml_parser/src/parser/block.rs (3)
149-174: Clean property-aware parsing logic.The
undo_completionpattern for cleanup when neither YAML nor JSON node is found is correctly implemented. Good use ofprecede(p)to wrap the parsed nodes.
347-405: Well-structured property parsing infrastructure.The
PropertyList, property parsing helpers, and predicate functions are clean and follow the established patterns in the codebase.
370-380: Consider whether the empty recovery token set is intentional.The recovery uses
token_set![], which differs from other recovery patterns in this file (e.g., line 315 usestoken_set![BLOCK_CONTENT_LITERAL], document.rs usestoken_set![EOF]andtoken_set![DIRECTIVE_LITERAL]). Given thatPropertyListnaturally terminates viais_at_list_end()checking for properties, this might be intentional; however, for consistency and to ensure the parser behaves as expected, either add appropriate recovery tokens or explicitly document why the empty set is correct here.crates/biome_yaml_parser/src/parser/flow.rs (3)
134-159: Solid property-aware sequence entry parsing.The branching logic correctly handles YAML nodes, JSON nodes, and error cases with proper cleanup via
undo_completion.
271-299: Comprehensive implicit entry handling.The three-way branching (YAML node, JSON node, colon-only) covers all valid implicit entry forms. The edge case of a lone colon creating an entry with empty key is correctly handled.
33-36: Verified: all call sites updated.Both
parse_flow_json_nodeandparse_flow_yaml_nodeare correctly called with theproperty_listparameter across flow.rs (8 call sites) and block.rs (2 call sites). No broken callers remain.
| if p.at(MAPPING_START) { | ||
| Present(parse_block_mapping(p)) | ||
| parse_block_mapping(p); | ||
| } else if p.at(SEQUENCE_START) { | ||
| Present(parse_block_sequence(p)) | ||
| } else if p.at(FLOW_START) { | ||
| Present(parse_flow_in_block_node(p)) | ||
| parse_block_sequence(p); | ||
| } else if p.at(T![|]) { | ||
| Present(parse_literal_scalar(p)) | ||
| parse_literal_scalar(p); | ||
| } else if p.at(T![>]) { | ||
| Present(parse_folded_scalar(p)) | ||
| } else { | ||
| Absent | ||
| parse_folded_scalar(p); | ||
| } | ||
|
|
||
| Present(m.complete(p, YAML_BLOCK_IN_BLOCK_NODE)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing content handling may produce incomplete nodes.
If none of the block content conditions match (lines 37-44), the function still completes with YAML_BLOCK_IN_BLOCK_NODE containing only properties but no content. According to the grammar, AnyYamlBlockInBlockContent is required (not optional). Consider adding error handling or returning Absent when no valid content follows the properties.
🔎 Suggested approach
} else if p.at(T![>]) {
parse_folded_scalar(p);
+ } else {
+ // No valid block content found after properties
+ // Either emit an error or abandon and return Absent
+ p.error(/* appropriate error */);
}
Present(m.complete(p, YAML_BLOCK_IN_BLOCK_NODE))Committable suggestion skipped: line range outside the PR's diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair. Since this PR focus mostly on restructuring the properties, let me handle this when lexer is ready, given that we currently have no test to verify cases where properties exist but content is empty
| pub(crate) fn parse_flow_yaml_node( | ||
| p: &mut YamlParser, | ||
| property_list: CompletedMarker, | ||
| ) -> CompletedMarker { | ||
| let m = property_list.precede(p); | ||
| parse_plain_scalar(p); | ||
| m.complete(p, YAML_FLOW_YAML_NODE) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unconditional parse_plain_scalar call may fail unexpectedly.
parse_flow_yaml_node calls parse_plain_scalar(p) unconditionally (line 57), but the scalar content is optional per grammar (content: YamlPlainScalar?). When is_at_flow_yaml_node returns false but !property_empty is true (line 25), this function is called without a PLAIN_LITERAL token present. The p.bump(PLAIN_LITERAL) in parse_plain_scalar (line 63) would then consume an incorrect token.
🔎 Suggested fix
pub(crate) fn parse_flow_yaml_node(
p: &mut YamlParser,
property_list: CompletedMarker,
) -> CompletedMarker {
let m = property_list.precede(p);
- parse_plain_scalar(p);
+ if is_at_plain_scalar(p) {
+ parse_plain_scalar(p);
+ }
m.complete(p, YAML_FLOW_YAML_NODE)
}Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In crates/biome_yaml_parser/src/parser/flow.rs around lines 52 to 59,
parse_flow_yaml_node unconditionally calls parse_plain_scalar even though the
scalar is optional; change it to only invoke parse_plain_scalar when the next
token is actually a PLAIN_LITERAL (or when is_at_flow_yaml_node would accept a
plain scalar) — e.g. check p.at(PLAIN_LITERAL) (or equivalent) before calling
parse_plain_scalar so you don't bump an incorrect token; alternatively make
parse_plain_scalar return a bool/Option and only call/handle it conditionally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, this will be addressed in the next pr to allow the lexer to emit properties token
Summary
Refactor YAML properties:
YamlBlockInBlockNode, which allows us to put the properties outside of the block markers, to prevent ambiguities between yaml block map's properties and yaml mapping key's properties.AnyYamlPropertiesCombinationtoYamlPropertyList, which doesn't enforce that there can only be at most 2 properties and they must be different, but allows easier parsing. We can enforce it later in a syntax lint ruleGetting the YAML lexer to emit property tokens will be done in a different PR
Test Plan
As the lexer has yet been able to lex properties, aside from the change in AST due to the introduction of a new block node, everything else should stay the same.
Docs
N/A