Skip to content

Conversation

@l0ngvh
Copy link
Contributor

@l0ngvh l0ngvh commented Dec 24, 2025

Summary

Refactor YAML properties:

  • Introduce a new node YamlBlockInBlockNode, which allows us to put the properties outside of the block markers, to prevent ambiguities between yaml block map's properties and yaml mapping key's properties.
  • Change AnyYamlPropertiesCombination to YamlPropertyList, which doesn't enforce that there can only be at most 2 properties and they must be different, but allows easier parsing. We can enforce it later in a syntax lint rule

Getting the YAML lexer to emit property tokens will be done in a different PR

Test Plan

As the lexer has yet been able to lex properties, aside from the change in AST due to the introduction of a new block node, everything else should stay the same.

Docs

N/A

@changeset-bot
Copy link

changeset-bot bot commented Dec 24, 2025

⚠️ No Changeset found

Latest commit: 6a06ffb

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@github-actions github-actions bot added A-Parser Area: parser A-Tooling Area: internal tools labels Dec 24, 2025
@l0ngvh l0ngvh requested review from a team December 24, 2025 23:00
@l0ngvh l0ngvh marked this pull request as ready for review December 24, 2025 23:00
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 24, 2025

Walkthrough

This PR refactors the YAML parser to introduce a unified PropertyList mechanism for handling anchor and tag properties across both block and flow parsing paths. It consolidates property handling into a single list type (replacing separate anchor-first and tag-first variants), updates block-in-block node structures to explicitly carry properties alongside content, and modifies flow node parsing to accept and propagate property context through the parsing pipeline. Grammar and syntax kind definitions are updated to reflect these structural changes.

Possibly related PRs

Suggested reviewers

  • ematipico
  • dyc3

Pre-merge checks and finishing touches

✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat(yaml_parser): restructure yaml property' accurately captures the primary change: restructuring YAML property handling through new AST nodes and types.
Description check ✅ Passed The description clearly explains the refactoring goals (new YamlBlockInBlockNode, YamlPropertyList replacement, deferred lexer work) and relates directly to the changeset.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 25cdd50 and 6a06ffb.

⛔ Files ignored due to path filters (46)
  • crates/biome_yaml_factory/src/generated/node_factory.rs is excluded by !**/generated/**, !**/generated/** and included by **
  • crates/biome_yaml_factory/src/generated/syntax_factory.rs is excluded by !**/generated/**, !**/generated/** and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/extra_characters_in_header.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/invalid_headers.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_followed_plain.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_key_contains_multiple_values.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/mapping_missing_colon.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/multi_line_flow_collection_broken_out_of_parent_block.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/multi_line_plain_token_broken_out_parent_flow.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/plain_followed_mapping.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/plain_separated_by_comments.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/block/sequence_follow_by_map.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/empty_mapping_entry.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/mapping_missing_closing_bracket.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/missing_closing_double_quoted.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/missing_closing_single_quoted.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/plain_start_with_indicator.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/err/flow/sequence_missing_closing_bracket.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/explicit_block_mapping.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/explicit_block_mapping_with_compact_mapping.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/explicit_block_mapping_with_empty_value.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/explicit_block_mapping_with_empty_value_and_mapping_as_key.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/implicit_block_mapping_empty_value.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/nested_implicit_block_mapping.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/simple_block_mapping.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/mapping/simple_block_mapping_separated_by_trivia.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/scalar/folded_scalar.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/scalar/literal_scalar.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/map_nested_in_sequence.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/nested_block_sequence.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/sequence_nested_in_map.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/sequence_with_trailing_trivia.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/block/sequence/simple_block_sequence.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/document/separated_by_doc_end.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/double_quote.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/explicit_mapping.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/flow_with_trailing_trivia.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/nested_mapping.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/simple_mapping.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/simple_plain.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/simple_sequence.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_parser/tests/yaml_test_suite/ok/flow/single_quote.yaml.snap is excluded by !**/*.snap and included by **
  • crates/biome_yaml_syntax/src/generated/kind.rs is excluded by !**/generated/**, !**/generated/** and included by **
  • crates/biome_yaml_syntax/src/generated/macros.rs is excluded by !**/generated/**, !**/generated/** and included by **
  • crates/biome_yaml_syntax/src/generated/nodes.rs is excluded by !**/generated/**, !**/generated/** and included by **
  • crates/biome_yaml_syntax/src/generated/nodes_mut.rs is excluded by !**/generated/**, !**/generated/** and included by **
📒 Files selected for processing (4)
  • crates/biome_yaml_parser/src/parser/block.rs
  • crates/biome_yaml_parser/src/parser/flow.rs
  • xtask/codegen/src/yaml_kinds_src.rs
  • xtask/codegen/yaml.ungram
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (CONTRIBUTING.md)

**/*.rs: Use inline rustdoc documentation for rules, assists, and their options
Use the dbg!() macro for debugging output in Rust tests and code
Use doc tests (doctest) format with code blocks in rustdoc comments; ensure assertions pass in tests

Files:

  • xtask/codegen/src/yaml_kinds_src.rs
  • crates/biome_yaml_parser/src/parser/flow.rs
  • crates/biome_yaml_parser/src/parser/block.rs
🧠 Learnings (9)
📚 Learning: 2025-11-24T18:06:03.545Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:06:03.545Z
Learning: Applies to crates/biome_parser/**/src/**/*.rs : Use `ParseSeparatedList` and `ParseNodeList` for parsing lists with error recovery to avoid infinite loops

Applied to files:

  • crates/biome_yaml_parser/src/parser/flow.rs
  • crates/biome_yaml_parser/src/parser/block.rs
📚 Learning: 2025-11-24T18:06:03.545Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:06:03.545Z
Learning: Applies to crates/biome_parser/**/src/**/*.rs : Implement error recovery in list parsing using `or_recover()` to wrap unparseable tokens in a `BOGUS_*` node and consume tokens until a recovery token is found

Applied to files:

  • crates/biome_yaml_parser/src/parser/flow.rs
  • crates/biome_yaml_parser/src/parser/block.rs
📚 Learning: 2025-11-24T18:06:03.545Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:06:03.545Z
Learning: Applies to crates/biome_parser/**/src/**/*.rs : Use `p.eat(token)` for optional tokens, `p.expect(token)` for required tokens, `parse_rule(p).ok(p)` for optional nodes, and `parse_rule(p).or_add_diagnostic(p, error)` for required nodes

Applied to files:

  • crates/biome_yaml_parser/src/parser/flow.rs
  • crates/biome_yaml_parser/src/parser/block.rs
📚 Learning: 2025-11-24T18:06:03.545Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:06:03.545Z
Learning: Applies to crates/biome_parser/**/src/**/*.rs : Parse rules must return `ParsedSyntax::Absent` if the rule can't predict by the next token(s) if they form the expected node, and must not progress the parser in this case

Applied to files:

  • crates/biome_yaml_parser/src/parser/flow.rs
  • crates/biome_yaml_parser/src/parser/block.rs
📚 Learning: 2025-11-24T18:06:03.545Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_parser/CONTRIBUTING.md:0-0
Timestamp: 2025-11-24T18:06:03.545Z
Learning: Applies to crates/biome_parser/**/src/**/*.rs : Use `ConditionalParsedSyntax` for syntax that is only valid in specific contexts (e.g., strict mode, file types, language versions) and call `or_invalid_to_bogus()` to convert to a bogus node if not supported

Applied to files:

  • crates/biome_yaml_parser/src/parser/flow.rs
  • crates/biome_yaml_parser/src/parser/block.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/**/*.rs : Use `full_options` code block property for complete biome.json configuration snippets in documentation

Applied to files:

  • crates/biome_yaml_parser/src/parser/block.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/**/*.rs : Use `options` code block property for rule-specific configuration snippets in documentation

Applied to files:

  • crates/biome_yaml_parser/src/parser/block.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/**/*.rs : Rule documentation code blocks should be ordered as language, expect_diagnostic, options/full_options/use_options, ignore, file

Applied to files:

  • crates/biome_yaml_parser/src/parser/block.rs
📚 Learning: 2025-12-19T12:53:30.399Z
Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2025-12-19T12:53:30.399Z
Learning: Applies to crates/biome_analyze/**/*analyze/src/**/*.rs : Use `use_options` code block property for code examples that follow an options configuration in documentation

Applied to files:

  • crates/biome_yaml_parser/src/parser/block.rs
🧬 Code graph analysis (1)
crates/biome_yaml_parser/src/parser/flow.rs (1)
crates/biome_yaml_parser/src/parser/parse_error.rs (4)
  • expected_flow_mapping_closing_quote (30-32)
  • expected_flow_mapping_entry (26-28)
  • expected_flow_sequence_closing_bracket (34-36)
  • expected_flow_sequence_entry (22-24)
🔇 Additional comments (9)
xtask/codegen/src/yaml_kinds_src.rs (1)

62-62: LGTM!

The new node kinds YAML_BLOCK_IN_BLOCK_NODE and YAML_PROPERTY_LIST are well-placed and align with the grammar restructuring outlined in the PR objectives.

Also applies to: 85-85

xtask/codegen/yaml.ungram (2)

158-167: Clear structural improvement.

The new YamlBlockInBlockNode with explicit properties and content fields, plus the explanatory comment about preventing ambiguities with mapping keys' properties, makes the design intent clear.


284-286: Property list simplification looks good.

The unbounded AnyYamlProperty* is a sensible choice for parser simplicity. As noted in the PR description, enforcing uniqueness/cardinality constraints via a lint rule is a reasonable approach.

crates/biome_yaml_parser/src/parser/block.rs (3)

149-174: Clean property-aware parsing logic.

The undo_completion pattern for cleanup when neither YAML nor JSON node is found is correctly implemented. Good use of precede(p) to wrap the parsed nodes.


347-405: Well-structured property parsing infrastructure.

The PropertyList, property parsing helpers, and predicate functions are clean and follow the established patterns in the codebase.


370-380: Consider whether the empty recovery token set is intentional.

The recovery uses token_set![], which differs from other recovery patterns in this file (e.g., line 315 uses token_set![BLOCK_CONTENT_LITERAL], document.rs uses token_set![EOF] and token_set![DIRECTIVE_LITERAL]). Given that PropertyList naturally terminates via is_at_list_end() checking for properties, this might be intentional; however, for consistency and to ensure the parser behaves as expected, either add appropriate recovery tokens or explicitly document why the empty set is correct here.

crates/biome_yaml_parser/src/parser/flow.rs (3)

134-159: Solid property-aware sequence entry parsing.

The branching logic correctly handles YAML nodes, JSON nodes, and error cases with proper cleanup via undo_completion.


271-299: Comprehensive implicit entry handling.

The three-way branching (YAML node, JSON node, colon-only) covers all valid implicit entry forms. The edge case of a lone colon creating an entry with empty key is correctly handled.


33-36: Verified: all call sites updated.

Both parse_flow_json_node and parse_flow_yaml_node are correctly called with the property_list parameter across flow.rs (8 call sites) and block.rs (2 call sites). No broken callers remain.

Comment on lines 37 to +47
if p.at(MAPPING_START) {
Present(parse_block_mapping(p))
parse_block_mapping(p);
} else if p.at(SEQUENCE_START) {
Present(parse_block_sequence(p))
} else if p.at(FLOW_START) {
Present(parse_flow_in_block_node(p))
parse_block_sequence(p);
} else if p.at(T![|]) {
Present(parse_literal_scalar(p))
parse_literal_scalar(p);
} else if p.at(T![>]) {
Present(parse_folded_scalar(p))
} else {
Absent
parse_folded_scalar(p);
}

Present(m.complete(p, YAML_BLOCK_IN_BLOCK_NODE))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Missing content handling may produce incomplete nodes.

If none of the block content conditions match (lines 37-44), the function still completes with YAML_BLOCK_IN_BLOCK_NODE containing only properties but no content. According to the grammar, AnyYamlBlockInBlockContent is required (not optional). Consider adding error handling or returning Absent when no valid content follows the properties.

🔎 Suggested approach
     } else if p.at(T![>]) {
         parse_folded_scalar(p);
+    } else {
+        // No valid block content found after properties
+        // Either emit an error or abandon and return Absent
+        p.error(/* appropriate error */);
     }
 
     Present(m.complete(p, YAML_BLOCK_IN_BLOCK_NODE))

Committable suggestion skipped: line range outside the PR's diff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair. Since this PR focus mostly on restructuring the properties, let me handle this when lexer is ready, given that we currently have no test to verify cases where properties exist but content is empty

Comment on lines +52 to 59
pub(crate) fn parse_flow_yaml_node(
p: &mut YamlParser,
property_list: CompletedMarker,
) -> CompletedMarker {
let m = property_list.precede(p);
parse_plain_scalar(p);
m.complete(p, YAML_FLOW_YAML_NODE)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Unconditional parse_plain_scalar call may fail unexpectedly.

parse_flow_yaml_node calls parse_plain_scalar(p) unconditionally (line 57), but the scalar content is optional per grammar (content: YamlPlainScalar?). When is_at_flow_yaml_node returns false but !property_empty is true (line 25), this function is called without a PLAIN_LITERAL token present. The p.bump(PLAIN_LITERAL) in parse_plain_scalar (line 63) would then consume an incorrect token.

🔎 Suggested fix
 pub(crate) fn parse_flow_yaml_node(
     p: &mut YamlParser,
     property_list: CompletedMarker,
 ) -> CompletedMarker {
     let m = property_list.precede(p);
-    parse_plain_scalar(p);
+    if is_at_plain_scalar(p) {
+        parse_plain_scalar(p);
+    }
     m.complete(p, YAML_FLOW_YAML_NODE)
 }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In crates/biome_yaml_parser/src/parser/flow.rs around lines 52 to 59,
parse_flow_yaml_node unconditionally calls parse_plain_scalar even though the
scalar is optional; change it to only invoke parse_plain_scalar when the next
token is actually a PLAIN_LITERAL (or when is_at_flow_yaml_node would accept a
plain scalar) — e.g. check p.at(PLAIN_LITERAL) (or equivalent) before calling
parse_plain_scalar so you don't bump an incorrect token; alternatively make
parse_plain_scalar return a bool/Option and only call/handle it conditionally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, this will be addressed in the next pr to allow the lexer to emit properties token

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Parser Area: parser A-Tooling Area: internal tools

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant