fix: preserve exact text for unquoted string values by shreyasbhat0 · Pull Request #62 · toon-format/toon-rust

shreyasbhat0 · 2026-03-30T08:12:09Z

Summary

- Fix scanner to track original token text (`last_token_text`) and exact whitespace counts (`last_whitespace_count`) during tokenization
- Rewrite `parse_tabular_field_value()` to read complete cell text before type inference, per spec §B.3/§B.4
- Fix `parse_field_value()` and `parse_value_with_depth()` to use original token text and exact spacing

This aligns the parser with the spec's approach: read the complete value text first, then infer its type, rather than eagerly tokenizing and then reassembling the pieces.
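
The text-first approach can be sketched as follows. This is a minimal illustration, not the crate's actual code: the `Value` enum, `infer_unquoted`, and `looks_like_number` are hypothetical names, and the number check is deliberately simplified compared with the spec's grammar.

```rust
/// Hypothetical value type for illustration; not the crate's real API.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Number(f64),
    Str(String),
}

/// Simplified numeric check (an assumption, looser than the spec):
/// optional leading '-', then a digit, and only number-like characters.
fn looks_like_number(text: &str) -> bool {
    let starts_ok = text
        .strip_prefix('-')
        .unwrap_or(text)
        .starts_with(|c: char| c.is_ascii_digit());
    starts_ok
        && text
            .chars()
            .all(|c| c.is_ascii_digit() || matches!(c, '.' | 'e' | 'E' | '+' | '-'))
}

/// Infer a type from the complete, untouched text of an unquoted value.
/// Only when the whole text is a literal does it become a non-string;
/// otherwise the text is kept verbatim, inner spaces included.
pub fn infer_unquoted(text: &str) -> Value {
    match text {
        "null" => Value::Null,
        "true" => Value::Bool(true),
        "false" => Value::Bool(false),
        _ if looks_like_number(text) => match text.parse::<f64>() {
            Ok(n) => Value::Number(n),
            Err(_) => Value::Str(text.to_string()),
        },
        _ => Value::Str(text.to_string()),
    }
}
```

Because inference sees the whole value at once, inputs such as `1.0 b` or `1 null` fall through to the string case unchanged, while `1e1` on its own is still read as a number.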

Bugs Fixed

- #59 (Adjacent multiple spaces not decoded correctly): runs of multiple spaces were collapsed to one (`a  b` → `a b`)
- #60 (Unexpected parsing errors on unquoted strings): mixed-type tokens caused parse errors (`1 null`, `a 1` in tabular rows)
- #61 (Unquoted strings starting with non-string token do not strictly preserve its original content): number formatting was lost (`1.0 b` → `1 b`, `1e1 b` → `10 b`)
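
The two lossy cases (#59 and #61) can be reproduced in miniature. The sketch below is only a stand-in for the failure mode, assuming that the old path split on whitespace, normalised number-like tokens, and rejoined with single spaces; `eager_reassemble` and `preserve_exact` are hypothetical names and neither is taken from the crate. It does not model the parse errors from #60.

```rust
/// Stand-in for the old behaviour (an assumption, not the crate's code):
/// tokenize eagerly, normalise number-like tokens, rejoin with one space.
pub fn eager_reassemble(text: &str) -> String {
    text.split_whitespace()
        .map(|tok| match tok.parse::<f64>() {
            // Re-rendering the parsed number drops the original spelling.
            Ok(n) if tok.starts_with(|c: char| c.is_ascii_digit()) => n.to_string(),
            _ => tok.to_string(),
        })
        .collect::<Vec<_>>()
        .join(" ")
}

/// The text-first approach: the original text is carried through untouched.
pub fn preserve_exact(text: &str) -> String {
    text.to_string()
}
```

Here `eager_reassemble("1.0 b")` yields `1 b` and `eager_reassemble("a  b")` yields `a b`, matching the reported symptoms, while `preserve_exact` returns both inputs unchanged.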

Test plan

- All 154 lib tests pass (including 1 corrected assertion and 4 new test functions)
- All 78 integration/doc tests pass
- Spec fixture tests pass
- Manual smoke test of each reported reproduction case

The scanner was eagerly tokenizing values (breaking on spaces, parsing numbers) instead of treating them as complete text before type inference, as the spec requires (§B.3, §B.4, §B.5). This caused three bugs:

- #59: multiple spaces collapsed (`a  b` → `a b`)
- #60: mixed-type tokens errored (`1 null`, `a 1` in tabular rows)
- #61: number formatting lost (`1.0 b` → `1 b`, `1e1 b` → `10 b`)

Scanner changes:

- Track `last_whitespace_count` and `last_token_text` through scanning
- `read_rest_of_line_with_space_info()` returns exact space count
- Add `read_until_delimiter_with_space_info()` for tabular cells

Parser changes:

- `parse_field_value()`: use original token text and exact space count
- `parse_tabular_field_value()`: read complete cell text then type-infer
- `parse_value_with_depth()`: handle all value token types in root-level concatenation with exact spacing

Fixes #59, #60, #61
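
As a rough illustration of reading a tabular cell as complete text while keeping the space count, consider the sketch below. It is not the PR's `read_until_delimiter_with_space_info()`: the `CellRead` struct, `read_cell`, and `read_row` are hypothetical, quoted cells are ignored entirely, and trimming the spaces around a cell is an assumption made for the example.

```rust
/// Result of reading one tabular cell (hypothetical shape, for illustration).
#[derive(Debug, PartialEq)]
pub struct CellRead<'a> {
    /// Number of spaces skipped before the cell text.
    pub leading_spaces: usize,
    /// Cell text with inner spacing preserved exactly.
    pub text: &'a str,
    /// Remaining input after the delimiter (empty if none was found).
    pub rest: &'a str,
}

/// Read everything up to the next delimiter as one piece of text.
/// Quoted cells are NOT handled here; a real parser must handle them.
pub fn read_cell(input: &str, delimiter: char) -> CellRead<'_> {
    let (cell, rest) = match input.find(delimiter) {
        Some(idx) => (&input[..idx], &input[idx + delimiter.len_utf8()..]),
        None => (input, ""),
    };
    let after_leading = cell.trim_start_matches(' ');
    let leading_spaces = cell.len() - after_leading.len();
    let text = after_leading.trim_end_matches(' ');
    CellRead { leading_spaces, text, rest }
}

/// Split a whole row into cell texts; type inference would run afterwards,
/// once per complete cell.
pub fn read_row(row: &str, delimiter: char) -> Vec<&str> {
    let mut cells = Vec::new();
    let mut remaining = row;
    loop {
        let has_more = remaining.contains(delimiter);
        let cell = read_cell(remaining, delimiter);
        cells.push(cell.text);
        if !has_more {
            break;
        }
        remaining = cell.rest;
    }
    cells
}
```

With this shape, a row such as `1 null,a 1,1e1 b` yields three intact cell texts, so a later inference step sees `1 null` as a single value instead of a number token followed by a `null` token.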

shreyasbhat0 requested a review from a team as a code owner March 30, 2026 08:12

This was referenced Mar 30, 2026

Adjacent multiple spaces not decoded correctly #59

Closed

Unexpected parsing errors on unquoted strings #60

Closed

Unquoted strings starting with non-string token do not strictly preserve its original content #61

Closed

shreyasbhat0 added 2 commits March 30, 2026 13:44

style: cargo fmt

f4d0088

style: fix clippy warnings

77dc8b4

Replace loop+match with while-let patterns and remove unnecessary let binding in scan_token.

shreyasbhat0 merged commit 1217ed6 into main Mar 30, 2026
3 checks passed

shreyasbhat0 deleted the fix/unquoted-string-parsing branch March 30, 2026 08:18

github-actions bot mentioned this pull request Mar 30, 2026

chore: release v0.4.5 #63

Merged

shreyasbhat0 merged 3 commits into main from fix/unquoted-string-parsing
