Merge @handlebars/parser into @glimmer/syntax #21308

Draft
NullVoxPopuli-ai-agent wants to merge 21 commits into emberjs:main from
NullVoxPopuli-ai-agent:merge-handlebars-parser-into-glimmer-syntax

Conversation

@NullVoxPopuli-ai-agent
Contributor

Summary

  • Inlines the private @handlebars/parser package into @glimmer/syntax as lib/hbs-parser/, eliminating it as a separate workspace package
  • Removes Handlebars features that Glimmer never supported (partials, decorators) from the parser helpers, visitor, and whitespace-control — these now throw at parse time instead of in the Glimmer visitor phase
  • Removes the printer.js module (unused by Glimmer)
  • Fixes the this property on PathExpression in preparePath() so downstream code can use path.this directly instead of re-deriving it via regex from original
  • Removes the UpstreamProgram/UpstreamBlockStatement workaround types and the abstract Decorator/Partial* visitor methods
  • Updates build infrastructure (rollup, eslint, CI, workspace config)

Net change: -1,927 lines (216 added, 2,143 removed)

Motivation

@handlebars/parser was a private, internal-only package. Since Glimmer is not Handlebars, there were several layers of hacks:

  1. Type workarounds: UpstreamProgram with optional loc to work around parser bugs
  2. Dead code in visitors: PartialStatement, PartialBlockStatement, Decorator, DecoratorBlock methods that just threw errors
  3. this regex hack: Re-deriving thisHead from the original string via regex instead of using the parser's own this property (which was incorrectly computed)
  4. repairBlock function: Patching missing loc on Program nodes returned by the parser

By owning the parser code directly, we can fix bugs at the source and remove the downstream workarounds.

Test plan

  • pnpm build:js succeeds
  • pnpm tsc --noEmit passes (only pre-existing errors remain)
  • pnpm test:node — all 20 tests pass
  • Full CI suite

🤖 Generated with Claude Code

NullVoxPopuli and others added 4 commits April 8, 2026 22:08
The @handlebars/parser package was a private, internal dependency used
only by @glimmer/syntax. This commit inlines it as lib/hbs-parser/
within @glimmer/syntax and simplifies the implementation:

- Removed Handlebars features Glimmer never supported (partials,
  decorators) from the parser helpers, visitor, and whitespace control.
  These now throw at parse time rather than in the Glimmer visitor phase.
- Removed printer.js (unused by Glimmer)
- Fixed the `this` property on PathExpression in preparePath() so
  downstream code can use `path.this` directly instead of re-deriving
  it via regex from `original`
- Removed the UpstreamProgram/UpstreamBlockStatement workaround types
  from handlebars-ast.ts
- Removed the abstract Decorator/Partial* visitor methods from Parser
- Removed the standalone @handlebars/parser workspace package
- Updated build infrastructure (rollup, eslint, CI, workspace config)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Jison grammar still produces PartialStatement, Decorator, etc.
nodes, so the base visitor and whitespace control need stubs for them.
The Glimmer visitors continue to throw meaningful errors for these
unsupported features.

Fixes:
- Restore visitor stubs for Partial*/Decorator* in hbs-parser
- Restore error-throwing handlers in HandlebarsNodeVisitors
- Restore type definitions for these nodes in handlebars-ast.ts
- Make Program.loc optional (matches parser reality, replaces old
  UpstreamProgram workaround)
- Fix eslint: unused var in exception.js, import/namespace in parse.js
- Fix prettier formatting in whitespace-control.js

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Now that we own the parser code, remove Handlebars features and
legacy code that Glimmer never uses:

- Reject decorators and partial blocks at parse time in helpers.js
  instead of creating AST nodes that get rejected later. This lets us
  remove Decorator/DecoratorBlock/PartialBlockStatement from the
  visitor, whitespace-control, type definitions, and Glimmer visitors.
- Remove `depth` from PathExpression (tracked `../` context changes
  that Glimmer always rejects — never read by any Glimmer code)
- Remove unused syntax extension points (square/hash options in
  parse.js) that Glimmer never passes
- Simplify exception.js — remove IE/old-Safari compat hacks
  (Object.defineProperty check, error property copying loop)
- Remove partial-specific indent handling from whitespace-control
  (dead code since Glimmer rejects all partials)
- Make visitor.accept() skip unknown node types instead of throwing,
  so PartialStatement (produced inline by the grammar) passes through
  whitespace control to the Glimmer visitor which gives a proper error

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
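
For illustration, the "skip unknown node types" behavior of visitor.accept() could look roughly like the sketch below. It is written in class syntax and is not the code from this PR; `ContentCollector` and the simplified node shapes are hypothetical.

```javascript
// Hypothetical base visitor: dispatches on node.type and ignores unknown types.
class Visitor {
  accept(node) {
    if (!node) return undefined;
    const handler = this[node.type];
    // Unknown node types (e.g. PartialStatement) pass through untouched
    // so that a later phase can report a proper error.
    if (typeof handler !== 'function') return undefined;
    return handler.call(this, node);
  }

  Program(program) {
    for (const statement of program.body) {
      this.accept(statement);
    }
  }
}

// Hypothetical subclass that records the text of ContentStatement nodes.
class ContentCollector extends Visitor {
  constructor() {
    super();
    this.seen = [];
  }

  ContentStatement(node) {
    this.seen.push(node.value);
  }
}

const collector = new ContentCollector();
collector.accept({
  type: 'Program',
  body: [
    { type: 'ContentStatement', value: 'hello' },
    { type: 'PartialStatement', name: 'foo' }, // no handler: skipped
  ],
});
```

Here the PartialStatement is neither visited nor rejected by the base visitor, which leaves the error reporting to whichever layer runs next.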
@@ -0,0 +1,25 @@
function Exception(message, node) {
Contributor

why not subclass error?

Contributor

why do we need this? can we parse correctly the first time?

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@NullVoxPopuli-ai-agent
Contributor Author

Addressing review comments:

exception.js — "why not subclass error?"
Good call — converted to class Exception extends Error in 32bfd3c.
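
A minimal sketch of what subclassing Error could look like here. The `(message, node)` constructor follows the `function Exception(message, node)` signature shown in the diff; the `loc.start.line` / `loc.start.column` fields and the instance property names are assumptions for illustration, not the code from 32bfd3c.

```javascript
// Hypothetical sketch: an Error subclass that records the source position.
class Exception extends Error {
  constructor(message, node) {
    const loc = node && node.loc;
    // Append position info to the message when the node carries a location.
    super(loc ? `${message} - ${loc.start.line}:${loc.start.column}` : message);
    this.name = 'Exception';
    if (loc) {
      this.lineNumber = loc.start.line;
      this.column = loc.start.column;
    }
  }
}

const err = new Exception('Unexpected token', {
  loc: { start: { line: 3, column: 7 } },
});
```

Because it extends Error, `instanceof Error` checks and stack traces work without the manual property copying the old function-style constructor needed.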

handlebars-node-visitors.ts — "why do we need this? can we parse correctly the first time?"
This file exists because the Jison-generated parser produces a flat Handlebars AST (a list of statements: MustacheStatement, ContentStatement, BlockStatement, etc.), but Glimmer needs an HTML-aware AST with ElementNode, attributes, modifiers, etc. The two can't be unified into one pass because the HTML tokenizer (simple-html-tokenizer) and the Handlebars parser operate on different levels — ContentStatement nodes contain raw HTML text that gets fed back into the HTML tokenizer via tokenizePart().

The long-term fix would be replacing the Jison parser entirely with a single-pass Glimmer-native parser that handles both HTML and expression syntax together. This PR is a stepping stone toward that — by inlining the parser we can now iteratively replace pieces of it.

@NullVoxPopuli
Contributor

@NullVoxPopuli-ai-agent so do that replacement

@NullVoxPopuli-ai-agent
Contributor Author

Starting work on replacing the Jison parser with a single-pass Glimmer-native parser. Plan:

  1. Expression parser (recursive-descent, ~300-400 lines) — replaces the 2032-line Jison-generated parser for {{...}} expression parsing
  2. GlimmerTokenizer — scans the source character-by-character, delegates HTML to simple-html-tokenizer, and invokes the expression parser when it hits {{
  3. Inline whitespace control — standalone block stripping happens during scanning instead of as a post-processing pass
  4. Delete hbs-parser/ — the entire Jison parser, visitor, whitespace-control, and helpers

Working on it now.

NullVoxPopuli and others added 5 commits April 9, 2026 00:25
Replace the 2032-line Jison-generated parser (parser.js), 217-line
helpers (helpers.js), and grammar source files (handlebars.yy,
handlebars.l) with a single 1583-line recursive-descent parser
(rd-parser.js).

The new parser:
- Produces the same HBS AST as the Jison parser, so the rest of the
  pipeline (WhitespaceControl, HandlebarsNodeVisitors) works unchanged
- Handles all expression types: paths, sub-expressions, hash pairs,
  literals, block params, strip flags, comments, raw blocks
- Rejects decorators and partial blocks at parse time with clear errors
- Is readable, debuggable JS instead of an opaque state machine
- Removes the need for Jison as a build tool

Net change: -1,040 lines

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
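
To give a feel for the recursive-descent approach, here is a heavily simplified sketch covering paths, literals, sub-expressions and hash pairs. It is not the PR's rd-parser.js: `MiniExpressionParser`, its method names and the trimmed-down node shapes are all hypothetical, and it ignores strip flags, block params, comments and source locations.

```javascript
// Hypothetical, heavily simplified sketch; not the PR's rd-parser.js.
class MiniExpressionParser {
  constructor(input) {
    this.input = input;
    this.pos = 0;
  }

  peek() {
    return this.input[this.pos];
  }

  skipWhitespace() {
    while (this.pos < this.input.length && /^\s$/u.test(this.peek())) this.pos++;
  }

  // expression := value (param | hashPair)*
  parseExpression(closer) {
    this.skipWhitespace();
    const path = this.parseValue();
    const params = [];
    const pairs = [];
    for (;;) {
      this.skipWhitespace();
      if (this.pos >= this.input.length || this.peek() === closer) break;
      const start = this.pos;
      const value = this.parseValue();
      this.skipWhitespace();
      if (value.type === 'PathExpression' && this.peek() === '=') {
        this.pos++; // consume '='
        this.skipWhitespace();
        pairs.push({ type: 'HashPair', key: value.original, value: this.parseValue() });
      } else {
        if (pairs.length > 0) throw new Error(`Positional param after hash pair at ${start}`);
        params.push(value);
      }
    }
    return { path, params, hash: { type: 'Hash', pairs } };
  }

  // value := '(' expression ')' | string | number | path
  parseValue() {
    const ch = this.peek();
    if (ch === '(') {
      this.pos++; // consume '('
      const inner = this.parseExpression(')');
      if (this.peek() !== ')') throw new Error(`Expected ')' at ${this.pos}`);
      this.pos++; // consume ')'
      return { type: 'SubExpression', ...inner };
    }
    if (ch === '"') {
      const end = this.input.indexOf('"', this.pos + 1);
      if (end === -1) throw new Error(`Unterminated string at ${this.pos}`);
      const value = this.input.slice(this.pos + 1, end);
      this.pos = end + 1;
      return { type: 'StringLiteral', value };
    }
    const match = /^[^\s()="]+/u.exec(this.input.slice(this.pos));
    if (!match) throw new Error(`Unexpected character at ${this.pos}`);
    this.pos += match[0].length;
    if (/^-?\d+(\.\d+)?$/.test(match[0])) {
      return { type: 'NumberLiteral', value: Number(match[0]) };
    }
    return { type: 'PathExpression', original: match[0], parts: match[0].split('.') };
  }
}

const ast = new MiniExpressionParser('concat first.name "x" (upper last) sep = 2').parseExpression();
```

Each grammar rule maps to one method, so a failure points at a specific function and input offset rather than at a state-table index.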
parseProgram() was breaking out of its statement loop when it
encountered `{{`, causing all mustache statements and content after
them to be lost. For `<h1>{{title}}</h1>`, only `<h1>` was parsed,
leading to "Unclosed element" errors.

The fix removes the break so parseStatement() handles both content
and mustache cases as intended.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
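
The shape of the fixed loop can be sketched as follows. The names parseProgram and parseStatement come from the commit message, but the bodies and the simplified node shapes are assumptions for illustration only.

```javascript
// Hypothetical sketch of the statement loop; node shapes are simplified.
function parseProgram(source) {
  const body = [];
  let pos = 0;

  function parseStatement() {
    if (source.startsWith('{{', pos)) {
      const close = source.indexOf('}}', pos);
      if (close === -1) throw new Error(`Unclosed mustache at ${pos}`);
      const expression = source.slice(pos + 2, close).trim();
      pos = close + 2;
      return { type: 'MustacheStatement', expression };
    }
    // Content runs up to the next `{{` or to the end of input.
    const next = source.indexOf('{{', pos);
    const end = next === -1 ? source.length : next;
    const value = source.slice(pos, end);
    pos = end;
    return { type: 'ContentStatement', value };
  }

  // The loop keeps going when it reaches `{{`; it does not break out on it.
  while (pos < source.length) {
    body.push(parseStatement());
  }
  return { type: 'Program', body };
}

const program = parseProgram('<h1>{{title}}</h1>');
```

With the loop running to the end of input, the example template yields three statements (content, mustache, content) instead of stopping after `<h1>`.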
These errors are now thrown during parsing (in rd-parser.js) rather
than in the Glimmer visitor layer, so they don't have the full
SyntaxError format with source context. Update tests to use regex
matchers instead of exact SyntaxError format.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Handle \{{ as literal content (escaped mustache), matching the
  Jison lexer's 'emu' state behavior
- Fix {{~{ not being recognized as unescaped open when strip flag
  is present
- Guard parseProgramBody against escaped mustaches in block bodies

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Allow whitespace around = in hash pairs (e.g., key = "value")
- Allow digit-starting path segments after separators (e.g.,
  array.2.[@#].[1]) — digits are valid ID chars but readID()
  rejected them at top level to avoid ambiguity with NumberLiteral

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- 'packages/@glimmer/*'
- 'packages/@glimmer-workspace/*'
- 'packages/@handlebars/*'
# @handlebars/parser has been merged into @glimmer/syntax
Contributor

don't have comments like this. remove it

'@glimmer/**',

// @handlebars/parser is a hidden dependency, not an explicit entrypoint
// @handlebars/parser has been merged into @glimmer/syntax
Contributor

remove this, we'll no longer have handlebars after this PR

NullVoxPopuli and others added 2 commits April 9, 2026 08:54
Handle consecutive backslashes before {{ correctly:
- \{{ → escaped mustache (emit {{ as content)
- \\{{ → literal \ then real mustache
- \\\{{ → literal \ then escaped mustache

The Jison lexer handled this with separate states (emu for escaped
mustache). The rd-parser now counts consecutive backslashes and
applies the same even/odd logic.

Also fix hash pair parsing to allow whitespace around = signs,
and allow digit-starting path segments after separators.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@0, @1 etc. must remain parse errors (the Jison lexer matched NUMBER
before ID for digit-starting tokens). Only allow digit-starting
segments AFTER a dot/slash separator, e.g. array.2.[foo].

Also fix prettier formatting in test file.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
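
The rule above can be sketched as a small predicate. `canStartSegment` is a hypothetical helper, and the letter check is a simplified stand-in for the real ID character set, which is broader.

```javascript
// Hypothetical check: may `ch` begin a path segment at this position?
function canStartSegment(ch, afterSeparator) {
  const isDigit = ch >= '0' && ch <= '9';
  // Digits are only accepted right after a `.` or `/` separator,
  // so `array.2` parses while a leading digit is still read as a number.
  if (isDigit) return afterSeparator;
  // Simplified stand-in for the real ID character set.
  return /^[A-Za-z_$]$/.test(ch);
}
```

Under this sketch, the `2` in `array.2` is accepted because it follows a dot, while the `0` in `@0` is rejected because `@` is not a separator.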
@NullVoxPopuli
Contributor

📊 Package size report   -0.91%↓

| File | Before (Size / Brotli) | After (Size / Brotli) |
| --- | --- | --- |
| dist/dev/packages/shared-chunks/compiler-DO0O9Kiz.js | 177.3 kB / 33.8 kB | 23%↑ 217.6 kB / 20%↑ 40.5 kB |
| dist/dev/packages/shared-chunks/transform-resolutions-DXsgdc6q.js | 188.6 kB / 38.1 kB | -33.2%↓ 126.1 kB / -30.7%↓ 26.4 kB |
| dist/prod/packages/shared-chunks/compiler-bbkrtYN0.js | 190.5 kB / 36.1 kB | 21%↑ 230.9 kB / 19%↑ 42.9 kB |
| dist/prod/packages/shared-chunks/transform-resolutions-CbitE0O-.js | 174.2 kB / 35.3 kB | -35.9%↓ 111.6 kB / -33%↓ 23.7 kB |
| types/stable/@glimmer/syntax/lib/hbs-parser/exception.d.ts | | 262 B / 148 B |
| types/stable/@glimmer/syntax/lib/hbs-parser/index.d.ts | | 144 B / 102 B |
| types/stable/@glimmer/syntax/lib/hbs-parser/parse.d.ts | | 190 B / 112 B |
| types/stable/@glimmer/syntax/lib/hbs-parser/rd-parser.d.ts | | 203 B / 129 B |
| types/stable/@glimmer/syntax/lib/hbs-parser/whitespace-control.d.ts | | 674 B / 238 B |
| types/stable/@glimmer/syntax/lib/parser.d.ts | 5.1 kB / 1.1 kB | -5.23%↓ 4.8 kB / -2.99%↓ 1 kB |
| types/stable/@glimmer/syntax/lib/parser/handlebars-node-visitors.d.ts | 2.3 kB / 640 B | -9.25%↓ 2.1 kB / -6.87%↓ 596 B |
| types/stable/@glimmer/syntax/lib/v1/handlebars-ast.d.ts | 7.2 kB / 1.2 kB | -18.3%↓ 5.9 kB / -7.08%↓ 1.1 kB |
| types/stable/@handlebars/parser/types/ast.d.ts | 3.6 kB / 614 B | |
| types/stable/@handlebars/parser/types/index.d.ts | 400 B / 183 B | |
| types/stable/index.d.ts | 42.8 kB / 4 kB | 0.6%↑ 43.1 kB / -0.07%↓ 4 kB |
| Total (Includes all files) | 5.4 MB / 1.3 MB | -0.91%↓ 5.3 MB / -0.79%↓ 1.3 MB |
| Tarball size | 1.2 MB | -1.01%↓ 1.2 MB |

🤖 This report was automatically generated by pkg-size-action

NullVoxPopuli and others added 8 commits April 9, 2026 09:14
ContentStatement.original must contain the raw source text (including
backslash escape sequences) for round-tripping and whitespace control.
Previously it was set equal to the escape-processed `value`, causing
prettier to lose backslashes when reprinting templates with escaped
mustaches like \\\{{foo}}.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Jison lexer produced separate CONTENT tokens for text before and
after escaped mustaches (\{{), because the escape triggered a state
transition (from INITIAL to emu). This created separate ContentStatement
nodes, which matters for prettier's formatting — it treats each
TextNode independently for line-breaking decisions.

The rd-parser now matches this behavior: when it encounters an odd
number of backslashes before {{, it ends the current ContentStatement
and starts a new one for the escaped {{ content.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Jison emu state stopped scanning at {{, \{{, and \\{{ boundaries.
The rd-parser was only stopping at {{ and \{{, causing escaped mustache
content to merge with subsequent text across backslash-mustache
boundaries. This produced different TextNode splits than the old parser,
causing prettier formatting differences.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prettier's glimmer plugin parses the error message to extract the
position and display message. It expects the Jison format:

  Parse error on line N:
  <source line>
  ----^
  Expecting 'TOKEN', ..., got 'TOKEN'

Also sets error.hash.loc for prettier's getErrorLocation() to extract
the line/column position.

- Empty mustaches ({{}}), strip-only ({{~}}, {{~~}}) report 'CLOSE'
- Invalid characters (single }) report 'INVALID'
- Missing expressions report Jison-compatible token lists

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
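
As a rough sketch, the message layout above could be produced like this. `jisonStyleError` and its parameter names are hypothetical, the expected-token list in the usage is made up for the example, and the `first_line` / `first_column` fields on `error.hash.loc` are an assumption based on Jison's usual location object rather than something this PR confirms.

```javascript
// Hypothetical formatter for the Jison-style message described above.
function jisonStyleError({ line, column, sourceLine, expected, got }) {
  const message = [
    `Parse error on line ${line}:`,
    sourceLine,
    `${'-'.repeat(column)}^`,
    `Expecting ${expected.map((token) => `'${token}'`).join(', ')}, got '${got}'`,
  ].join('\n');
  const error = new Error(message);
  // Assumed shape: Jison-style location fields on error.hash.loc.
  error.hash = {
    loc: { first_line: line, last_line: line, first_column: column, last_column: column + 1 },
  };
  return error;
}

const error = jisonStyleError({
  line: 1,
  column: 2,
  sourceLine: '{{}}',
  expected: ['ID', 'STRING'], // illustrative token list only
  got: 'CLOSE',
});
```

Keeping the position in both the message text and `error.hash.loc` means a consumer that parses the string and one that reads the structured location would see the same line and column.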
Three fixes:

1. Backslash handling: Jison's strip(0,1) just removed the last
   backslash, keeping all others verbatim. The rd-parser was
   incorrectly collapsing \\ pairs into single \. Now matches
   Jison: \\\{{ emits 2 backslashes (not 1) then escaped {{.

2. Error source display: Jison showed all input lines up to the
   error line joined together (no newlines). The rd-parser was
   showing just the error line.

3. consumeClose error: advance past the invalid character before
   reporting, matching Jison's column position behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
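
The backslash rule from fix 1 can be sketched as a small helper. `classifyOpen` is hypothetical; it combines the odd/even escaping rule with the "drop only the last backslash" behavior described above. The commit messages only give examples for one, two and three backslashes, so applying the same rule to longer runs is an assumption.

```javascript
// Hypothetical helper: classify the `{{` at `openIndex` by the backslashes before it.
function classifyOpen(source, openIndex) {
  let count = 0;
  while (openIndex - count - 1 >= 0 && source[openIndex - count - 1] === '\\') {
    count++;
  }
  return {
    // Odd run: the mustache is escaped and becomes literal content.
    escaped: count % 2 === 1,
    // Only the final backslash is dropped; the rest are kept verbatim.
    literalBackslashes: Math.max(count - 1, 0),
  };
}

const one = classifyOpen('\\{{foo}}', 1); // \{{foo}}
const two = classifyOpen('\\\\{{foo}}', 2); // \\{{foo}}
const three = classifyOpen('\\\\\\{{foo}}', 3); // \\\{{foo}}
```

For the three-backslash case this gives two literal backslashes followed by an escaped `{{`, matching the corrected behavior in fix 1.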
For characters like single } that aren't valid expression starts or
closes, report with the full Jison token list (CLOSE_RAW_BLOCK, CLOSE,
etc.) instead of the shorter expression-only list. Capture the column
of the invalid character before advancing past it, matching Jison's
error position.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The checkForInvalidToken guard was catching whitespace characters as
INVALID, breaking the dangling dot test ({{if foo. bar}}) where
the space after the dot is valid whitespace between params.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The checkForInvalidToken guard was too aggressive — it caught .
(dot) and [ (bracket) as invalid characters. These are valid starts
for path expressions (e.g., the dangling dot in {{if foo. bar}}
which should reach the Glimmer visitor for a proper SyntaxError).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
return program;
};

WhitespaceControl.prototype.BlockStatement = function (block) {
Contributor

let's not do this prototype way of defining things. we have classes now

Contributor

do we not need these types anymore? does Glimmer define them somewhere?

return true;
}

function isWhitespace(ch) {
Contributor

This is not all possible whitespace characters. We need to support UTF-16.

Is there a different parser tool we should use for generating a syntax parser?

Seems kind of silly to re-invent all this, since there are no doubt dozens/hundreds of subtle use cases not covered here

- Remove @handlebars comment from pnpm-workspace.yaml
- Remove @handlebars/** exclusion from rollup.config.mjs
- Convert visitor.js and whitespace-control.js to ES class syntax
- Use Unicode-aware whitespace check (/^\s$/u) in rd-parser
- Remove exception.js (replaced by this.error() in rd-parser)
- Don't catch . and [ as invalid tokens in expression parsing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@NullVoxPopuli-ai-agent
Contributor Author

Addressed all review comments in 6a8cd44:

pnpm-workspace.yaml — removed the comment

rollup.config.mjs — removed the @handlebars/** exclusion entirely

whitespace-control.js / visitor.js — converted both from prototype-based to ES class syntax

@handlebars/parser types/ast.d.ts — these types now live in @glimmer/syntax/lib/v1/handlebars-ast.ts (the HBS AST intermediate types used between the parser and Glimmer visitor layer)

rd-parser.js isWhitespace — replaced the manual char checks with /^\s$/u.test(ch) which handles the full Unicode whitespace set. The Jison lexer used \s+ in its mu state rules, so this matches. Re: using an existing parser generator — the Handlebars expression grammar is intentionally simple (paths, literals, sub-expressions, hash pairs). The rd-parser is ~1500 lines of straightforward recursive descent. A parser generator would add a build dependency for the same output. The key win here is that it's readable and debuggable, unlike the Jison state tables.
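
A quick sketch of what that check accepts, using the regex quoted above. The `samples` object is only for illustration. One caveat: JavaScript's `\s` is the ECMAScript whitespace and line-terminator set, which is close to but not identical to the Unicode White_Space property (for instance it includes U+FEFF and excludes U+0085).

```javascript
// Sketch of a single-character whitespace test using the regex from the reply.
function isWhitespace(ch) {
  return /^\s$/u.test(ch);
}

const samples = {
  space: isWhitespace(' '),
  tab: isWhitespace('\t'),
  noBreakSpace: isWhitespace('\u00A0'),
  emSpace: isWhitespace('\u2003'),
  ideographicSpace: isWhitespace('\u3000'),
  zeroWidthSpace: isWhitespace('\u200B'), // not in \s: it is a format character
  letter: isWhitespace('a'),
};
```

The first five entries are whitespace under `\s`, while the zero-width space and the letter are not.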

exception.js — deleted entirely, all error throwing now goes through this.error() on the parser class which produces Jison-compatible error format (needed for prettier snapshot compatibility).

2 participants