
Nearley dsl #69

Open

loyaltypollution wants to merge 19 commits into source-academy:main from loyaltypollution:NearleyDSL

Conversation

@loyaltypollution
Contributor

Summary:
This pull request (Fixes #68) introduces a unified Nearley + Moo–based parsing system. It replaces the old, fragmented setup (a manual tokenizer, a static grammar, and a separate AST DSL) with a single declarative pipeline that integrates tokenization, grammar rules, and AST generation.

Key Improvements:

  • Adds a Moo lexer for tokens and keywords.
  • Defines a Nearley grammar aligned with the existing language subset.
  • Embeds AST node generation within grammar rules.
  • Maintains compatibility with generate-ast.ts.
  • Introduces a build step to compile grammar into the parser.
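The pipeline above can be sketched in a single grammar.ne file. This is illustrative only: the token names, keywords, and AST node shapes below are assumptions for the example, not the actual py-slang rules.

```
@{%
const moo = require("moo");

# Single lexer definition: token patterns and keywords live together.
const lexer = moo.compile({
  ws:      /[ \t]+/,
  number:  /[0-9]+/,
  name:    { match: /[A-Za-z_][A-Za-z0-9_]*/,
             type: moo.keywords({ kw_if: "if", kw_else: "else" }) },
  plus:    "+",
  newline: { match: /\n/, lineBreaks: true },
});
%}

@lexer lexer

# AST construction is embedded directly in each rule's post-processor.
sum -> sum %plus term  {% ([lhs, , rhs]) => ({ type: "BinOp", op: "+", lhs, rhs }) %}
     | term            {% id %}

term -> %number        {% ([tok]) => ({ type: "Literal", value: Number(tok.value) }) %}
```

Compiling this with nearleyc (the build step mentioned above) yields a parser module in which lexing, grammar, and AST emission come from the one file.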

Benefits:

  • Centralized source of truth for grammar (grammar.ne + lexer.moo).
  • Automatic derivation of TokenType enums — no manual syncing.
  • Deprecates tokenizer.ts and Grammar.gram.
  • Easier debugging through Nearley’s readable parse trees.
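The "no manual syncing" point can be illustrated as follows. This is a hypothetical sketch, not the actual py-slang code: it assumes the lexer's rules live in one plain object, from which an enum-like token-name map can be derived instead of hand-maintaining a parallel TokenType enum.

```javascript
// Illustrative only: tokenRules stands in for the rule object in lexer.moo.
const tokenRules = {
  number: /[0-9]+/,
  name: /[A-Za-z_][A-Za-z0-9_]*/,
  plus: "+",
};

// Derive an enum-like map { number: "number", name: "name", plus: "plus" }
// directly from the rule names, so adding a token updates both at once.
const TokenType = Object.fromEntries(
  Object.keys(tokenRules).map((k) => [k, k])
);

console.log(TokenType.plus); // "plus"
```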

Next Steps:

  • Expand grammar to support more Python constructs (+=, comprehensions, etc.).
  • Reassess integration with generate-ast.ts: it currently creates ExprNS/StmtNS objects.

- Moved error handling logic to a dedicated errors module, improving organization and maintainability.
@loyaltypollution
Contributor Author

loyaltypollution commented Oct 29, 2025

Just discovered from a link that the code is slow:

main -> statement:* {% flatten %}

It turns out that instead of the post-processor function being executed once, when all statements are matched, it gets executed at every increment:

with 0 statements
with 1 statement
with 2 statements

This was the benchmarking result of a simple Ackermann function (benchmark screenshot omitted).

We created these post-processor functions to ensure the Nearley parser emitted the current internal Python AST. However, they are slow. A move to the Nearley parser might require rethinking the internal Python AST.
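One commonly suggested mitigation is to replace the `:*` macro with an explicit left-recursive rule, so each step's post-processor does a single append rather than re-flattening the entire prefix. This is a sketch under that assumption, not verified against this codebase; whether it helps enough here would need re-benchmarking.

```
# Illustrative rewrite (not from the PR): explicit left recursion with a
# cheap append, instead of statement:* plus a flatten pass.
main -> stmt_list {% id %}

stmt_list ->
      null                {% () => [] %}
    | stmt_list statement {% ([list, s]) => list.concat([s]) %}
```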

loyaltypollution and others added 16 commits October 30, 2025 02:13
This merge integrates the WASM compiler from main while preserving
the new Nearley-based parser from NearleyDSL.

Key changes:
- Adapted pyRunner.ts to use new parser (Tokenizer + Parser + Resolver)
- Updated AST type definitions to avoid duplication
- Fixed resolver to use new validators subsystem
- Merged package dependencies for both parser and WASM compiler
- Removed old translator and parser error handling (superseded by Nearley)
- Updated test utilities to work with new parser architecture
- Kept WASM compiler (now uses new AST types from py-slang parser)

The CSE machine already expected the new AST types, so no changes
were needed there. Both the WASM compiler and CSE machine now consume
the unified AST generated by the new Nearley parser.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Fixed parser imports: use NearleyParser from parser-adapter
- Fixed Resolver constructor calls to match new signature (source, ast)
- Fixed WASM compiler imports and complex number parsing
- Fixed PyWasmEvaluator result type conversion
- Ensured all compilers work with new unified AST types

Build now completes successfully.
- Use NearleyParser from parser-adapter instead of non-existent Parser module
- Fix Resolver constructor to pass AST instead of chapter number

Parser and analysis tests now pass successfully.
Changed parser grammar to create BigIntLiteral nodes for integer literals
instead of Literal nodes with parsed numbers. This ensures integers stay
as bigint type throughout arithmetic operations instead of being converted
to float.

Now 23+1 returns 24 (bigint) instead of 24.0 (float).
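A hypothetical sketch of the change described in this commit (the node shape and helper name are assumptions, not the actual grammar code): the integer-literal post-processor emits a BigIntLiteral node carrying a bigint, so later arithmetic stays exact instead of passing through floating point.

```javascript
// Illustrative post-processor: wrap the raw token text in a bigint-typed node
// rather than Number(tok.value), which would coerce integers to floats.
const intLiteral = ([tok]) => ({
  type: "BigIntLiteral",
  value: BigInt(tok.value),
});

const node = intLiteral([{ value: "23" }]);
console.log(node.type);         // "BigIntLiteral"
console.log(node.value + 1n);   // 24n
console.log(typeof node.value); // "bigint"
```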
Tests now expect BigIntLiteral for integer literals instead of Literal,
reflecting the grammar changes to properly preserve integer types.
The issue was that we were creating a Tokenizer separately and passing
tokens to the Nearley parser, which has its own integrated lexer and
ignores the tokens parameter.

Reverted to using the parse() function from parser-adapter, which matches
the original implementation and allows Nearley to use its integrated lexer
correctly for handling indentation and nested structures.

This fixes the 'Unexpected token: else' error in nested if statements.


Development

Successfully merging this pull request may close these issues.

Hard-coded Tokenizer and Parser limit Python language extensibility
