Thomasgorissen/mlir and jl4-service speed-up #890

Merged
serrynaimo merged 9 commits into main from thomasgorissen/mlir
Apr 13, 2026

@serrynaimo
Collaborator

New MLIR compiler and 400x jl4-service evaluation speed-up

Bumps upload-artifact v5→v6, download-artifact v4→v5, and
softprops/action-gh-release v1→v2, which fixes the Node.js 20
deprecation warnings before the June 2026 deadline.

New package that lowers the typechecked L4 AST (Module Resolved) to
MLIR textual IR and drives the standard mlir-opt / mlir-translate /
llc / wasm-ld toolchain to produce .wasm binaries with JSON schema
sidecars compatible with jl4-service's FunctionSchema.

Highlights:
- Uniform f64 ABI across all L4 values (boxed ints/bools/pointers)
- Proper closure conversion (lambda-lifting) for WHERE and LET IN
- Arity-based name mangling for L4 overloads
- Auto-synthesized externs for unresolved prelude builtins
- CLI (jl4-mlir) with wasm / run / list subcommands, JSON-in/JSON-out
- Batch-compile script scans the repo and reports compile status
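
The closure-conversion item can be illustrated with a small sketch. It is not taken from the compiler: the function names and the `$` naming of lifted helpers are hypothetical. It only shows the general idea of lambda-lifting, where a local helper that captures variables from its enclosing scope becomes a top-level function that receives those variables as explicit parameters.

```javascript
// Before lifting: the helpers are local and capture `price` and `quantity`
// from the enclosing function (as a WHERE-bound helper would).
function totalNested(price, quantity) {
  const subtotal = () => price * quantity;
  const tax = () => subtotal() * 0.25;
  return subtotal() + tax();
}

// After lifting: each helper is a top-level function and every captured
// variable is passed explicitly. The `total$...` names are made up here.
function total$subtotal(price, quantity) {
  return price * quantity;
}

function total$tax(price, quantity) {
  return total$subtotal(price, quantity) * 0.25;
}

function totalLifted(price, quantity) {
  return total$subtotal(price, quantity) + total$tax(price, quantity);
}

console.log(totalNested(20, 3), totalLifted(20, 3)); // 75 75
```

After lifting, no function needs a runtime environment object, which is what makes this form convenient to emit as flat functions in a low-level IR.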

Batch results on the current tree: 379 of 431 files typecheck, with 0 MLIR
compile failures across the successfully typechecked set.

The Node-based runtime had several correctness gaps that caused wasm
outputs to diverge from jl4-service's /evaluation responses:

- Pointer args crossed the f64 ABI as numeric doubles, not as the
  matching u64 bit pattern the compiler's bitcast expects. Any
  function taking a struct / string / list produced garbage.
- Optional record fields (MAYBE NUMBER, MAYBE STRING, ...) were sent
  as flat values instead of a 2-slot {tag, payload} record.
- Enum fields inside structs were stored as u64 bit patterns, but the
  compiler expects numeric f64 for enum-tag comparisons.
- STRING and MAYBE return values were handed back raw; strings were
  never read from linear memory and MAYBE never decoded to JUST/NOTHING.
- The final JSON lacked the {tag:"SimpleResponse", contents:...}
  envelope that the HTTP service wraps responses in.
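
The first gap in the list above (pointers passed as numeric doubles instead of as a bit pattern) can be sketched as follows. This is a minimal illustration, not the PR's runtime code: the helper names `u64ToF64` / `f64ToU64` and the address `1024` are assumptions chosen for the example.

```javascript
// One 8-byte scratch buffer is enough to reinterpret bits in either direction.
const scratch = new DataView(new ArrayBuffer(8));

// Reinterpret a 64-bit unsigned integer (e.g. a linear-memory address) as an f64.
function u64ToF64(bits) {
  scratch.setBigUint64(0, BigInt.asUintN(64, BigInt(bits)), true);
  return scratch.getFloat64(0, true);
}

// Reinterpret an f64 as the 64-bit unsigned integer with the same bits.
function f64ToU64(value) {
  scratch.setFloat64(0, value, true);
  return scratch.getBigUint64(0, true);
}

const address = 1024;               // hypothetical pointer into linear memory
const asBits = u64ToF64(address);   // tiny subnormal double carrying the bits of 1024
const asNumber = address;           // the double 1024.0, a different bit pattern

console.log(f64ToU64(asBits));                  // 1024n
console.log(f64ToU64(asNumber).toString(16));   // 4090000000000000
```

A bitcast on the wasm side recovers `1024` only from `asBits`; the numeric double `1024.0` decodes to an unrelated address. Enum tags are the opposite case in the list above: they are compared as numbers, so they have to cross as ordinary numeric doubles.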

After this change 11 of 12 exports in the auth-proxy validation
fixture produce byte-identical JSON to the service; the remaining
one (order-total) differs only by a 1-ULP floating-point accumulation
rounding (510.44999999999993 vs 510.45).
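
A difference of this kind is what floating-point addition produces when the same terms are accumulated in a different order. The sketch below uses a well-known three-term example rather than the order-total data (which is not shown in this PR) and measures the gap in ULPs by comparing bit patterns. The helper names are illustrative.

```javascript
const view = new DataView(new ArrayBuffer(8));

// Raw IEEE-754 bits of a double, as a BigInt.
function bitsOf(x) {
  view.setFloat64(0, x);
  return view.getBigUint64(0);
}

// Distance in units in the last place; assumes finite inputs of the same sign.
function ulpDistance(a, b) {
  const d = bitsOf(a) - bitsOf(b);
  return d < 0n ? -d : d;
}

const leftToRight = (0.1 + 0.2) + 0.3; // 0.6000000000000001
const rightToLeft = 0.1 + (0.2 + 0.3); // 0.6

console.log(leftToRight === rightToLeft);           // false
console.log(ulpDistance(leftToRight, rightToLeft)); // 1n
```

A comparison that tolerates a distance of one ULP, instead of requiring byte-identical JSON, would treat the two results as equal.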

Documents the MLIR/WASM compiler backend: build steps, CLI, architecture,
the uniform f64 ABI, memory layout, scope limits, and a benchmark summary
against jl4-service (~900x faster, ~40% less RAM on the auth-proxy fixture).

The compiler now round-trips date/time values through the same JSON
wire format as jl4-service. On the auth-proxy datetime probe fixture
all six exports produce byte-identical responses.

Key changes:
- Runtime builtins for the full DATE / TIME / DATETIME primitive set
  (DATE_FROM_DMY, TIME_FROM_HMS, DATETIME_FROM_DTZ, accessors, parsers).
- Node runner: u64<->f64 reinterpret cast, ISO string parsing for
  'format: date' / 'format: time' schema fields, proper formatting for
  DATE / TIME / DATETIME returns, IANA timezone conversion via Intl,
  Proxy-wrapped imports so unused externs don't break instantiate.
- Lower.hs lowers dependency 'Decide' bodies (skipping decls that need
  higher-order support), intercepts all DATE_* / TIME_* / DATETIME_*
  primitives as runtime calls, adds FLOOR / CEILING / ROUND / ABS /
  POW / MIN / MAX intrinsics, compiles TIMEZONE IS <expr> into a
  zero-arg TIMEZONE() function.
- Emit.hs: global strings get a trailing NUL so readCString can walk
  to the terminator instead of running into the next global.
- wasm-server.mjs: standalone HTTP wrapper mirroring jl4-service.
- batch-compile.sh: cleans up per-file build artefacts after the run.
- test/fixtures/datetime-probe.l4: datetime regression fixture.
- dist-wasm/ added to .gitignore.
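
The Emit.hs item about trailing NULs can be illustrated with a small simulation. The `readCString` below is a hypothetical reimplementation (the real function's signature is not shown in this PR), and the byte array stands in for the wasm data segment that holds two adjacent global strings.

```javascript
// Walk from `ptr` to the first NUL byte (or end of memory) and decode as UTF-8.
function readCString(memoryBytes, ptr) {
  let end = ptr;
  while (end < memoryBytes.length && memoryBytes[end] !== 0) end++;
  return new TextDecoder("utf-8").decode(memoryBytes.subarray(ptr, end));
}

// Lay out strings back to back, optionally NUL-terminated, like adjacent globals.
function layout(strings, terminate) {
  const encoder = new TextEncoder();
  const bytes = [];
  for (const s of strings) {
    bytes.push(...encoder.encode(s));
    if (terminate) bytes.push(0);
  }
  return Uint8Array.from(bytes);
}

const withNul = layout(["Alice", "Bob"], true);
const withoutNul = layout(["Alice", "Bob"], false);

console.log(readCString(withNul, 0));    // "Alice"
console.log(readCString(withNul, 6));    // "Bob"
console.log(readCString(withoutNul, 0)); // "AliceBob": runs into the next global
```

Without the terminator the reader has no way to know where the first string ends, which is the failure the Emit.hs change addresses.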

Rounds out intrinsic coverage against jl4-core's builtin list:
Math (SQRT LN LOG10 SIN COS TAN ASIN ACOS ATAN EXPONENT TRUNC IS INTEGER),
Strings (STRINGLENGTH TOUPPER TOLOWER TRIM CONTAINS STARTSWITH ENDSWITH
INDEXOF CHARAT SUBSTRING REPLACE), TOSTRING / TONUMBER, nullary temporal
(TODAY NOW CURRENTTIME), JSONENCODE / JSONDECODE.

Everything is host-provided (JS in the CLI runner + wasm-server.mjs).
Verified against jl4-service on a new intrinsics-probe fixture:
sqrt/ln/toupper/contains/stringlength/replace/is-integer all return
byte-identical SimpleResponse payloads.

- Runtime.Builtins: extern declarations for every new name.
- Lower.hs: route each uppercase intrinsic to its __l4_* runtime call.
  TODAY / NOW / CURRENTTIME handled in the zero-arg App case.
- app/Main.hs + scripts/wasm-server.mjs: JS implementations honouring
  the uniform f64 ABI (u64<->f64 reinterpret at the string pointer
  boundary, mbJustF64 / mbNothing for MAYBE returns).
- test/fixtures/intrinsics-probe.l4: regression fixture.
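
The MAYBE handling and the response envelope mentioned in these commits can be sketched together. Everything concrete here is an assumption made for illustration: the tag values (0 for NOTHING, 1 for JUST), the 8-byte slot size, the shape of the decoded value, and the helper names. Only the 2-slot {tag, payload} layout and the `{tag: "SimpleResponse", contents: ...}` envelope come from the text above.

```javascript
const NOTHING_TAG = 0; // assumed encoding
const JUST_TAG = 1;    // assumed encoding
const SLOT_BYTES = 8;  // one f64 per slot under the uniform f64 ABI

// Write a MAYBE NUMBER as a 2-slot {tag, payload} record at `ptr`.
function writeMaybeNumber(view, ptr, value) {
  const isNothing = value === null || value === undefined;
  view.setFloat64(ptr, isNothing ? NOTHING_TAG : JUST_TAG, true);
  view.setFloat64(ptr + SLOT_BYTES, isNothing ? 0 : value, true);
}

// Decode the record back into a JS value (null stands in for NOTHING here).
function readMaybeNumber(view, ptr) {
  const tag = view.getFloat64(ptr, true);
  return tag === JUST_TAG ? view.getFloat64(ptr + SLOT_BYTES, true) : null;
}

// Wrap a decoded result the way the HTTP service wraps its responses.
function wrapResponse(contents) {
  return { tag: "SimpleResponse", contents };
}

const memory = new DataView(new ArrayBuffer(64)); // stand-in for linear memory
writeMaybeNumber(memory, 0, 42);
writeMaybeNumber(memory, 16, null);

console.log(JSON.stringify(wrapResponse(readMaybeNumber(memory, 0))));
console.log(JSON.stringify(wrapResponse(readMaybeNumber(memory, 16))));
```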

The ~450 lines of ABI marshaling and intrinsic JS were duplicated across
the generateNodeRunner template in app/Main.hs and scripts/wasm-server.mjs.
They are now consolidated into runtime/jl4-runtime.mjs behind a single
createRuntime() factory.

- runtime/jl4-runtime.mjs: authoritative runtime — u64<->f64 reinterpret,
  allocator, DATE / TIME / DATETIME helpers, marshaling, unmarshaling,
  Proxy-wrapped env import table, invokeFunction.
- scripts/wasm-server.mjs: drops 400+ lines, just imports createRuntime.
  Output key order matches Aeson (alphabetical) so responses stay
  byte-identical to jl4-service.
- app/Main.hs: embeds runtime/jl4-runtime.mjs at compile time via
  file-embed. generateNodeRunner shrinks to a launcher that reads the
  wasm, calls createRuntime, invokes the function. Node now runs with
  --input-type=module so the embedded 'export' is legal.
- jl4-mlir.cabal: adds file-embed dep; declares the runtime as
  extra-source-files so TH re-embeds after edits.
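
The "Proxy-wrapped env import table" can be sketched as below. This is an assumption about the general technique, not the contents of runtime/jl4-runtime.mjs: `makeImportTable` and the `__l4_sqrt` entry are hypothetical names (only the `__l4_*` prefix appears in this PR). The idea is that any import the module declares but the host does not implement resolves to a stub that fails only if it is actually called.

```javascript
// Build an import table where unknown names resolve to throwing stubs.
function makeImportTable(implementations) {
  return new Proxy(implementations, {
    get(target, name) {
      if (name in target) return target[name];
      // Unknown extern: hand back a callable so instantiation can succeed.
      return () => {
        throw new Error(`extern ${String(name)} is not implemented by the host`);
      };
    },
    has() {
      return true; // report every name as present
    },
  });
}

const env = makeImportTable({ __l4_sqrt: Math.sqrt });

console.log(env.__l4_sqrt(9));            // 3
console.log(typeof env.__l4_never_used);  // "function"
```

Such a table would be passed as `{ env }` in the import object given to `WebAssembly.instantiate`, which looks up each declared import by property access and therefore goes through the `get` trap.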

Verified against jl4-service on the datetime and intrinsics probes via
both the CLI and the HTTP wrapper: byte-identical SimpleResponse
payloads. Batch-compile regression: still 0 failures.
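
Byte-identical output depends on the key order noted for wasm-server.mjs. A minimal sketch of serializing with alphabetically sorted keys follows; `stableStringify` is a hypothetical helper, it assumes JSON-safe input (no `undefined`, functions, or cycles), and it sorts by UTF-16 code units, which matches alphabetical order for ASCII keys.

```javascript
// Serialize to JSON with object keys sorted at every nesting level.
function stableStringify(value) {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const members = Object.keys(value)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(value[key])}`);
    return `{${members.join(",")}}`;
  }
  return JSON.stringify(value);
}

const response = { tag: "SimpleResponse", contents: { total: 3, items: [2, 1] } };
console.log(stableStringify(response));
// {"contents":{"items":[2,1],"total":3},"tag":"SimpleResponse"}
```

Array element order is left untouched; only object members are reordered.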

Before: any /evaluation call whose arguments included a record, an
enum string, or a MAYBE-typed field triggered the wrapper path:
generate a text snippet '#EVAL foo(JSONDECODE "...")', concatenate it
onto the original source, and re-run the full Shake pipeline (parse,
typecheck, evaluate) for each request. The auth-proxy bench measured
~160 ms per call on anything taking a record input.

After: evaluateDirectAST handles everything except genuine
FnUncertain / FnUnknown outside a MAYBE slot, and the wrapper fallback
disappears for non-deontic functions. Same fixture now runs ~0.2-0.3
ms per call — 466x to 810x faster per function, 523x faster on the
1200-call aggregate (159 s -> 304 ms).
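
The aggregate figure can be checked directly from the numbers quoted above (159 s before, 304 ms after, 1200 calls):

```javascript
const calls = 1200;
const beforeMs = 159 * 1000; // 159 s for the whole run, in milliseconds
const afterMs = 304;

const speedup = beforeMs / afterMs;        // about 523
const beforePerCallMs = beforeMs / calls;  // 132.5
const afterPerCallMs = afterMs / calls;    // about 0.25

console.log(Math.round(speedup), beforePerCallMs, afterPerCallMs.toFixed(3));
```

The after-change average of roughly 0.25 ms per call sits inside the quoted 0.2-0.3 ms range. The before-change average is an aggregate over all 1200 calls, so it need not equal the ~160 ms quoted for calls taking a record input.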

- fnLiteralToExprTyped: recursive converter that consults a per-call
  'ModuleInfo' (record + enum declarations) to build AST values of the
  expected type: FnObject -> App recordCtorRef [fieldsInDeclOrder],
  FnLitString + enum type -> App variantRef [], missing/null against
  a MAYBE -> NOTHING, anything else against MAYBE -> JUST (recurse).
  ISO 'YYYY-MM-DD' / 'HH:MM:SS' / ISO-8601 datetime strings are parsed
  here and emitted as DATE_FROM_DMY / TIME_FROM_HMS / DATETIME_FROM_DTZ
  calls so the evaluator sees real temporal values.
- buildModuleInfo: walks the typechecked Module Resolved once per call,
  indexing records by type unique (to constructor Resolved + field
  list) and enums by type unique (to variant-name -> Resolved).
- requiresWrapperEvaluation: trimmed to just FnUncertain / FnUnknown
  (and arrays / records containing them). Everything else goes direct.
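
The type-directed conversion described for fnLiteralToExprTyped can be sketched in miniature. The real code is Haskell and works on the typechecked AST; the JavaScript below only mirrors the rules listed above. The type descriptors, the `Person` / `Status` declarations, the expression shape `{app, args}`, and the day-month-year argument order for DATE_FROM_DMY are all assumptions made for this example.

```javascript
// Hypothetical stand-in for ModuleInfo: record fields in declaration order, enum variants.
const moduleInfo = {
  records: {
    Person: [
      ["name", { kind: "string" }],
      ["born", { kind: "date" }],
      ["nickname", { kind: "maybe", of: { kind: "string" } }],
      ["status", { kind: "enum", name: "Status" }],
    ],
  },
  enums: { Status: ["Active", "Suspended"] },
};

const app = (fn, args = []) => ({ app: fn, args });
const lit = (value) => ({ lit: value });

// Convert a JSON value into an expression of the expected type.
function toExpr(value, type, info) {
  switch (type.kind) {
    case "maybe": // missing/null -> NOTHING, anything else -> JUST (recurse)
      return value === null || value === undefined
        ? app("NOTHING")
        : app("JUST", [toExpr(value, type.of, info)]);
    case "record": // object -> constructor applied to fields in declaration order
      return app(
        type.name,
        info.records[type.name].map(([field, fieldType]) => toExpr(value[field], fieldType, info))
      );
    case "enum": // string naming a variant -> nullary constructor
      if (!info.enums[type.name].includes(value)) throw new Error(`unknown variant ${value}`);
      return app(value);
    case "date": { // ISO 'YYYY-MM-DD' -> DATE_FROM_DMY call
      const m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
      if (!m) throw new Error(`not an ISO date: ${value}`);
      return app("DATE_FROM_DMY", [lit(Number(m[3])), lit(Number(m[2])), lit(Number(m[1]))]);
    }
    default:
      return lit(value);
  }
}

const input = { status: "Active", born: "1815-12-10", name: "Ada" };
console.log(JSON.stringify(toExpr(input, { kind: "record", name: "Person" }, moduleInfo)));
```

Because the expected type drives every step, the JSON key order of the input does not matter and a missing optional field needs no special marker.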

Verified on 25 probe cases across the auth-proxy test.l4 and the
jl4-mlir datetime / intrinsics fixtures: all responses byte-identical
to the pre-change service output.

@serrynaimo serrynaimo merged commit 023f08a into main Apr 13, 2026
@serrynaimo serrynaimo deleted the thomasgorissen/mlir branch April 13, 2026 01:15