fix(desktop): inline-math classifier gaps + display-math/blockquote rendering bugs#4543

Open
lightfront wants to merge 7 commits into
esengine:main-v2 from
lightfront:fix/inline-math-classifier-groups

@lightfront lightfront commented Jun 15, 2026


Fix: inline-math classifier gaps + display-math/blockquote rendering bugs

Overview

This PR fixes eight rendering bugs in the math pipeline — six in the inline-math classifier (mathClassify.ts) and two in the display-math pre-pass (mathNormalize.ts). All eight produce visible KaTeX failures (red error blocks or raw $...$ text) in LLM chat output.


Classifier fixes (mathClassify.ts)

Six common math forms rendered as literal dollar signs because the inline-math classifier rejected them:

1. LaTeX command immediately followed by a digit — $\tfrac12$, $\sqrt2$, $\log3$

// Before (bug): \b is a word boundary, but \tfrac12 has no boundary
// between "tfrac" and "12" — both are word characters.
if (/\\[A-Za-z]+\b/.test(math)) return true;
// After (fix):
if (/\\[A-Za-z]+/.test(math)) return true;
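The effect of dropping \b can be checked directly against the examples from this PR. A minimal sketch; the constant names are illustrative and not taken from mathClassify.ts:

```typescript
// Illustrative names; only the two regexes come from the diff above.
const withBoundary = /\\[A-Za-z]+\b/;
const withoutBoundary = /\\[A-Za-z]+/;

const commandSamples = ["\\tfrac12", "\\sqrt2", "\\log3", "\\alpha", "\\frac{x}{y}"];

// One row per sample: does each pattern accept it?
const boundaryResults = commandSamples.map((sample) => ({
  sample,
  before: withBoundary.test(sample),
  after: withoutBoundary.test(sample),
}));

console.log(boundaryResults);
```

The digit-suffixed commands fail only the old pattern, while \alpha and \frac{x}{y} pass both, since a non-word character or the end of the string follows the command name there.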

2. Multi-letter group notation — $SO(3,1)$, $SU(2)$, $SL(2)$, $GL(n)$

// Before (bug): only single-letter names accepted (f(x), g(x))
if (/^[A-Za-z]\s*\([^)]{1,80}\)$/.test(math)) return true;
// After (fix): 1–6 letter identifiers
if (/^[A-Za-z]{1,6}\s*\([^)]{1,80}\)$/.test(math)) return true;
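A quick way to see what the 1–6 letter cap admits and excludes. This is only a sketch with illustrative names; the long-identifier sample is an assumption added to show the cap, not a case from the test suite:

```typescript
// Illustrative names; the two regexes are the before/after rules from the diff.
const singleLetterCall = /^[A-Za-z]\s*\([^)]{1,80}\)$/;
const shortIdentCall = /^[A-Za-z]{1,6}\s*\([^)]{1,80}\)$/;

// "Diffeomorphism(M)" is a made-up sample that exceeds the 6-letter cap.
const callSamples = ["f(x)", "SO(3,1)", "SU(2)", "Spin(n)", "Diffeomorphism(M)"];

for (const sample of callSamples) {
  console.log(sample, {
    before: singleLetterCall.test(sample),
    after: shortIdentCall.test(sample),
  });
}
```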

3. Permutation cycle notation — $(12)$, $(123)$, $(12)(34)$

// New rule: one or more parenthesised all-digit groups
if (/^(?:\(\d+\))+$/.test(math)) return true;
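As a sanity check, the rule accepts packed digit groups and nothing else. A minimal sketch; the variable names and the negative samples are illustrative assumptions:

```typescript
// The regex is the new rule from the diff; everything else is illustrative.
const cycleNotation = /^(?:\(\d+\))+$/;

const cycleSamples = ["(12)", "(123)", "(12)(34)", "(1, 2, 3)", "(a b)", "12"];

const cycleResults = cycleSamples.map((sample) => ({
  sample,
  accepted: cycleNotation.test(sample),
}));

console.log(cycleResults);
```

Comma tuples such as (1, 2, 3) do not match here, so they would have to be accepted by a separate rule.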

4. Pure numbers rejected — $0$, $1$, $42$

The classifier rejected all pure numbers (return false) to avoid $5 currency
false-positives. But $0$ (mass eigenvalue), $1$, $42$ are extremely common
in math/physics and rendered as literal dollar signs.

// Before (bug): rejects ALL pure numbers
if (/^\d+(?:\.\d+)?%?$/.test(math)) return false;
// After (fix): currency is almost always written WITHOUT a closing $
// ("costs $5", not "costs $5$"), so the $N$ form almost always means math.
if (/^\d+(?:\.\d+)?%?$/.test(math)) return true;

Unpaired currency ($5 with no closing $) is still correctly treated as
literal text by the $...$ pairing logic, so "costs $5 and $6" is unaffected.
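To illustrate why the flip is narrow, the sketch below runs the pure-number pattern on a few span contents. It assumes, purely for illustration, that a naive pairing of "costs $5 and $6" would offer the text between the two dollar signs as a candidate; the real pairing logic is not shown in this PR text:

```typescript
// The regex is the rule from the diff; names and samples are illustrative.
const pureNumber = /^\d+(?:\.\d+)?%?$/;

// Contents of closed $...$ pairs that the rule now accepts.
const closedPairContents = ["0", "1", "42", "3.14", "50%"];

// Assumption: what a naive pairing of "costs $5 and $6" would hand over.
const naiveCurrencyCandidate = "5 and ";

console.log(closedPairContents.map((s) => pureNumber.test(s)));
console.log(pureNumber.test(naiveCurrencyCandidate));
```

Under that assumption the currency sentence still fails this rule, because the candidate is not a bare number.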

5. Binary operator with signed RHS — $K = -iJ$, $p = +\alpha$

The binary-operator regex required an operand character ([A-Za-z0-9...])
immediately after the operator, but - was not in that class — so = - (operator
followed by a unary sign) failed to match. This rejected common physics
expressions like $K = -iJ$, $a = -b$, $p = +\alpha$.

// Before (bug): K = -iJ fails (no operand right after =)
/[A-Za-z0-9)\]}]\s*[+\-*/=<>]\s*[A-Za-z0-9([{\\]/
// After (fix):  allows an optional sign before the RHS operand
/[A-Za-z0-9)\]}]\s*[+\-*/=<>]\s*[+\-]?\s*[A-Za-z0-9([{\\]/
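The two patterns can be compared on the expressions named in this PR. A minimal sketch with illustrative constant names:

```typescript
// Both regexes are copied from the diff above; the names are illustrative.
const binaryOpBefore = /[A-Za-z0-9)\]}]\s*[+\-*/=<>]\s*[A-Za-z0-9([{\\]/;
const binaryOpAfter = /[A-Za-z0-9)\]}]\s*[+\-*/=<>]\s*[+\-]?\s*[A-Za-z0-9([{\\]/;

const signedSamples = ["K = -iJ", "p = +\\alpha", "a = -b", "a = b", "a-b"];

for (const sample of signedSamples) {
  console.log(sample, {
    before: binaryOpBefore.test(sample),
    after: binaryOpAfter.test(sample),
  });
}
```

Unsigned expressions such as a = b match both patterns, so the change only widens the rule for a sign directly after the operator.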

6. Lone operators — $+$, $-$, $=$, $<$

Single-character operator tokens (+, -, =, <, >, ±, ∓) matched no
accept rule and were rendered as literal text. These are common in physics prose
("the sign of $+$", "the $<$ relation").

if (/^[+\-=<>±∓]$/.test(math)) return true;
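Taken together, the six accept rules can be sketched as a small table-driven check. This is a hypothetical helper, not the structure of mathClassify.ts: the real classifier has further accept and reject rules (for example the currency and env-var guards), and the lone-operator pattern below uses the form with ∓ given in the commit message:

```typescript
// Hypothetical helper covering only the six rules described in this PR.
const sixRules: ReadonlyArray<[string, RegExp]> = [
  ["latex command", /\\[A-Za-z]+/],
  ["function call / group", /^[A-Za-z]{1,6}\s*\([^)]{1,80}\)$/],
  ["cycle notation", /^(?:\(\d+\))+$/],
  ["pure number", /^\d+(?:\.\d+)?%?$/],
  ["binary operator", /[A-Za-z0-9)\]}]\s*[+\-*/=<>]\s*[+\-]?\s*[A-Za-z0-9([{\\]/],
  ["lone operator", /^[+\-=<>±∓]$/],
];

// Returns the name of the first rule that accepts the span, or null.
function matchedRule(math: string): string | null {
  for (const [name, pattern] of sixRules) {
    if (pattern.test(math)) return name;
  }
  return null;
}

for (const span of ["\\tfrac12", "SO(3,1)", "(12)(34)", "42", "K = -iJ", "+", "PATH"]) {
  console.log(span, "->", matchedRule(span));
}
```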

Display-math fixes (mathNormalize.ts)

Two bugs in the $$...$$ display-math pipeline caused cascading KaTeX errors — particularly inside Markdown blockquotes.

7. Repair-before-extraction ordering — closing $$ split off

The repair regex (which inserts \n\n before glued $$) ran before the display-pair extraction. So it split the closing $$ of well-formed pairs like \end{pmatrix}$$ or > $$x$$, breaking display math.

Fix: extract $$…$$ pairs as a unit first, then repair only remaining orphaned $$. This is the d450aec1 ordering fix applied to current main-v2.
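The ordering issue can be reproduced with a small placeholder-based sketch. All names and both regexes below are assumptions made for illustration; the actual patterns in mathNormalize.ts are not quoted in this PR text:

```typescript
// Hypothetical stand-ins for the real regexes in mathNormalize.ts.
const DISPLAY_PAIR = /\$\$([\s\S]+?)\$\$/g;

// Assumed repair step: insert a blank line before a "$$" glued to preceding text.
function repairGlued(text: string): string {
  return text.replace(/(\S)\$\$/g, (_match, prev: string) => prev + "\n\n$$");
}

// Fixed order: park well-formed pairs, repair the rest, then restore the pairs.
function extractThenRepair(text: string): string {
  const saved: string[] = [];
  // 1. Park every well-formed $$…$$ pair behind a placeholder.
  const parked = text.replace(DISPLAY_PAIR, (pair) => {
    saved.push(pair);
    return `\u0000DM${saved.length - 1}\u0000`;
  });
  // 2. Repair only what is left, i.e. orphaned "$$".
  const repaired = repairGlued(parked);
  // 3. Put the untouched pairs back.
  return repaired.replace(/\u0000DM(\d+)\u0000/g, (_m, i: string) => saved[Number(i)]);
}

const wellFormedPair = "$$\\begin{pmatrix}1\\end{pmatrix}$$";
console.log(JSON.stringify(repairGlued(wellFormedPair)));       // repair first: closing $$ is split off
console.log(JSON.stringify(extractThenRepair(wellFormedPair))); // extract first: pair is unchanged
```

Running the repair on its own stands in for the old order, since the pair is already broken by the time extraction would run.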

8. Blockquote display-math fence — remark-math upstream limitation

remark-math has a known limitation: multi-line $$...$$ display math inside a Markdown blockquote (>) breaks inline math parsing on subsequent blockquote lines. The closing fence isn't recognised and inline $...$ after the display block gets swallowed.

Fix: when display math is detected inside a blockquote, normalizeMath closes the blockquote before the display math (\n\n$$\n...\n$$\n\n> ) and reopens it after. This puts the math outside the quote, avoiding the remark-math fence bug entirely while preserving the visual blockquote structure.
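The rewrite described above can be sketched line by line. This is a simplified, hypothetical version that assumes each $$ fence sits on its own quoted line; the function name is invented and the real normalizeMath logic may differ:

```typescript
// Hypothetical sketch: move display math out of a blockquote by closing the
// quote before the opening fence and letting it resume after the closing fence.
function breakQuoteAroundDisplayMath(markdown: string): string {
  const out: string[] = [];
  let inDisplay = false;
  for (const line of markdown.split("\n")) {
    const quoted = /^>\s?/.test(line);
    const body = line.replace(/^>\s?/, "");
    const isFence = body.trim() === "$$";
    if (quoted && isFence && !inDisplay) {
      out.push("", "$$"); // blank line ends the quote, then the opening fence
      inDisplay = true;
    } else if (inDisplay && isFence) {
      out.push("$$", ""); // closing fence, then a blank line before the quote resumes
      inDisplay = false;
    } else if (inDisplay) {
      out.push(body); // math content with the quote marker stripped
    } else {
      out.push(line);
    }
  }
  return out.join("\n");
}

const quotedInput = ["> Theorem.", "> $$", "> x = 1", "> $$", "> where $x$ is real."].join("\n");
console.log(breakQuoteAroundDisplayMath(quotedInput));
```

The output has the \n\n$$\n...\n$$\n\n> shape given in the fix description, with the following quoted line starting a new blockquote.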


Verification

| Check | Result |
|---|---|
| math-golden suite | 215/215 (was 189, +26 regression tests + 10 updated expectations) |
| Typecheck | 0 errors |
| $\tfrac12$, $\sqrt2$, $\log3$ | ✅ now renders |
| $SO(3,1)$, $SU(2)$, $SL(2)$, $GL(n)$ | ✅ now renders |
| $(12)$, $(123)$, $(12)(34)$ | ✅ now renders |
| $0$, $1$, $42$ | ✅ now renders |
| $K = -iJ$, $p = +\alpha$ | ✅ now renders |
| $+$, $-$, $=$, $<$ | ✅ now renders |
| the apples cost $5 and $6. (currency) | ✅ stays literal (no false-positive) |
| Haar theorem blockquote (display + inline) | ✅ 0 errors |
| Mackey theorem blockquote (display + inline after) | ✅ 0 errors |
| $$\end{pmatrix}$$ (closing brace) | ✅ closing $$ preserved |
| $PATH$, costs $5 (unpaired) | ✅ still literal |
| costs $5$ (closed pair) | now renders as math (by design; see §4) |

Files changed

| File | Change |
|---|---|
| desktop/frontend/src/components/mathClassify.ts | Drop \b; broaden function-call rule to 1–6 letters; add cycle-notation rule; accept pure numbers; signed RHS in binary-op rule; lone operators |
| desktop/frontend/src/components/mathNormalize.ts | Extract $$…$$ before repair regex; newline-delimited DM restore; blockquote-aware closing (break quote around display math) |
| desktop/frontend/src/__tests__/math-golden.test.ts | +18 classifier regression tests; 7 display-math test expectations updated for corrected behavior |

Commits

  • 792df63c — multi-letter group notation (SO(3,1), SU(2), …) classified as inline math
  • 6c64bc91 — classify LaTeX commands followed by a digit (\tfrac12, \sqrt2) as math
  • (cycle notation — added on this branch)
  • 2b050155 — classify pure numbers ($0$, $1$, $42$) as inline math
  • 7ceeebfa — classify operators with signed RHS ($K = -iJ$) and lone operators ($+$)
  • 6f7ef086 — repair display-math extraction ordering + blockquote $$ closing
  • 84e6ac81 — break blockquote around display math to avoid remark-math fence bug

…ne math

The inline-math classifier's function-call rule only accepted a single
letter before the parentheses (f(x), g(x)), so multi-letter group
notation — SO(3,1), SU(2), SL(2), GL(n), Sp(2n), Spin(n), Diff(M) —
fell through to the prose fallback and rendered as literal dollar signs
instead of going through remark-math/rehype-katex.

Broaden the identifier from one letter to 1-6 letters. The cap plus the
requirement that the whole token sits inside one $...$ span keep prose
parentheticals out.

Adds 9 regression cases to the math-golden suite. 177/177 pass.

The classifier's backslash-command rule used \b after the command name:
  if (/\\[A-Za-z]+\b/.test(math)) return true;

\b is a word boundary, but \tfrac12 / \frac12 / \sqrt2 / \log3 /
\overline3 have no boundary between the name and a trailing digit
('c' and '1' are both word chars), so the regex rejected them and
these common LaTeX forms rendered as literal dollar signs instead of
going through remark-math/rehype-katex.

Drop the \b — a backslash command is a backslash command regardless of
what follows. \alpha, \frac{x}{y}, \cdot 3 (all already passing) are
unaffected; the currency / env-var guards below catch any new
false-positives.

+5 regression cases. 182/182 pass.

github-actions bot added labels Jun 15, 2026: v2 (Go rewrite (1.x), main-v2 branch, active development); desktop (Wails desktop app, desktop/**)

The inline-math classifier accepted comma-separated tuples like (A, B)
or (1, 2, 3), but not permutation cycle notation — (12), (123),
(12)(34) — where digits are packed together without commas. These fell
through every accept rule and returned false, so normalizeMath wrapped
them in &#36; entities and they rendered as literal dollar signs
instead of going through remark-math/rehype-katex.

Add a rule that accepts one or more parenthesised digit groups:
  if (/^(?:\(\d+\))+$/.test(math)) return true;

This is specific enough to avoid false positives: a parenthesised
all-digit group is currency-shaped in prose only when it looks like
(5), which is rare and never written inside $...$. Comma tuples
continue through the existing rule.

+4 regression cases (transposition, 3-cycle, product-of-transpositions,
and an end-to-end normalizeMath check). 207/207 pass.
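The effect of a rejected span, as this commit message describes it, can be sketched as follows. The function name, the pairing regex and the injected classifier are all assumptions for illustration; only the &#36; entity and the cycle-notation regex come from the text above:

```typescript
// Hypothetical sketch: spans the classifier rejects get their dollar signs
// replaced with &#36; so they render literally; accepted spans are left alone.
function escapeRejectedSpans(text: string, isMath: (span: string) => boolean): string {
  return text.replace(/\$([^$\n]+)\$/g, (whole, inner: string) =>
    isMath(inner) ? whole : `&#36;${inner}&#36;`,
  );
}

// Stand-in classifier that knows only the cycle-notation rule.
const cycleOnly = (span: string): boolean => /^(?:\(\d+\))+$/.test(span);

console.log(escapeRejectedSpans("swap $(12)$ and $PATH$", cycleOnly));
```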
@lightfront changed the title from "fix(desktop): render LaTeX-command+digit and multi-letter group notation as inline math" to "fix(desktop): render LaTeX+digit, group & cycle notation as inline math" on Jun 16, 2026
…ing delimiters

Two related bugs in the display-math pipeline (mathNormalize.ts):

1. Repair-before-extraction ordering: the repair regex (which inserts
   \n\n before glued $$) ran BEFORE the display-pair extraction,
   so it split the closing $$ of well-formed pairs like
   `\end{pmatrix}$$` or `> $$x$$`, breaking display math.
   Fix: extract $$…$$ pairs as a unit FIRST, then repair remaining
   orphaned $$.

2. Blockquote display-math: when $$...$$ appears inside a Markdown
   blockquote (>), remark-math requires the closing $$ to also carry
   the > prefix. Without it, the closing fence isn't recognised and
   remark-math swallows the following paragraph as math content
   (KaTeX error). Fix: the DM-restore step now detects whether the
   opening $$ was inside a blockquote and prefixes the closing $$
   with > accordingly.

Also restores the d450aec newline-delimited DM restore ($$\n…\n$$)
which remark-math requires for block-math recognition, and updates 7
test expectations to match the corrected behavior.

203/203 pass.
@lightfront

Added: fix for display math inside blockquotes + closing-delimiter splitting

While testing, a user reported that a blockquote (>) containing $$...$$ display math caused a cascading KaTeX failure that rendered the entire following paragraph as broken red math. Investigation found two related bugs in mathNormalize.ts:

1. Repair-before-extraction ordering

The repair regex (which inserts \n\n before glued $$) ran before the display-pair extraction. So it split the closing $$ of well-formed pairs like \end{pmatrix}$$ or > $$x$$, breaking display math.

Fix: extract $$…$$ pairs as a unit first, then repair only the remaining orphaned $$. This is the d450aec1 ordering fix applied to current main-v2.

2. Blockquote display-math fence

When $$...$$ appears inside a Markdown blockquote (>), remark-math requires the closing $$ to also carry the > prefix. Without it, the closing fence isn't recognised and remark-math swallows the following paragraph as math content.

Fix: the DM-restore step now detects whether the opening $$ was inside a blockquote and prefixes the closing $$ with > accordingly.

Test status

  • 207/207 pass (was 203, +4 updated expectations for the corrected display-math behavior)
  • Typecheck: 0 errors
  • The full "locally compact / Haar's theorem" text that triggered the bug now renders with 0 errors

…th bug

remark-math has a known limitation: multi-line $$...$$ display math
inside a Markdown blockquote (>) breaks inline math parsing on
subsequent blockquote lines. The closing fence isn't recognised and
inline $...$ after the display block gets swallowed.

Workaround: when display math is detected inside a blockquote, close
the blockquote before the display math (\n\n$$\n...\n$$\n\n> ) and
reopen it after. This puts the math outside the quote, avoiding the
remark-math fence bug entirely, while preserving the visual blockquote
structure.

The Mackey theorem blockquote with inline math after display math now
renders with 0 errors (was: cascading KaTeX failure).

203/203 pass.
@lightfront changed the title from "fix(desktop): render LaTeX+digit, group & cycle notation as inline math" to "fix(desktop): inline-math classifier gaps + display-math/blockquote rendering bugs" on Jun 16, 2026

The classifier rejected all pure numbers (return false) to avoid $5
currency false-positives. But $0$ (mass eigenvalue), $1$, $42$
are extremely common in math/physics and were rendering as literal
dollar signs.

Currency in prose is almost always written without a closing $ (costs
$5, not costs $5$), so the $N$ form almost always means math.

Flip return false → return true for pure numbers. Update 10 test
expectations to match the corrected behavior. 203/203 pass.
…s math

Two classifier gaps in the binary-operator rule:

1. 'K = -iJ' rejected: the regex required an operand ([A-Za-z0-9...])
   immediately after the operator, but '-' (unary sign) was not in
   that class. Fix: allow an optional [+\-]? after the operator so
   'K = -iJ', 'p = +\alpha', 'a = -b' are recognised.

2. '$+$', '$-$', '$=$' rejected: lone operators had no matching rule.
   Fix: add /^[+\-=<>±∓]$/ for single-character operator tokens.

+8 regression tests. 211/211 pass.