@shreyasminocha shreyasminocha commented Oct 10, 2025

Issues

Resolves #56.

Description

Hacked together a quick MVP of a Harper LaTeX parser based on TexLab's crate.

Early comments, suggestions, and pointers are welcome.

Known issues

  • lint spans are incorrect once multi-byte code points appear in the document
  • a newline followed by two (indent) spaces in the middle of a sentence is treated as two spaces
  • \textit{word}s is tokenized as [word, s] rather than as a single word
  • there's a lint that suggests replacing ---, but IMO it shouldn't apply to LaTeX since it's rendered as an em dash anyway
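
On the first known issue: one plausible cause (an assumption, not confirmed in this thread) is that the parser reports byte offsets into the UTF-8 source while lint spans are counted in characters. Under that assumption, a lookup table built once per document can translate between the two. The function names below are hypothetical and are not part of Harper or TexLab.

```rust
/// Build a table mapping every byte offset in `source` to a char index.
/// The table has `source.len() + 1` entries so that an exclusive end
/// offset equal to `source.len()` can also be looked up.
fn byte_to_char_table(source: &str) -> Vec<usize> {
    let mut table = Vec::with_capacity(source.len() + 1);
    for (char_idx, ch) in source.chars().enumerate() {
        // Every byte belonging to this char maps to the char's index.
        for _ in 0..ch.len_utf8() {
            table.push(char_idx);
        }
    }
    // One-past-the-end entry: the total number of chars.
    table.push(source.chars().count());
    table
}

/// Convert a half-open byte span into a half-open char span.
/// Assumes `start` and `end` are valid byte offsets into the source.
fn byte_span_to_char_span(table: &[usize], start: usize, end: usize) -> (usize, usize) {
    (table[start], table[end])
}
```

For example, in "café au lait" the word "au" occupies bytes 6..8 but characters 5..7, because "é" takes two bytes in UTF-8.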

@hippietrail
Collaborator

I would say that comments should be linted, since that's what's explicitly linted in programming languages. But I'm not sure which of the markup languages we support have comments, or whether those are linted.

@shreyasminocha
Author

How do you recommend I handle the French spaces and dash lints?

For the former, one solution might be to replace [newline, space+] in the Harper tokens with just a single space token (with a span size of 1), but I'm not sure if that would cause any other issues.
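
A minimal sketch of that token rewrite, using simplified stand-in types rather than Harper's real token API (`WsKind`, `WsToken`, and `collapse_newline_indent` are all hypothetical names). A newline followed by one or more space tokens becomes a single space token whose span covers only the newline character.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WsKind {
    Word,
    Space,
    Newline,
}

/// Stand-in token with a half-open span `start..end` into the source.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct WsToken {
    kind: WsKind,
    start: usize,
    end: usize,
}

/// Replace every `[newline, space+]` run with one space token of span size 1.
/// A newline that is not followed by a space is left untouched.
fn collapse_newline_indent(tokens: &[WsToken]) -> Vec<WsToken> {
    let mut out = Vec::with_capacity(tokens.len());
    let mut i = 0;
    while i < tokens.len() {
        let tok = tokens[i];
        // Find the end of the run of space tokens directly after a newline.
        let mut j = i + 1;
        if tok.kind == WsKind::Newline {
            while j < tokens.len() && tokens[j].kind == WsKind::Space {
                j += 1;
            }
        }
        if j > i + 1 {
            // At least one space followed the newline: emit a single space
            // anchored on the newline character.
            out.push(WsToken { kind: WsKind::Space, start: tok.start, end: tok.start + 1 });
            i = j;
        } else {
            out.push(tok);
            i += 1;
        }
    }
    out
}
```

One consequence of this design is that the indent characters no longer belong to any token, so anything that relies on token spans covering the source contiguously would need checking. The sketch also does not special-case blank lines, which LaTeX treats as paragraph breaks.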

The latter I don't think really applies to LaTeX either, since --- is semantically equivalent to an em dash. I could do a string replace on the original document, but that would mess up the spans. I could also just replace three consecutive hyphen (Harper) tokens that are preceded and followed by non-hyphens with an em dash token.
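
The token-level option could look roughly like this. Again the types are simplified stand-ins, not Harper's actual API, and `DashKind`, `DashToken`, and `merge_triple_hyphens` are hypothetical names. Because the merged token keeps the span of the three original hyphens, later spans still line up with the source.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DashKind {
    Word,
    Hyphen,
    EmDash,
}

/// Stand-in token with a half-open span `start..end` into the source.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct DashToken {
    kind: DashKind,
    start: usize,
    end: usize,
}

/// Merge each run of exactly three hyphen tokens into one em dash token.
/// Runs of any other length (1, 2, 4, ...) are passed through unchanged.
fn merge_triple_hyphens(tokens: &[DashToken]) -> Vec<DashToken> {
    let mut out = Vec::with_capacity(tokens.len());
    let mut i = 0;
    while i < tokens.len() {
        // Measure the maximal run of hyphens starting at `i`. Measuring the
        // whole run guarantees it is bounded by non-hyphens on both sides.
        let mut j = i;
        while j < tokens.len() && tokens[j].kind == DashKind::Hyphen {
            j += 1;
        }
        let run = j - i;
        if run == 3 {
            out.push(DashToken {
                kind: DashKind::EmDash,
                start: tokens[i].start,
                end: tokens[j - 1].end,
            });
            i = j;
        } else if run > 0 {
            out.extend_from_slice(&tokens[i..j]);
            i = j;
        } else {
            out.push(tokens[i]);
            i += 1;
        }
    }
    out
}
```

With this approach, a lint that fires on the em dash token would report a three-character span in the source, which may or may not be what downstream suggestions expect.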

Development

Successfully merging this pull request may close these issues.

LaTeX Support