phated commented Oct 9, 2023

This is my initial experimentation to figure out if we can use MenhirSdk and the cmly file it can generate to automatically output a tree-sitter grammar.

It's a little annoying because you need to specify the RegExp pattern for each token in the parser: in our setup that logic lives in the lexer and is handled by sedlex_ppx, whereas tree-sitter generates a combined lexer/parser from JavaScript RegExp patterns 😦.
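To make the token-pattern problem concrete, here is a minimal sketch of the extra table a generator would need. It is written in Python purely for illustration, and the token names and patterns in `TOKEN_PATTERNS` are hypothetical, not taken from the actual lexer. It only emits the terminal rules of a tree-sitter `grammar.js`, and it assumes the patterns contain no unescaped `/`.

```python
import json

# Hypothetical token table: Menhir terminal name -> JavaScript RegExp source.
# These names and patterns are illustrative only; the real ones would have to
# be written by hand to mirror what the sedlex lexer accepts.
TOKEN_PATTERNS = {
    "NUMBER": r"\d+",
    "LIDENT": r"[a-z_][A-Za-z0-9_]*",
}


def render_token_rule(terminal: str, pattern: str) -> str:
    """Render one tree-sitter rule of the form `name: $ => /pattern/,`."""
    # Assumes `pattern` has no unescaped '/' that would end the JS literal.
    return f"    {terminal.lower()}: $ => /{pattern}/,"


def render_grammar(name: str, tokens: dict) -> str:
    """Render a grammar.js skeleton containing only the token rules."""
    lines = [
        "module.exports = grammar({",
        f"  name: {json.dumps(name)},",
        "  rules: {",
    ]
    lines += [render_token_rule(t, p) for t, p in tokens.items()]
    lines += ["  },", "});"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_grammar("grain", TOKEN_PATTERNS))
```

A real generator would also have to emit the nonterminal rules and put the start rule first, since tree-sitter treats the first rule as the entry point.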

If we can actually get this working, we could possibly throw away all our formatter code and just rely on Topiary.

spotandjake commented Oct 5, 2024

I looked into this a bit more, and there seem to be two problems. First, it is really hard to recover the input grammar from the cmly file; what we get instead is closer to the built tables, which are a lot messier. This isn't a major roadblock, but it certainly makes generation a lot harder. A more pressing issue is that, for syntax highlighting, the grammar we currently have isn't really optimal, as tree-sitter ASTs do not deal with precedence in the same way. Some of my work on this can be found here if it is of interest.

Interestingly, a better approach here might be to follow Obelisk: parse the Menhir syntax ourselves and generate the corresponding tree-sitter grammar. We could probably reuse the parser from Obelisk for this. Since this initial work was started we have solved the formatting issue, but tree-sitter would still be nice for editor support in editors like Zed or Neovim.
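As a rough sketch of the translation step, assuming the Menhir rules have already been parsed into plain data (the Obelisk parser is not used here, and the `(symbols, precedence)` shape is a made-up intermediate form), each rule could be rendered with tree-sitter's `choice`, `seq` and `prec.left` combinators. Python is used only for illustration, and treating every precedence annotation as left-associative is a simplifying assumption.

```python
from typing import List, Optional, Tuple

# Hypothetical intermediate form: one alternative of a rule is its list of
# symbols plus an optional precedence level (None when there is none).
Alternative = Tuple[List[str], Optional[int]]


def render_symbol(sym: str) -> str:
    """Refer to another rule the way grammar.js does: `$.name`."""
    return f"$.{sym.lower()}"


def render_alternative(symbols: List[str], level: Optional[int] = None) -> str:
    """Render one alternative, wrapping it in prec.left when a level is given."""
    if not symbols:
        # Menhir allows empty productions; this sketch does not handle them.
        raise ValueError("empty alternatives are not handled in this sketch")
    refs = [render_symbol(s) for s in symbols]
    body = refs[0] if len(refs) == 1 else f"seq({', '.join(refs)})"
    if level is not None:
        # Simplification: every annotated alternative is left-associative.
        body = f"prec.left({level}, {body})"
    return body


def render_rule(lhs: str, alternatives: List[Alternative]) -> str:
    """Render `lhs: $ => ...,` using choice() when there are several alternatives."""
    alts = [render_alternative(syms, lvl) for syms, lvl in alternatives]
    body = alts[0] if len(alts) == 1 else f"choice({', '.join(alts)})"
    return f"{lhs}: $ => {body},"


if __name__ == "__main__":
    print(render_rule("expr", [(["expr", "PLUS", "expr"], 1), (["NUMBER"], None)]))
```

Precedence is where the two formalisms diverge most: Menhir resolves conflicts through token-level declarations, while tree-sitter attaches precedence to individual rules, so a real translation would need to decide per rule how to carry it over.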
