Tree-sitter runtime integration for compare views#3338
Thorium wants to merge 1 commit into WinMerge:feature/tree-sitter
Wire the runtime grammar bundle, compare-view UI, and same-file navigation together so tree-sitter features are actually available in built binaries. This also updates the F# grammar bundle to include tags and disables Go to Definition when the symbol at the current caret position cannot be resolved.
This PR addresses the earlier comment:
Pull request overview
This PR integrates tree-sitter runtime assets and UI wiring so compare views can use tree-sitter features (highlighting, go-to-definition, and tags-based same-file navigation) from built/installed binaries.
Changes:
- Build and package tree-sitter grammar DLLs and bundled query files (`*.scm`) into Release outputs and installers (WiX + Inno Setup).
- Add an Editor option to enable/disable tree-sitter and wire tree-sitter-aware syntax highlighting + Go to Definition (F12) in compare views.
- Extend tree-sitter runtime to support `tags.scm` queries and use tree-sitter for comment-filtering parse contexts in compare engines.
Reviewed changes
Copilot reviewed 28 out of 28 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| Tools/TreeSitterGrammars/grammars.json | Bumps the F# grammar version. |
| Tools/TreeSitterGrammars/build-grammars.ps1 | Adds bundling support for tags.scm query files. |
| Src/resource.h | Adds resource IDs for the tree-sitter option and Go to Definition command. |
| Src/PropEditor.h | Adds backing field for the tree-sitter editor option. |
| Src/PropEditor.cpp | Binds the tree-sitter option to the Editor property page control. |
| Src/OptionsInit.cpp | Initializes OPT_TREE_SITTER default to enabled. |
| Src/OptionsDef.h | Defines OPT_TREE_SITTER option key. |
| Src/MergeEditView.h | Adds handlers for Go to Definition command UI. |
| Src/MergeEditView.cpp | Implements Go to Definition and tree-sitter TextDefinition selection in RefreshOptions. |
| Src/MergeDoc.h | Exposes IsTreeSitterEnabled() and UpdateTreeSitterSupport(). |
| Src/MergeDoc.cpp | Centralizes per-buffer tree-sitter parser/textdef setup and ties it into doc open + option refresh. |
| Src/MergeAppLib.h | Updates the generated header timestamp line. |
| Src/Merge.vcxproj | Builds tree-sitter grammars into Release outputs and copies them for non-Release builds. |
| Src/Merge.rc | Adds menu items, accelerator (F12), and Editor UI checkbox for tree-sitter. |
| Src/DiffWrapper.h | Adds <vector> include to support new filtering helper APIs. |
| Src/DiffWrapper.cpp | Adds tree-sitter-based comment filtering path using parse contexts. |
| Src/CompareEngines/Wrap_DiffUtils.cpp | Creates tree-sitter parse contexts for comment filtering during compare. |
| Src/CompareEngines/FullQuickCompare.cpp | Creates tree-sitter parse contexts for comment filtering in quick compare flows. |
| Installer/WiX/WinMerge.wxs | Adds an install directory for TreeSitterGrammars. |
| Installer/WiX/WinMerge.wixproj | Adds Heat harvesting of the built TreeSitterGrammars directory into the MSI. |
| Installer/WiX/Fragments/Features.wxs | References the harvested tree-sitter grammars component group in the main feature. |
| Installer/WiX/Config.wxi | Adds TreeSitterBuildDir define for packaging paths. |
| Installer/InnoSetup/WinMergeX64NonAdmin.iss | Installs TreeSitterGrammars\*.dll and *.scm. |
| Installer/InnoSetup/WinMergeARM64.is6.iss | Installs TreeSitterGrammars\*.dll and *.scm. |
| Externals/crystaledit/editlib/TreeSitterWrapper.h | Adds diff-parse-context APIs (but currently introduces a String type conflict). |
| Externals/crystaledit/editlib/TreeSitterWrapper.cpp | Implements diff parse-context creation/destruction and comment-position checks. |
| Externals/crystaledit/editlib/TreeSitterParser.h | Adds tags-query support and go-to-definition APIs. |
| Externals/crystaledit/editlib/TreeSitterParser.cpp | Implements tags query parsing, comment-position checks, and definition lookup. |
    <ComponentRef Id="DesktopShortcut" />
    <ComponentGroupRef Id="TreeSitterGrammarsFiles" />
TreeSitterGrammarsFiles is referenced unconditionally here, but it is only generated by the HarvestDirectory item group when Exists('$(TreeSitterBuildDir)') is true in WinMerge.wixproj. If the directory doesn't exist (e.g., clean tree, Debug config, or grammars not built yet), the WiX linker will fail with a missing ComponentGroup. Make the reference conditional (preprocessor/define) or always generate an empty TreeSitterGrammarsFiles fragment (or remove the Exists(...) condition so the build fails earlier with a clearer error).
Suggested change:

    - <ComponentRef Id="DesktopShortcut" />
    - <ComponentGroupRef Id="TreeSitterGrammarsFiles" />
    + <ComponentRef Id="DesktopShortcut" />
    + <?ifdef TreeSitterGrammarsFiles ?>
    + <ComponentGroupRef Id="TreeSitterGrammarsFiles" />
    + <?endif?>
    String text = convertToTString(linbuf[i], linbuf[i + 1]);
    for (int j = 0; j < static_cast<int>(text.size()); ++j)
    {
        const tchar_t ch = text[j];
        const bool isComment = IsTreeSitterCommentPositionForDiff(parseContext, i, j);
        if (!isComment)
        {
            filteredT += ch;
            if (ch != '\r' && ch != '\n')
                allTextIsComment[i - startLine] = false;
        }
GetTreeSitterCommentsFilteredText calls IsTreeSitterCommentPositionForDiff(parseContext, i, j) once per character. That ends up doing a tree-sitter node lookup per char (and string conversions), which will be extremely slow on large files when comment-filtering is enabled. Consider changing this to work on comment ranges (e.g., query all comment nodes once and store byte/char intervals per line) so filtering becomes O(n) over characters without per-char tree-sitter calls.
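A minimal sketch of the range-based approach, assuming the comment nodes have already been collected once per parse and converted to sorted, non-overlapping column intervals for each line. `CommentRange` and `FilterLine` are hypothetical names for illustration and are not part of this PR; `std::wstring` stands in for the project's `String` type.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical half-open column interval [begin, end) covered by a comment on one line.
struct CommentRange
{
    std::size_t begin;
    std::size_t end;
};

// Copies the characters of `line` that fall outside every range in `ranges`.
// Assumes `ranges` is sorted by `begin` and non-overlapping.
// `allComment` is set to false if any kept character is not a line terminator.
std::wstring FilterLine(const std::wstring& line,
                        const std::vector<CommentRange>& ranges,
                        bool& allComment)
{
    std::wstring kept;
    std::size_t pos = 0;
    allComment = true;

    // Appends line[from, to) and records whether real (non-EOL) text survived.
    auto keep = [&](std::size_t from, std::size_t to)
    {
        for (std::size_t k = from; k < to; ++k)
        {
            kept += line[k];
            if (line[k] != L'\r' && line[k] != L'\n')
                allComment = false;
        }
    };

    for (const CommentRange& r : ranges)
    {
        // Clamp to the line length so a stale range cannot read out of bounds.
        const std::size_t b = std::min(r.begin, line.size());
        const std::size_t e = std::min(r.end, line.size());
        if (b > pos)
            keep(pos, b);
        pos = std::max(pos, e);
    }
    keep(pos, line.size());
    return kept;
}
```

Each character is visited once and no tree-sitter lookup happens inside the loop, so the cost per line is linear in its length plus the number of comment ranges on it.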
    pParser->EnsureParsed(pDoc->m_ptBuf[m_nThisPane].get());

    const CEPoint pos = GetCursorPos();
    int nDefLine = 0;
    int nDefChar = 0;
    pCmdUI->Enable(pParser->FindDefinition(pos.y, pos.x, nDefLine, nDefChar));
OnUpdateGotoDefinition runs EnsureParsed(...) and then calls FindDefinition(...) during every command UI update. Command UI updates can run very frequently (idle loop), so this can cause noticeable UI lag on large files (especially if the parser is dirty and reparses). Consider gating this behind cheaper checks and caching the last evaluated cursor position / parser version, only recomputing when the caret moves or the parse becomes dirty.
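One way to sketch the caching idea, assuming the parser can expose some counter that changes whenever the tree is reparsed. That counter, the `DefinitionAvailabilityCache` class, and the `resolve` callback are all hypothetical and not part of this PR; in the real handler `resolve` would wrap the `EnsureParsed` and `FindDefinition` calls.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical cache for the enabled state of the Go to Definition command.
class DefinitionAvailabilityCache
{
public:
    // Returns the cached answer while the caret position and parse version
    // are unchanged; otherwise calls `resolve` once and stores its result.
    bool IsAvailable(int line, int column, std::uint64_t parseVersion,
                     const std::function<bool(int, int)>& resolve)
    {
        if (!m_valid || line != m_line || column != m_column || parseVersion != m_version)
        {
            m_line = line;
            m_column = column;
            m_version = parseVersion;
            m_result = resolve(line, column);
            m_valid = true;
        }
        return m_result;
    }

private:
    bool m_valid = false;
    int m_line = 0;
    int m_column = 0;
    std::uint64_t m_version = 0;
    bool m_result = false;
};
```

With this shape, repeated idle-time command UI updates at the same caret position cost three integer comparisons, and the expensive lookup runs only after the caret moves or the parse version changes.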
    using String = std::basic_string<TCHAR>;
TreeSitterWrapper.h defines using String = std::basic_string<TCHAR> in the global namespace. The project already defines String (e.g., Src/Common/UnicodeString.h), and several TUs include both headers (e.g., Src/DiffWrapper.cpp includes UnicodeString.h before TreeSitterWrapper.h), which will cause a redefinition/compile error. Remove this alias and use the project String type (or put any helper alias in a dedicated namespace with a different name).
Thanks for the quick implementation! That said, it may take some time for validation and refactoring, so I expect merging into master will likely happen after the next stable release.
Summary
Testing
- WinMerge.sln /t:Merge /p:Configuration=Release /p:Platform=x64
- Tools/TreeSitterGrammars output and verify TreeSitterGrammars is present beside WinMergeU.exe