fix: prevent single-tilde strikethrough false positives #3910
diadorer wants to merge 3 commits into markedjs:master
When a closing single ~ is followed by an alphanumeric character, skip the strikethrough match since it fails GFM right-flanking delimiter rules. This prevents text like `~125 GeV ... **~173 GeV**` from being incorrectly parsed as strikethrough.
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue where the Markdown parser incorrectly applied strikethrough formatting to text containing single tildes used for approximation. The change introduces a specific check during tokenization to ensure that single tilde delimiters adhere to GitHub Flavored Markdown (GFM) right-flanking rules, thereby preventing unintended formatting and improving parsing accuracy.
Code Review
This pull request addresses a false positive in single-tilde strikethrough parsing, where tildes used to denote approximation (e.g., ~40) were incorrectly identified as strikethrough delimiters. The fix correctly implements a check based on GFM's right-flanking delimiter rules, and a corresponding test case is added to prevent regressions. The approach is sound, and I have one suggestion to improve the implementation for better Unicode support.
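The review's concrete suggestion is not reproduced in this thread. Purely as an illustration of what "Unicode support" can mean for an "is the next character alphanumeric?" test, the sketch below contrasts an ASCII-only check with one based on Unicode property escapes. The helper names `isAsciiAlnum` and `isUnicodeAlnum` are hypothetical and are not taken from marked's source.

```typescript
// Hypothetical helpers for illustration only; not marked's actual code.

// ASCII-only: treats only A-Z, a-z and 0-9 as alphanumeric.
const asciiAlnum = /^[A-Za-z0-9]$/;

// Unicode-aware: any letter (\p{L}) or number (\p{N}).
// The "u" flag is what enables \p{...} property escapes.
const unicodeAlnum = new RegExp("^[\\p{L}\\p{N}]$", "u");

function isAsciiAlnum(ch: string): boolean {
  return asciiAlnum.test(ch);
}

function isUnicodeAlnum(ch: string): boolean {
  return unicodeAlnum.test(ch);
}

// "\u00e9" is é, "\u0416" is Cyrillic Ж, "\u0663" is the Arabic-Indic digit three.
for (const ch of ["4", "a", "\u00e9", "\u0416", "\u0663", "*", " "]) {
  console.log(JSON.stringify(ch), isAsciiAlnum(ch), isUnicodeAlnum(ch));
}
```

With the ASCII-only check, a closing `~` followed by a non-ASCII letter or digit would not be recognized as "followed by an alphanumeric character"; the Unicode-aware variant treats those characters the same as `a` or `4`.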
Hey Tony!
Great fix! I found this PR because we are seeing a similar issue with another edge case worth considering: tildes appearing in both an approximation prefix and inside a URL text fragment (the
`~Approx date: Event description. [↗](https://example.com/article/#:~:text=Some%20text,more%20text.)`
Here the leading
Hopefully that's helpful. I can also try to open a separate PR if that would be more convenient.
@LouisTrezzini you can run your markdown against this PR with the demo link below to see if it fixes your use case as well.
https://marked-website-git-fork-diadorer-fix-single-til-f83acb-markedjs.vercel.app/demo/
Thanks, this works! Turns out the fix in 17.0.2 was sufficient; we were still on 17.0.0. It might still be worth adding a test case for this, but I'm leaving that to your discretion. Thanks again for the help and quick reply.
Summary
Fix single-tilde (`~`) strikethrough incorrectly matching when tildes are used as "approximately" (e.g. `~40 results`).
Problem
Text like `Each of the ~40 results: **~150 tokens**` was incorrectly parsed as strikethrough because the parser matched the first `~` (before `40`) as an opening delimiter and the second `~` (before `150`) as a closing delimiter.
Root cause
This happens because of the GFM flanking delimiter rules. Here is a step-by-step breakdown for `Each of the ~40 results: **~150 tokens**`:
1. The first tilde (`~40`): preceded by a space and followed by an alphanumeric character (`4`), not by whitespace, so it is a valid left-flanking (opening) delimiter. The parser adds it to the stack and starts looking for a closing tilde.
2. The first asterisks (`**~`): preceded by a space and followed by punctuation (`~`), so they form a valid left-flanking delimiter for strong emphasis.
3. The second tilde (`**~150`): to close the strikethrough, this tilde must be a valid right-flanking delimiter. Per the GFM spec, a right-flanking delimiter cannot be preceded by a punctuation character unless it is also followed by whitespace or punctuation. This tilde is preceded by `*` (punctuation) and followed by `1` (alphanumeric), so it fails the right-flanking test and is not a valid closing delimiter.
4. The final asterisks (`tokens**`): preceded by a letter (`s`) and at the end of the string, so they form a valid right-flanking delimiter, closing the bold emphasis.

Since the second tilde can never close the first, the entire strikethrough match is invalid.
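The breakdown above can be checked mechanically. The following is a minimal sketch of the left- and right-flanking tests as described in the CommonMark/GFM spec, under two stated assumptions: punctuation and whitespace are approximated with ASCII-only classes (the spec defines them over Unicode), and the function name `classifyRun` is hypothetical rather than part of marked.

```typescript
// Sketch only: ASCII approximation of the spec's Unicode character classes.
// Start and end of input count as whitespace, which charAt() gives us as "".
const isWhitespace = (ch: string): boolean => ch === "" || /\s/.test(ch);
const isPunctuation = (ch: string): boolean => /^[!-\/:-@\[-`{-~]$/.test(ch);

interface Flanking {
  left: boolean;  // the run may open
  right: boolean; // the run may close
}

// Classify the delimiter run occupying text[start, end).
function classifyRun(text: string, start: number, end: number): Flanking {
  const before = text.charAt(start - 1); // "" when the run starts the input
  const after = text.charAt(end);        // "" when the run ends the input
  const left =
    !isWhitespace(after) &&
    (!isPunctuation(after) || isWhitespace(before) || isPunctuation(before));
  const right =
    !isWhitespace(before) &&
    (!isPunctuation(before) || isWhitespace(after) || isPunctuation(after));
  return { left, right };
}

const sample = "Each of the ~40 results: **~150 tokens**";
const firstTilde = sample.indexOf("~");
const secondTilde = sample.indexOf("~", firstTilde + 1);

console.log(classifyRun(sample, firstTilde, firstTilde + 1));
console.log(classifyRun(sample, secondTilde, secondTilde + 1));
console.log(classifyRun(sample, secondTilde - 2, secondTilde));
console.log(classifyRun(sample, sample.length - 2, sample.length));
```

Under these assumptions both tildes come out as left-flanking only, so neither can act as a closer, while the two `**` runs are left-flanking and right-flanking respectively.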
Fix
When processing a single-tilde strikethrough match, skip it if the closing `~` is immediately followed by an alphanumeric character, as this indicates it fails the GFM right-flanking delimiter rules.
Test plan
test/specs/new/del_flanking.md/.html
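
To make the rule from the Fix section concrete, here is a rough, self-contained sketch of the guard applied to a candidate single-tilde pair. It is not the PR's diff or marked's tokenizer: `closingTildeLooksInvalid` and `findSingleTildeDel` are hypothetical names, the scan only considers the first two tildes in the input, and the alphanumeric test is ASCII-only.

```typescript
// Hypothetical sketch; not marked's tokenizer and not the code in this PR.

// True when the candidate closing "~" at closeIndex is immediately
// followed by an (ASCII) alphanumeric character.
function closingTildeLooksInvalid(src: string, closeIndex: number): boolean {
  const next = src.charAt(closeIndex + 1); // "" at end of input
  return /^[A-Za-z0-9]$/.test(next);
}

// Return the text between the first two single tildes, or null when
// there is no pair or the guard rejects the candidate closer.
function findSingleTildeDel(src: string): string | null {
  const open = src.indexOf("~");
  if (open === -1) return null;
  const close = src.indexOf("~", open + 1);
  // No second tilde, or "~~" (double tildes are out of scope here).
  if (close === -1 || close === open + 1) return null;
  if (closingTildeLooksInvalid(src, close)) return null;
  return src.slice(open + 1, close);
}

console.log(findSingleTildeDel("Each of the ~40 results: **~150 tokens**")); // null
console.log(findSingleTildeDel("this is ~deleted~ text")); // "deleted"
```

The first input is rejected because the second `~` is followed by `1`; the second input still matches because its closing `~` is followed by a space.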