feat: zero-width patterns #1153
This is a pretty straightforward but likely tedious change. Thanks for doing it!
I resolved all conflicts.
"target_metadata": {
  "verb": {},
  "verb": {
    "tense": "Present"
"Tense" is a syntactic property that only applies once words are in sentences and the sentences have been parsed. Words like "run", "walk", and "eat" can be future if they come after "will", "shall", "going to", or "gonna", or infinitive if they come after "to", etc.
The dictionary and affix system are lexical rather than syntactic. Terms used for this form of verbs are "dictionary form", "citation form", and "lemma"; these three apply to other inflected parts of speech too. For verbs specifically there is also "infinitive", literally "no time", i.e. "not inflected for tense".
How did you make this comment here? Did GitHub mess up?
# For this reason the most important ones to include are the ones that have a noun counterpart.
# Remember not to add any affixes since only the verb part at the start undergoes inflection.

back up/4
I've been thinking about having a special annotation flag for these. It would already be useful in linters: the one I worked on weeks ago for the "backup"/"back up" problem finally got merged today, I think, and it would benefit right away.
But I wasn't sure whether the multi-word part of the dictionary would be accepted. I guess it has been. But that's why I'd put them in a separate part of the dictionary. To make the best use of them we would really want some kind of multi-word awareness in parsing and spell checking, so I hadn't pushed it any further yet.
Description
This PR adds support for zero-width patterns. I need this to implement lookaheads, lookbehinds, and optional patterns (think regex `?`) for #1150.
Implementation-wise, I just changed the return type of `Pattern::matches` from `Option<NonZeroUsize>` to `Option<usize>`.
How Has This Been Tested?
Existing lints serve as tests
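For readers unfamiliar with why the return type matters, here is a minimal sketch of the idea from the description. It is not Harper's actual code: the `Pattern` trait below is a simplified, hypothetical stand-in that matches over `&[&str]` instead of Harper's tokens, and `Word`, `Optional`, `Lookahead`, `Sequence`, and `optional_article_then_cat` are illustrative names only. The point is that with `Option<usize>` a pattern can succeed while consuming nothing (`Some(0)`), which `Option<NonZeroUsize>` cannot express.

```rust
/// Hypothetical, simplified stand-in for a pattern trait (assumption:
/// Harper's real trait works on tokens and source text, not `&[&str]`).
trait Pattern {
    /// Number of tokens matched at the start of `tokens`, or `None` on failure.
    /// `Some(0)` is a successful zero-width match.
    fn matches(&self, tokens: &[&str]) -> Option<usize>;
}

/// Matches exactly one token equal to the given word.
struct Word(&'static str);

impl Pattern for Word {
    fn matches(&self, tokens: &[&str]) -> Option<usize> {
        match tokens.first() {
            Some(t) if *t == self.0 => Some(1),
            _ => None,
        }
    }
}

/// Optional pattern (regex `?`): always succeeds, possibly with width 0.
struct Optional<P>(P);

impl<P: Pattern> Pattern for Optional<P> {
    fn matches(&self, tokens: &[&str]) -> Option<usize> {
        Some(self.0.matches(tokens).unwrap_or(0))
    }
}

/// Lookahead: succeeds only if the inner pattern matches, but consumes nothing.
struct Lookahead<P>(P);

impl<P: Pattern> Pattern for Lookahead<P> {
    fn matches(&self, tokens: &[&str]) -> Option<usize> {
        self.0.matches(tokens).map(|_| 0)
    }
}

/// Runs the child patterns one after another, summing the matched widths.
struct Sequence(Vec<Box<dyn Pattern>>);

impl Pattern for Sequence {
    fn matches(&self, tokens: &[&str]) -> Option<usize> {
        let mut cursor = 0;
        for pattern in &self.0 {
            // A zero-width child simply leaves the cursor where it was.
            cursor += pattern.matches(&tokens[cursor..])?;
        }
        Some(cursor)
    }
}

/// Illustrative pattern: an optional "the" followed by "cat".
fn optional_article_then_cat() -> Sequence {
    Sequence(vec![
        Box::new(Optional(Word("the"))),
        Box::new(Word("cat")),
    ])
}
```

Under these assumptions, `optional_article_then_cat()` matches both `["the", "cat"]` (width 2) and `["cat"]` (width 1), because the optional child is allowed to report a width of zero.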