
regexp: mitigate ReDoS on ASCII inputs with PikeVM path #4698

Open
Flamki wants to merge 3 commits into boa-dev:main from Flamki:fix/regexp-redos-mitigation-4643

Flamki commented Feb 23, 2026

This Pull Request mitigates #4643.

It changes the following:

  • Adds a dedicated non-optimized regress::Regex matcher for PikeVM execution in RegExp objects.
  • Uses PikeVM matching for ASCII inputs in RegExpBuiltinExec, avoiding pathological backtracking in common ReDoS cases such as (a+)+$.
  • Keeps existing UTF-16/UCS-2 backtracking paths for non-ASCII inputs to preserve current encoding semantics.
  • Adds a regression test covering the nested-quantifier ASCII case.

Notes:

  • This is a mitigation path inside Boa while the broader backtracking-budget API is still blocked in regress.
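
To make the `(a+)+$` case concrete, here is a toy sketch rather than the code in this PR or in regress. It hard-codes the pattern, anchors the match at the start of the input, and counts work for two strategies: a recursive backtracking matcher, and a lockstep state-set simulation in the spirit of a Pike VM. The function names `backtrack` and `lockstep` and the step-counting scheme are assumptions made for illustration only.

```rust
/// Toy backtracking matcher for the fixed pattern `(a+)+$`, anchored at `pos`.
/// Every attempt to start a new iteration of the outer group adds one step.
fn backtrack(input: &[u8], pos: usize, steps: &mut u64) -> bool {
    *steps += 1;
    // Length of the run of 'a' bytes starting at `pos`.
    let run = input[pos..].iter().take_while(|&&b| b == b'a').count();
    // Greedy: try the longest inner `a+` first, then progressively shorter ones.
    for len in (1..=run).rev() {
        let next = pos + len;
        if next == input.len() {
            return true; // `$` holds at the end of the input
        }
        // Otherwise try another iteration of the outer `+`.
        if backtrack(input, next, steps) {
            return true;
        }
    }
    false
}

/// Toy lockstep simulation of the same pattern: all live NFA states advance
/// one byte at a time and duplicate states are merged, so work stays linear.
fn lockstep(input: &[u8], steps: &mut u64) -> bool {
    // State 0: expecting the first 'a'. State 1: inside a run of 'a'.
    let mut live = [true, false];
    for &b in input {
        let mut next = [false, false];
        for &alive in live.iter() {
            if alive {
                *steps += 1;
                // Continuing the inner `a+` and restarting the outer group
                // both land in state 1, so they collapse into one thread.
                if b == b'a' {
                    next[1] = true;
                }
            }
        }
        live = next;
    }
    // Accept only if a run of 'a' reaches the end of the input.
    live[1]
}
```

On an input of n `a` bytes followed by a `b`, the backtracking version doubles its step count with each extra `a`, while the lockstep version visits each byte a bounded number of times. That difference is the motivation for routing ASCII inputs to a PikeVM-style matcher.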


github-actions bot commented Feb 23, 2026

Test262 conformance changes

Test result    main count    PR count    difference
Total          52,862        52,862      0
Passed         49,504        49,503      -1
Ignored        2,262         2,262       0
Failed         1,096         1,097       +1
Panics         0             0           0
Conformance    93.65%        93.65%      -0.00%
Broken tests (1):
test/staging/sm/RegExp/unicode-raw.js (previously Passed)

jedel1043 added the bug (Something isn't working), builtins (PRs and Issues related to builtins/intrinsics), and waiting-on-author (Waiting on PR changes from the author) labels on Feb 25, 2026
jedel1043 added this to the v1.0.0 milestone on Feb 25, 2026
Flamki force-pushed the fix/regexp-redos-mitigation-4643 branch from ff48371 to 63d6611 on February 25, 2026 06:43

Flamki commented Feb 25, 2026

Follow-up fix pushed in 63d6611.

Addresses the CI failures from the previous run:

  • Added explicit lifetime args for regress::backends::PikeVMExecutor<'_, '_> at both find call sites (fixes elided-lifetimes-in-paths under -D warnings).
  • Updated doc comments to use backticks around PikeVM (fixes clippy::doc-markdown).
  • Applied rustfmt-compliant line wrapping in the ASCII UTF-16 conversion path.

This should resolve the build/lint/fmt failures that were cascading across jobs.
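
For readers unfamiliar with the elided-lifetimes-in-paths lint, the following is a minimal sketch of the kind of change described above. `Executor` is a hypothetical stand-in with two lifetime parameters; it is not the regress type, and `describe` is an invented function used only to show the signature.

```rust
/// Hypothetical stand-in for an executor type with two lifetime parameters
/// (one for the compiled pattern, one for the input text).
struct Executor<'r, 't> {
    pattern: &'r str,
    text: &'t str,
}

// With `elided_lifetimes_in_paths` denied, writing `&Executor` here would be
// rejected. `'_` makes the elided lifetimes visible without naming them.
#[deny(elided_lifetimes_in_paths)]
fn describe(exec: &Executor<'_, '_>) -> String {
    format!("{} on {}", exec.pattern, exec.text)
}
```

The lint is allow-by-default in rustc, so it only fails a build when a project opts in and treats warnings as errors, as the -D warnings mention above suggests.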


codecov bot commented Feb 26, 2026

Codecov Report

❌ Patch coverage is 60.00000% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.07%. Comparing base (6ddc2b4) to head (63d6611).
⚠️ Report is 688 commits behind head on main.

Files with missing lines                  Patch %    Lines
core/engine/src/builtins/regexp/mod.rs    60.00%     4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4698      +/-   ##
==========================================
+ Coverage   47.24%   57.07%   +9.82%     
==========================================
  Files         476      549      +73     
  Lines       46892    60311   +13419     
==========================================
+ Hits        22154    34421   +12267     
- Misses      24738    25890    +1152     


Flamki force-pushed the fix/regexp-redos-mitigation-4643 branch from 63d6611 to 91d9da4 on February 26, 2026 09:53
Flamki requested a review from a team as a code owner on February 26, 2026 09:53

Flamki commented Feb 26, 2026

@jedel1043 I did a full cleanup pass on #4698 so it avoids the same issues we hit in earlier PRs.

  • Rebased the branch onto latest main, kept the PikeVM ReDoS mitigation logic intact, and preserved the CI/lint fixes (explicit PikeVMExecutor<'_, '_> lifetimes, doc markdown fixes, and rustfmt-compliant formatting).
  • Added one extra regression test covering the ASCII-in-UTF16 execution path, not just the plain ASCII string path, so coverage is better aligned with the new matcher behavior.

Current branch head: 91d9da41.
Ready for CI + maintainer review.
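
As a rough illustration of what an ASCII-in-UTF16 path has to decide, the sketch below checks whether a UTF-16 buffer contains only ASCII code units and, if so, converts it to a `String` that a `&str`-based matcher could consume. The helper name `ascii_from_utf16` is hypothetical and the logic is an assumption about the general approach, not the code in this branch.

```rust
/// Hypothetical helper: returns the input as an ASCII `String` when every
/// UTF-16 code unit is in the ASCII range, and `None` otherwise.
fn ascii_from_utf16(units: &[u16]) -> Option<String> {
    units
        .iter()
        .map(|&u| {
            if u < 0x80 {
                // Lossless: the value fits in 7 bits.
                Some(u as u8 as char)
            } else {
                // Any non-ASCII unit disqualifies the whole input.
                None
            }
        })
        .collect()
}
```

A caller could take the linear-time path when this returns `Some` and fall back to the existing UTF-16/UCS-2 matcher otherwise, which matches the split described in the PR description.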

