Fix CSS selector parsing with tab characters#2269
Fix CSS selector parsing with tab characters#2269TFBW wants to merge 1 commit intomojolicious:mainfrom
Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR fixes CSS selector parsing in Mojo::DOM::CSS to properly handle tab characters in CSS selectors, ensuring tabs are treated consistently with spaces as whitespace separators.
- Updated CSS selector regex patterns to include tab characters (
\t) alongside spaces - Added comprehensive test coverage for various tab character scenarios in selectors
- Fixed inconsistent handling where tabs were incorrectly included in some patterns but excluded from combinators
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| lib/Mojo/DOM/CSS.pm | Updated regex patterns to properly handle tab characters in CSS selector parsing |
| t/mojo/dom.t | Added test cases to verify tab character handling in CSS selectors |
| EOF | ||
| for my $selector ("ul li", "ul\tli", "ul \tli", "ul\t li") { | ||
| is_deeply $dom->find($selector)->map(sub { $_->to_string })->to_array, ['<li>Ax1</li>'], | ||
| 'selector "' . $selector =~ s/\t/\\t/r . '"'; |
There was a problem hiding this comment.
[nitpick] The regex substitution pattern s/\t/\\t/r is confusing with multiple backslashes. Consider using a more readable approach like $selector =~ tr/\t/t/ with 'tab' text or storing the display string separately.
| 'selector "' . $selector =~ s/\t/\\t/r . '"'; | |
| 'selector "' . ($selector =~ tr/\t/t/r =~ s/^/\\/r) . '"'; |
lib/Mojo/DOM/CSS.pm
Outdated
| return _sibling($selectors, $current, $tree, $scope, 1, ++$pos) if $c eq '+'; | ||
|
|
||
| # " " (ancestor) | ||
| # " " or "\t" (ancestor) |
t/mojo/dom.t
Outdated
| like $@, qr/Unknown CSS selector: p\[/, 'right error'; | ||
| }; | ||
|
|
||
| # Github Issue #2024 |
There was a problem hiding this comment.
That comment serves no real purpose
|
Please squash commits. |
|
There's two commits still. |
|
Clearly, merging up main was a bad idea and my git-fu is weak. |
|
Just rebase on main and force push. |
|
Might have to sacrifice a goat just to be sure, but I'll try. |
Summary
Fixes Mojo::DOM::CSS not handling certain tab characters correctly.
Motivation
Tabs were being incorrectly included in the Class/ID pattern and Tag pattern, and not included in the Combinator pattern.
References
Fixes #2024