Closed
39 commits
80c5983
Add test setup for running Core tests.
dmsnell Jun 16, 2024
0e1a917
Add quirks mode parameter to create_fragment
sirreal Jul 18, 2024
5267774
Add standards mode tests for class name case sensitivity
sirreal Jul 19, 2024
ed8f34f
Implement has_class
sirreal Jul 19, 2024
f2fa469
Add test with null bytes in class attribute
sirreal Jul 19, 2024
c103410
Remove lower-casing behavior of class_list
sirreal Jul 19, 2024
1435cf9
Update to use document_mode
sirreal Aug 12, 2024
4f18368
Replace null bytes in class_list class names
sirreal Aug 13, 2024
2540406
Fix tests
sirreal Aug 13, 2024
9be0a32
Handle all the comparison stuff with a protected comparable_class_nam…
sirreal Aug 13, 2024
cd43ef9
Lint
sirreal Aug 13, 2024
00e8aff
Improve phpdoc explanations
sirreal Aug 13, 2024
eb07339
Remove comment about styling (HTML structure is affected)
sirreal Aug 13, 2024
a9e924d
Revert "Replace null bytes in class_list class names"
sirreal Aug 13, 2024
1dc1752
Remove null-byte has_class test
sirreal Aug 13, 2024
8dc79bc
Merge branch 'trunk' into html-api/css-class-name-method-audit
sirreal Aug 23, 2024
be6091b
Revert ::create_fragment changes
sirreal Aug 23, 2024
07726b1
Adjust tests to use full parser in quirks/no-quirks
sirreal Aug 23, 2024
ec05abb
Move quirks mode to tag processor
sirreal Aug 23, 2024
176604c
Make comparable_class_name internal
sirreal Aug 23, 2024
bb6d772
Add information about how quirks mode changes behavior
sirreal Aug 26, 2024
15d479e
Remove comparable_class_name function call from has_class loop
sirreal Aug 26, 2024
8dbf6ab
Remove comparable_class_name method entirely
sirreal Aug 26, 2024
926e7d6
Add another test for quirks-mode add_class
sirreal Aug 26, 2024
f1f4224
Remove since tag from class_list
sirreal Aug 26, 2024
559315d
Lowerclass yielded class_list class names in quirks mode
sirreal Sep 2, 2024
df91415
Fix modifying class case when removing another class
sirreal Sep 2, 2024
21534f9
Merge branch 'trunk' into html-api/css-class-name-method-audit
sirreal Sep 3, 2024
0eaa9af
Fix equals sign alignment lint
sirreal Sep 3, 2024
3eaecd1
Merge branch 'trunk' into html-api/css-class-name-method-audit
dmsnell Sep 3, 2024
ab02ca4
Reintroduce an explanatory comment on `compat_mode` property
dmsnell Sep 3, 2024
cad5d62
Revert: Have `has_class()` return `null` for unsupported tokens.
dmsnell Sep 3, 2024
62cbb1d
Truncate explanatory comment on quirks mode constants and cite MDN.
dmsnell Sep 3, 2024
8217303
Modify tests
dmsnell Sep 3, 2024
b417e11
Preserve given casing of added and removed class names.
dmsnell Sep 4, 2024
618eec5
Merge branch 'trunk' into html-api/css-class-name-method-audit
dmsnell Sep 4, 2024
4c8db57
Merge branch 'test-everything' into html-api/css-class-name-method-audit
dmsnell Sep 4, 2024
48e00c9
Fix issue with classname updates, and change test to assert it.
dmsnell Sep 4, 2024
b099026
Remove test helpers accidentally added (can't force-push)
dmsnell Sep 4, 2024
49 changes: 0 additions & 49 deletions src/wp-includes/html-api/class-wp-html-processor-state.php
@@ -299,31 +299,6 @@ class WP_HTML_Processor_State {
*/
const INSERTION_MODE_AFTER_AFTER_FRAMESET = 'insertion-mode-after-after-frameset';

- /**
- * No-quirks mode document compatibility mode.
- *
- * > In no-quirks mode, the behavior is (hopefully) the desired behavior
- * > described by the modern HTML and CSS specifications.
- *
- * @since 6.7.0
- *
- * @var string
- */
- const NO_QUIRKS_MODE = 'no-quirks-mode';
-
- /**
- * Quirks mode document compatibility mode.
- *
- * > In quirks mode, layout emulates behavior in Navigator 4 and Internet
- * > Explorer 5. This is essential in order to support websites that were
- * > built before the widespread adoption of web standards.
- *
- * @since 6.7.0
- *
- * @var string
- */
- const QUIRKS_MODE = 'quirks-mode';
-
/**
* The stack of template insertion modes.
*
@@ -381,30 +356,6 @@ class WP_HTML_Processor_State {
*/
public $insertion_mode = self::INSERTION_MODE_INITIAL;

- /**
- * Indicates if the document is in quirks mode or no-quirks mode.
- *
- * Impact on HTML parsing:
- *
- * - In `NO_QUIRKS_MODE` CSS class and ID selectors match in a byte-for-byte
- * manner, otherwise for backwards compatibility, class selectors are to
- * match in an ASCII case-insensitive manner.
- *
- * - When not in `QUIRKS_MODE`, a TABLE start tag implicitly closes an open P tag
- * if one is in scope and open, otherwise the TABLE becomes a child of the P.
- *
- * `QUIRKS_MODE` impacts many styling-related aspects of an HTML document, but
- * none of the other changes modifies how the HTML is parsed or selected.
Comment on lines -396 to -397
Member Author

The lines immediately above this about P > TABLE handling in quirks/no-quirks modes seem contradictory. That's directly related to tree-construction.

Member

I've updated the wording. It was indeed self-contradictory.

- *
- * @see self::QUIRKS_MODE
- * @see self::NO_QUIRKS_MODE
- *
- * @since 6.7.0
- *
- * @var string
- */
- public $document_mode = self::NO_QUIRKS_MODE;
-
/**
* Context node initializing fragment parser, if created as a fragment parser.
*
10 changes: 7 additions & 3 deletions src/wp-includes/html-api/class-wp-html-processor.php
@@ -1080,7 +1080,7 @@ private function step_initial(): bool {
case 'html':
$doctype = $this->get_doctype_info();
if ( null !== $doctype && 'quirks' === $doctype->indicated_compatability_mode ) {
- $this->state->document_mode = WP_HTML_Processor_State::QUIRKS_MODE;
+ $this->compat_mode = WP_HTML_Tag_Processor::QUIRKS_MODE;
}

/*
@@ -1095,7 +1095,7 @@ private function step_initial(): bool {
* > Anything else
*/
initial_anything_else:
- $this->state->document_mode = WP_HTML_Processor_State::QUIRKS_MODE;
+ $this->compat_mode = WP_HTML_Tag_Processor::QUIRKS_MODE;
$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML;
return $this->step( self::REPROCESS_CURRENT_NODE );
}
@@ -2448,7 +2448,7 @@ private function step_in_body(): bool {
* > has a p element in button scope, then close a p element.
*/
if (
- WP_HTML_Processor_State::QUIRKS_MODE !== $this->state->document_mode &&
+ WP_HTML_Tag_Processor::QUIRKS_MODE !== $this->compat_mode &&
$this->state->stack_of_open_elements->has_p_in_button_scope()
) {
$this->close_a_p_element();
@@ -4938,6 +4938,10 @@ public function remove_class( $class_name ): bool {
*
* @since 6.6.0 Subclassed for the HTML Processor.
*
+ * @todo When reconstructing active formatting elements with attributes, find a way
+ * to indicate if the virtually-reconstructed formatting elements contain the
+ * wanted class name.
+ *
* @param string $wanted_class Look for this CSS class name, ASCII case-insensitive.
* @return bool|null Whether the matched tag contains the given class name, or null if not matched.
*/
Member

@sirreal I've reverted this change so we can consider it separately. I know it will be important during active format reconstruction, but I think that null is a kind of partially-implemented escape hatch that communicates that this isn't supported rather than indicating that a class definitively doesn't exist on the tag.

167 changes: 142 additions & 25 deletions src/wp-includes/html-api/class-wp-html-tag-processor.php
@@ -511,6 +511,32 @@ class WP_HTML_Tag_Processor {
*/
protected $parser_state = self::STATE_READY;

+ /**
+ * Indicates if the document is in quirks mode or no-quirks mode.
+ *
+ * Impact on HTML parsing:
+ *
+ * - In `NO_QUIRKS_MODE` (also known as "standards mode"):
+ * - CSS class and ID selectors match byte-for-byte (case-sensitively).
+ * - A TABLE start tag `<table>` implicitly closes any open `P` element.
+ *
+ * - In `QUIRKS_MODE`:
+ * - CSS class and ID selectors match in an ASCII case-insensitive manner.
+ * - A TABLE start tag `<table>` opens a `TABLE` element as a child of a `P`
+ * element if one is open.
+ *
+ * Quirks and no-quirks mode are thus mostly about styling, but have an impact when
+ * tables are found inside paragraph elements.
+ *
+ * @see self::QUIRKS_MODE
+ * @see self::NO_QUIRKS_MODE
+ *
+ * @since 6.7.0
+ *
+ * @var string
+ */
+ protected $compat_mode = self::NO_QUIRKS_MODE;
Member

Can we not bring over the previous comment here, and update it with the correction to the contradictory statement you mentioned?

Member Author

Is bb6d772 sufficient or do you think there should be more information on the property?

Member

having it on the constants is a good requirement. I can still see value in having at least a summary here, though I know we're not consistent everywhere in this. what I'm thinking about is someone hovering over the $compat_mode property and seeing documentation appear inline. for them, having a basic summary could help quickly inform them of whether or not they need to care about this. some IDEs do make it easy to follow the @see links, but not all do, and source code readers don't.

let me stew on it. maybe I can work at it some as well

Member

I've brought the comment over and updated the wording to make it clearer.

wanted to copy it here so that when people hover over compat_mode they get an instant clue on what it is and how it affects things
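
The class-name half of the `$compat_mode` docblock can be illustrated with a standalone sketch. It loosely mirrors the `class_list()` change in this diff (null bytes replaced, names lowercased only in quirks mode, duplicates skipped). The function name `class_list_sketch`, its `bool $is_quirks` parameter, and the whitespace set used here are assumptions made for illustration and are not part of the HTML API.

```php
<?php
/**
 * Hypothetical sketch: yields each unique class name found in a raw `class`
 * attribute value. Names are lowercased only when $is_quirks is true.
 */
function class_list_sketch( string $class, bool $is_quirks ): Generator {
	$whitespace = " \t\f\r\n"; // Assumed set of boundary characters.
	$seen       = array();
	$at         = 0;
	$end        = strlen( $class );

	while ( $at < $end ) {
		// Skip past any boundary characters.
		$at += strspn( $class, $whitespace, $at );
		if ( $at >= $end ) {
			return;
		}

		// Find the end of the class name and normalize it.
		$length = strcspn( $class, $whitespace, $at );
		$name   = str_replace( "\x00", "\u{FFFD}", substr( $class, $at, $length ) );
		if ( $is_quirks ) {
			$name = strtolower( $name );
		}
		$at += $length;

		// Yield each distinct name only once.
		if ( in_array( $name, $seen, true ) ) {
			continue;
		}
		$seen[] = $name;
		yield $name;
	}
}

// No-quirks mode keeps the three casings apart; quirks mode folds them into one.
var_dump( iterator_to_array( class_list_sketch( 'Wide wide WIDE', false ), false ) );
var_dump( iterator_to_array( class_list_sketch( 'Wide wide WIDE', true ), false ) );
```

With the input above, the no-quirks call yields `Wide`, `wide` and `WIDE`, while the quirks call yields only `wide`.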


/**
* Indicates whether the parser is inside foreign content,
* e.g. inside an SVG or MathML element.
@@ -1155,6 +1181,8 @@ public function class_list() {

$seen = array();

+ $is_quirks = self::QUIRKS_MODE === $this->compat_mode;

$at = 0;
while ( $at < strlen( $class ) ) {
// Skip past any initial boundary characters.
@@ -1169,13 +1197,11 @@ public function class_list() {
return;
}

- /*
- * CSS class names are case-insensitive in the ASCII range.
- *
- * @see https://www.w3.org/TR/CSS2/syndata.html#x1
- */
- $name = str_replace( "\x00", "\u{FFFD}", strtolower( substr( $class, $at, $length ) ) );
- $at += $length;
+ $name = str_replace( "\x00", "\u{FFFD}", substr( $class, $at, $length ) );
+ if ( $is_quirks ) {
+ $name = strtolower( $name );
+ }
+ $at += $length;

/*
* It's expected that the number of class names for a given tag is relatively small.
@@ -1205,10 +1231,14 @@ public function has_class( $wanted_class ): ?bool {
return null;
}

- $wanted_class = strtolower( $wanted_class );
+ $case_insensitive = self::QUIRKS_MODE === $this->compat_mode;

+ $wanted_length = strlen( $wanted_class );
foreach ( $this->class_list() as $class_name ) {
- if ( $class_name === $wanted_class ) {
+ if (
+ strlen( $class_name ) === $wanted_length &&
+ 0 === substr_compare( $class_name, $wanted_class, 0, strlen( $wanted_class ), $case_insensitive )
+ ) {
return true;
}
}
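
The comparison introduced in `has_class()` pairs a length check with `substr_compare()`. A minimal standalone sketch of that pattern follows. The helper name `class_names_match` is hypothetical and exists only for this example.

```php
<?php
/**
 * Hypothetical helper (not part of the HTML API): compares two class names
 * byte-for-byte, or case-insensitively when $case_insensitive is true.
 */
function class_names_match( string $class_name, string $wanted_class, bool $case_insensitive ): bool {
	$wanted_length = strlen( $wanted_class );

	// substr_compare() only looks at the first $wanted_length bytes here, so the
	// explicit length check is what rejects "button-primary" when "button" is wanted.
	return (
		strlen( $class_name ) === $wanted_length &&
		0 === substr_compare( $class_name, $wanted_class, 0, $wanted_length, $case_insensitive )
	);
}

var_dump( class_names_match( 'Button', 'button', false ) );        // bool(false)
var_dump( class_names_match( 'Button', 'button', true ) );         // bool(true)
var_dump( class_names_match( 'button-primary', 'button', true ) ); // bool(false)
```

Passing the mode as a flag means the quirks and no-quirks cases share one code path and no lowercased copy of either string is needed.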
@@ -2296,6 +2326,23 @@ private function class_name_updates_to_attributes_updates(): void {
*/
$modified = false;

+ $seen = array();
+ $to_remove = array();
+ $is_quirks = self::QUIRKS_MODE === $this->compat_mode;
+ if ( $is_quirks ) {
+ foreach ( $this->classname_updates as $updated_name => $action ) {
+ if ( self::REMOVE_CLASS === $action ) {
+ $to_remove[] = strtolower( $updated_name );
+ }
+ }
+ } else {
+ foreach ( $this->classname_updates as $updated_name => $action ) {
+ if ( self::REMOVE_CLASS === $action ) {
+ $to_remove[] = $updated_name;
+ }
+ }
+ }

// Remove unwanted classes by only copying the new ones.
$existing_class_length = strlen( $existing_class );
while ( $at < $existing_class_length ) {
@@ -2311,25 +2358,23 @@ private function class_name_updates_to_attributes_updates(): void {
break;
}

- $name = substr( $existing_class, $at, $name_length );
- $at += $name_length;
-
- // If this class is marked for removal, start processing the next one.
- $remove_class = (
- isset( $this->classname_updates[ $name ] ) &&
- self::REMOVE_CLASS === $this->classname_updates[ $name ]
- );
+ $name = substr( $existing_class, $at, $name_length );
+ $comparable_class_name = $is_quirks ? strtolower( $name ) : $name;
+ $at += $name_length;

- // If a class has already been seen then skip it; it should not be added twice.
- if ( ! $remove_class ) {
- $this->classname_updates[ $name ] = self::SKIP_CLASS;
+ // If this class is marked for removal, remove it and move on to the next one.
+ if ( in_array( $comparable_class_name, $to_remove, true ) ) {
+ $modified = true;
+ continue;
}

- if ( $remove_class ) {
- $modified = true;
+ // If a class has already been seen then skip it; it should not be added twice.
+ if ( in_array( $comparable_class_name, $seen, true ) ) {
continue;
}

+ $seen[] = $comparable_class_name;

/*
* Otherwise, append it to the new "class" attribute value.
*
@@ -2350,7 +2395,8 @@ private function class_name_updates_to_attributes_updates(): void {

// Add new classes by appending those which haven't already been seen.
foreach ( $this->classname_updates as $name => $operation ) {
- if ( self::ADD_CLASS === $operation ) {
+ $comparable_name = $is_quirks ? strtolower( $name ) : $name;
+ if ( self::ADD_CLASS === $operation && ! in_array( $comparable_name, $seen, true ) ) {
$modified = true;

$class .= strlen( $class ) > 0 ? ' ' : '';
@@ -3932,8 +3978,29 @@ public function add_class( $class_name ): bool {
return false;
}

- $this->classname_updates[ $class_name ] = self::ADD_CLASS;
+ if ( self::QUIRKS_MODE !== $this->compat_mode ) {
+ $this->classname_updates[ $class_name ] = self::ADD_CLASS;
+ return true;
+ }
+
+ /*
+ * Because class names are matched ASCII-case-insensitively in quirks mode,
+ * this needs to see if a case variant of the given class name is already
+ * enqueued and update that existing entry, if so. This picks the casing of
+ * the first-provided class name for all lexical variations.
+ */
+ $class_name_length = strlen( $class_name );
+ foreach ( $this->classname_updates as $updated_name => $action ) {
+ if (
+ strlen( $updated_name ) === $class_name_length &&
+ 0 === substr_compare( $updated_name, $class_name, 0, $class_name_length, true )
+ ) {
+ $this->classname_updates[ $updated_name ] = self::ADD_CLASS;
+ return true;
+ }
+ }

+ $this->classname_updates[ $class_name ] = self::ADD_CLASS;
return true;
}

@@ -3953,10 +4020,29 @@ public function remove_class( $class_name ): bool {
return false;
}

- if ( null !== $this->tag_name_starts_at ) {
+ if ( self::QUIRKS_MODE !== $this->compat_mode ) {
$this->classname_updates[ $class_name ] = self::REMOVE_CLASS;
+ return true;
}

+ /*
+ * Because class names are matched ASCII-case-insensitively in quirks mode,
+ * this needs to see if a case variant of the given class name is already
+ * enqueued and update that existing entry, if so. This picks the casing of
+ * the first-provided class name for all lexical variations.
+ */
+ $class_name_length = strlen( $class_name );
+ foreach ( $this->classname_updates as $updated_name => $action ) {
+ if (
+ strlen( $updated_name ) === $class_name_length &&
+ 0 === substr_compare( $updated_name, $class_name, 0, $class_name_length, true )
+ ) {
+ $this->classname_updates[ $updated_name ] = self::REMOVE_CLASS;
+ return true;
+ }
+ }
+
+ $this->classname_updates[ $class_name ] = self::REMOVE_CLASS;
return true;
}
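
The quirks-mode bookkeeping shared by `add_class()` and `remove_class()` can be sketched in isolation: a later request whose name differs only in casing overwrites the existing queue entry and keeps the first-provided casing. The function `enqueue_class_update`, the plain array standing in for `classname_updates`, and the `'add'`/`'remove'` action strings are all assumptions made for this example.

```php
<?php
/**
 * Hypothetical stand-in for the queue of pending class updates.
 * Returns the updated queue instead of mutating object state.
 */
function enqueue_class_update( array $updates, string $class_name, string $action, bool $is_quirks ): array {
	if ( $is_quirks ) {
		$length = strlen( $class_name );
		foreach ( $updates as $updated_name => $unused_action ) {
			// PHP turns numeric-looking array keys into integers; compare as strings.
			$updated_name = (string) $updated_name;
			if (
				strlen( $updated_name ) === $length &&
				0 === substr_compare( $updated_name, $class_name, 0, $length, true )
			) {
				// A case variant is already queued: reuse its key and casing.
				$updates[ $updated_name ] = $action;
				return $updates;
			}
		}
	}

	$updates[ $class_name ] = $action;
	return $updates;
}

$quirks = enqueue_class_update( array(), 'Hero', 'add', true );
$quirks = enqueue_class_update( $quirks, 'HERO', 'remove', true );
var_dump( $quirks ); // One entry: 'Hero' => 'remove'.

$strict = enqueue_class_update( array(), 'Hero', 'add', false );
$strict = enqueue_class_update( $strict, 'HERO', 'remove', false );
var_dump( $strict ); // Two entries: 'Hero' => 'add', 'HERO' => 'remove'.
```

The linear scan only runs in quirks mode, so the no-quirks path stays a single array write.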

@@ -4350,6 +4436,37 @@ public function get_doctype_info(): ?WP_HTML_Doctype_Info {
*/
const COMMENT_AS_INVALID_HTML = 'COMMENT_AS_INVALID_HTML';

+ /**
+ * No-quirks mode document compatibility mode.
+ *
+ * > In no-quirks mode, the behavior is (hopefully) the desired behavior
+ * > described by the modern HTML and CSS specifications.
Member

@dmsnell, Aug 24, 2024

whence comes this quote? and what does "hopefully" communicate to someone trying to figure out the purpose of this mode?

Member

even though the Tag Processor doesn't parse HTML differently because of these, it would be nice to still discuss the impact of TABLE elements closing an open P. these constants will still provide hover-documentation inside the HTML Processor code, and could inform someone, who is poking around and reading the docs, that this has such an impact.

Member Author

> whence comes this quote? and what does "hopefully" communicate to someone trying to figure out the purpose of this mode?

This PR moves this, it was introduced in Processor State in #6972 / [58779]. I'm not sure where it came from originally.

> even though the Tag Processor doesn't parse HTML differently because of these, it would be nice to still discuss the impact of TABLE elements closing an open P. these constants will still provide hover-documentation inside the HTML Processor code, and could inform someone, who is poking around and reading the docs, that this has such an impact.

I struggled with how to document different behavior without being confusing or too lengthy. I'll add some notes again and see what you think.

Member Author

See bb6d772 for notes about quirks-mode.

Member

Found it; this is why I need review and scrutiny on my own contributions. 🙃

https://developer.mozilla.org/en-US/docs/Web/HTML/Quirks_Mode_and_Standards_Mode

+ *
+ * @see self::$compat_mode
+ * @see https://developer.mozilla.org/en-US/docs/Web/HTML/Quirks_Mode_and_Standards_Mode
+ *
+ * @since 6.7.0
+ *
+ * @var string
+ */
+ const NO_QUIRKS_MODE = 'no-quirks-mode';
+
+ /**
+ * Quirks mode document compatibility mode.
+ *
+ * > In quirks mode, layout emulates behavior in Navigator 4 and Internet
+ * > Explorer 5. This is essential in order to support websites that were
+ * > built before the widespread adoption of web standards.
+ *
+ * @see self::$compat_mode
+ * @see https://developer.mozilla.org/en-US/docs/Web/HTML/Quirks_Mode_and_Standards_Mode
+ *
+ * @since 6.7.0
+ *
+ * @var string
+ */
+ const QUIRKS_MODE = 'quirks-mode';

/**
* Indicates that a span of text may contain any combination of significant
* kinds of characters: NULL bytes, whitespace, and others.