[wp-trac] [WordPress Trac] #64054: HTML API: Attribute escaping should escape all HTML entities
WordPress Trac
noreply at wordpress.org
Tue Sep 30 11:01:02 UTC 2025
#64054: HTML API: Attribute escaping should escape all HTML entities
--------------------------+-----------------------------
Reporter: jonsurrell | Owner: (none)
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: HTML API | Version: 6.2
Severity: normal | Keywords:
Focuses: |
--------------------------+-----------------------------
Attribute values set with the HTML API method `set_attribute()`
[https://core.trac.wordpress.org/browser/tags/6.8.2/src/wp-includes/html-api/class-wp-html-tag-processor.php#L3884 are escaped with] `esc_attr()`.
That function avoids "double encoding" things that look like HTML
character references.
The HTML API should encode whatever it receives, and apply "double
encoding." The HTML API expects to receive plain string inputs and manage
any necessary encoding itself. The fact that "double encoding" is disabled
violates this expectation and makes it difficult to correctly set
attribute values that contain sequences that appear to be HTML character
references.
By contrast, `set_modifiable_text()` does not rely on `esc_html()`
[https://core.trac.wordpress.org/browser/tags/6.8.2/src/wp-includes/html-api/class-wp-html-tag-processor.php#L3697 and uses] `htmlspecialchars()`
directly. It will encode HTML character references as expected.
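The two behaviors can be illustrated with plain `htmlspecialchars()`, whose fourth parameter (`$double_encode`) controls whether text that already looks like a character reference is encoded again. This is only a sketch: it assumes `esc_attr()` behaves like the `false` case for this particular input, and it does not call any WordPress code.

{{{#!php
<?php
$input = '&amp;';

// Double encoding disabled: text that already looks like a
// character reference is passed through unchanged.
$without_double_encoding = htmlspecialchars( $input, ENT_QUOTES, 'UTF-8', false );

// Double encoding enabled (PHP's default): every `&` is encoded,
// so the plain-text input survives a later decode.
$with_double_encoding = htmlspecialchars( $input, ENT_QUOTES, 'UTF-8', true );

echo $without_double_encoding, "\n"; // &amp;
echo $with_double_encoding, "\n";    // &amp;amp;
}}}

Only the second form decodes back to the original five-character input.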
The text `&amp;` appears to be an encoded character reference:
{{{#!php
<?php
$amp_text = '&amp;';
$p = WP_HTML_Processor::create_fragment( '<p>x</p>' );
$p->next_tag();
$p->set_attribute( 'data-attr', $amp_text );
$p->next_token();
$p->set_modifiable_text( $amp_text );
echo $p->get_updated_html();
}}}
This prints the following HTML:
{{{#!xml
<p data-attr="&amp;">&amp;amp;</p>
}}}
Notice how the input text is treated differently between an attribute and
text. The HTML encoding of the `&` character is the same in both contexts.
The attribute has the ''value'' `&` instead of the expected `&amp;`. The
text node in the P element correctly renders `&amp;` as expected.
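To see why the attribute ends up with the wrong value, it may help to decode both serialized forms the way an HTML parser would. The sketch below uses PHP's `html_entity_decode()` as a stand-in for the parser's character reference decoding. That is an approximation for illustration, not the HTML API's own decoder, and the variable names are hypothetical.

{{{#!php
<?php
$input = '&amp;'; // The plain string passed to both setters.

// Serialized forms taken from the ticket's example output.
$attribute_html = '&amp;';     // Written by set_attribute().
$text_html      = '&amp;amp;'; // Written by set_modifiable_text().

// Decode one level of character references, as a parser would.
$flags           = ENT_QUOTES | ENT_HTML5;
$attribute_value = html_entity_decode( $attribute_html, $flags, 'UTF-8' );
$text_value      = html_entity_decode( $text_html, $flags, 'UTF-8' );

var_dump( $attribute_value === $input ); // bool(false): the value is `&`.
var_dump( $text_value === $input );      // bool(true): the input round-trips.
}}}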
[https://playground.wordpress.net/php-
playground.html#eyJjb2RlIjoiPHN0eWxlPlxucFtkYXRhLWF0dHJdIHtcbiAgd2hpdGUtc3BhY2U6IHByZTtcbiAgJjo6YWZ0ZXIgeyBjb250ZW50OiAnXFxBIEF0dHJpYnV0ZSB2YWx1ZTogXCInIGF0dHIoIGRhdGEtYXR0ciApICdcIic7IH1cbn1cbjwvc3R5bGU+XG48cD5UaGUgZm9sbG93aW5nIGlzIGV4cGVjdGVkIHRvIGRpc3BsYXkgdGhlIHRleHRcbjxiPjxjb2RlPiZhbXA7YW1wOzwvY29kZT48L2I+IGluIGJvdGggY2FzZXMuPC9wPlxuXG48cD5GaXJzdCwgdGhlIEhUTUwgcHJvY2Vzc29yOjwvcD5cblxuPD9waHBcbnJlcXVpcmUgJy93b3JkcHJlc3Mvd3AtbG9hZC5waHAnO1xuXG4kYW1wX3RleHQgPSAnJmFtcDsnO1xuJHAgPSBXUF9IVE1MX1Byb2Nlc3Nvcjo6Y3JlYXRlX2ZyYWdtZW50KCc8cD54PC9wPicpO1xuJHAtPm5leHRfdGFnKCk7XG4kcC0+c2V0X2F0dHJpYnV0ZSgnZGF0YS1hdHRyJywgJGFtcF90ZXh0KTtcbiRwLT5uZXh0X3Rva2VuKCk7XG4kcC0+c2V0X21vZGlmaWFibGVfdGV4dChcIkhUTUwgdGV4dDogXFxcInskYW1wX3RleHR9XFxcIlwiKTtcbmVjaG8gJHAtPmdldF91cGRhdGVkX2h0bWwoKTtcbj8+XG48YnI+XG5BbmQgYWdhaW4gd2l0aCB0aGUgdGFnIHByb2Nlc3Nvcjpcbjw/cGhwXG4kcCA9IG5ldyBXUF9IVE1MX1RhZ19Qcm9jZXNzb3IoJzxwPng8L3A+Jyk7XG4kcC0+bmV4dF90YWcoKTtcbiRwLT5zZXRfYXR0cmlidXRlKCdkYXRhLWF0dHInLCAkYW1wX3RleHQpO1xuJHAtP
m5leHRfdG9rZW4oKTtcbiRwLT5zZXRfbW9kaWZpYWJsZV90ZXh0KFwiSFRNTCB0ZXh0OiBcXFwieyRhbXBfdGV4dH1cXFwiXCIpO1xuZWNobyAkcC0+Z2V0X3VwZGF0ZWRfaHRtbCgpO1xuIiwicGhwIjoiOC40Iiwid3AiOiI2LjgifQ==
Here's a demo of the difference in behavior between setting attributes and
modifiable text.]
--
Ticket URL: <https://core.trac.wordpress.org/ticket/64054>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform