[wp-trac] [WordPress Trac] #63724: HTML API: Reliably parse HTML attributes in `wp_kses_hair()`

WordPress Trac noreply at wordpress.org
Sat Jan 10 21:45:25 UTC 2026


#63724: HTML API: Reliably parse HTML attributes in `wp_kses_hair()`
----------------------------------------------------+----------------------
 Reporter:  dmsnell                                 |       Owner:  dmsnell
     Type:  enhancement                             |      Status:  closed
 Priority:  normal                                  |   Milestone:  7.0
Component:  HTML API                                |     Version:  6.9
 Severity:  normal                                  |  Resolution:  fixed
 Keywords:  has-patch has-unit-tests needs-refresh  |     Focuses:
----------------------------------------------------+----------------------
Changes (by dmsnell):

 * status:  assigned => closed
 * resolution:   => fixed


Comment:

 In [changeset:"61467" 61467]:
 {{{
 #!CommitTicketReference repository="" revision="61467"
 HTML API: Refactor `wp_kses_hair()` for spec-compliance.

 `wp_kses_hair()` is built around an impressive state machine for parsing
 the span of text between an HTML tag name and the tag’s closing `>` into
 a structured representation of the attributes. Unfortunately that parsing
 code doesn’t comply with the HTML Living Standard and is prone to
 misparsing attributes, particularly in the presence of malformed inputs.

 This patch replaces the existing state machine with the spec-compliant
 parsing from the HTML API. Backed by a comprehensive test suite covering
 attribute parsing, `wp_kses_hair()` gains the same reliability the Tag
 Processor affords, along with guarantees not previously available in
 Core:

  - All attribute values are reported fully-normalized, where character
 references are decoded and then re-encoded in a predictable manner. Only
 the “big five” syntax characters (“&<>'"”) will remain encoded, and in
 their named forms.
  - All `whole` values are fully normalized and presented either as boolean
 attributes without a value, or with double-quoted attribute values.
  - All attributes and their values will be properly parsed according to
 how a browser would parse them, bringing agreement between the server and
 user agents.

 Developed in https://github.com/WordPress/wordpress-develop/pull/9248
 Discussed in https://core.trac.wordpress.org/ticket/63724

 Props adamziel, dmsnell, jonsurrell, jorbin, westonruter.
 Fixes #63724.
 }}}
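
 The normalization described in the first two guarantees can be sketched
 roughly as follows. This is an illustration in Python, not the WordPress
 implementation: `normalize_value` and `whole` are hypothetical helper
 names, `&apos;` is assumed to be the named form used for the apostrophe,
 and Python's `html.unescape` stands in for the HTML API's decoder, so
 attribute-specific rules for ambiguous ampersands may differ.

```python
import html

# Assumed mapping: the five syntax characters and their named references.
BIG_FIVE = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    "'": "&apos;",
    '"': "&quot;",
}


def normalize_value(raw: str) -> str:
    """Decode every character reference, then re-encode only the big five."""
    decoded = html.unescape(raw)
    return "".join(BIG_FIVE.get(ch, ch) for ch in decoded)


def whole(name: str, raw=None) -> str:
    """Render an attribute as a bare boolean or with a double-quoted value."""
    if raw is None:
        return name
    return f'{name}="{normalize_value(raw)}"'


if __name__ == "__main__":
    # Numeric and named references for the same characters converge.
    print(normalize_value("&#x3C;b&#62; &amp; &eacute;"))
    print(whole("disabled"))
    print(whole("title", "it's"))
```

 Decoding before re-encoding is what makes the output predictable in this
 sketch: different spellings of the same character, such as `&#x3C;`,
 `&#60;`, and `&lt;`, all collapse to one form.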

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/63724#comment:15>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

