[wp-trac] [WordPress Trac] #63724: HTML API: Reliably parse HTML attributes in `wp_kses_hair()`

WordPress Trac noreply at wordpress.org
Fri Jul 18 21:22:20 UTC 2025


#63724: HTML API: Reliably parse HTML attributes in `wp_kses_hair()`
-------------------------+--------------------
 Reporter:  dmsnell      |      Owner:  (none)
     Type:  enhancement  |     Status:  new
 Priority:  normal       |  Milestone:  6.9
Component:  HTML API     |    Version:  trunk
 Severity:  normal       |   Keywords:
  Focuses:               |
-------------------------+--------------------
 `wp_kses_hair()` attempts to parse HTML attributes given the span of text
 inside an HTML tag, but excluding the tag name, opening `<`, and closing
 `>`. For example:

 {{{#!php
 <?php
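 // Parse the attribute span of a tag; the tag name, `<`, and `>` are excluded.
 $attrs = wp_kses_hair( ' class="description"', wp_allowed_protocols() );

 // The result is keyed by attribute name, so this comparison holds.
 $attrs === array(
         'class' => array(
                 'name'  => 'class',
                 'value' => 'description',
                 'whole' => 'class="description"',
                 'vless' => 'n', // 'y' when the attribute has no value.
         ),
 );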
 }}}

 While this has served WordPress for years, there are fundamental issues in
 the parsing model; namely, it does not follow the attribute-parsing rules
 of the HTML living standard. Some categories of legitimate attribute
 values are rejected or overlooked, while certain other values are
 misinterpreted.

 Ideally, there would be no need for this function:

  - Callers must first extract the span of text inside the HTML tag that
 holds the attributes, which is awkward, and any parsing errors made in
 determining that span carry forward.
  - The output format does not decode attribute values, which passes along
 confusion about whether content is escaped or not.
  - The HTML API provides more efficient and reliable tools to work
 directly with HTML and not have to split it and pass around sub-spans of
 whole HTML tokens.
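
 As a rough illustration of the last two points, the following minimal
 sketch reads attributes from a whole tag with the Tag Processor instead of
 splitting the tag apart first. It assumes a loaded WordPress (6.2 or
 newer, where `WP_HTML_Tag_Processor` is available); the sample markup is
 made up for the example.

 {{{#!php
 <?php
 // Sketch only: assumes WordPress is loaded so WP_HTML_Tag_Processor exists.
 $html      = '<a class="description" href="/?a=1&amp;b=2" download>Link</a>';
 $processor = new WP_HTML_Tag_Processor( $html );

 if ( $processor->next_tag( 'a' ) ) {
         // Values are returned decoded, so there is no doubt about escaping.
         $class = $processor->get_attribute( 'class' );    // 'description'
         $href  = $processor->get_attribute( 'href' );     // '/?a=1&b=2'

         // Boolean attributes are reported as `true`, missing ones as `null`.
         $download = $processor->get_attribute( 'download' ); // true
         $id       = $processor->get_attribute( 'id' );       // null
 }
 }}}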

 However, use of the function is pervasive
 [https://wpdirectory.net/search/01K0FPHS0G71RSX5W295Y22XKY in the plugin
 space] and so the function must remain.

 WordPress should lean on the reliability that the HTML API affords to
 properly parse attributes inside of `wp_kses_hair()` as part of a push
 away from ad-hoc HTML parsing and towards reliance on the HTML API.
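
 One way this could look is sketched below. This is not the proposed patch:
 the function name `example_hair_via_html_api()`, the placeholder `<wp>`
 tag, and the choice to re-escape values with `esc_attr()` are all
 assumptions made for illustration, and protocol filtering is left out
 entirely.

 {{{#!php
 <?php
 // Hypothetical sketch, not the actual patch. Requires a loaded WordPress.
 function example_hair_via_html_api( $attr ) {
         // Wrap the attribute span in a placeholder tag so it can be parsed.
         $processor = new WP_HTML_Tag_Processor( "<wp {$attr}>" );
         if ( ! $processor->next_tag() ) {
                 return array();
         }

         $parsed = array();
         // An empty prefix matches every attribute name on the tag.
         $names = $processor->get_attribute_names_with_prefix( '' );
         foreach ( $names ?? array() as $name ) {
                 $value = $processor->get_attribute( $name );
                 $vless = true === $value; // Boolean attribute, no value.

                 $parsed[ $name ] = array(
                         'name'  => $name,
                         // Decoded here, unlike the legacy output.
                         'value' => $vless ? '' : $value,
                         'whole' => $vless
                                 ? $name
                                 : $name . '="' . esc_attr( $value ) . '"',
                         'vless' => $vless ? 'y' : 'n',
                 );
         }

         return $parsed;
 }
 }}}

 A real implementation would also need to decide how to handle input that
 closes the placeholder tag early (for example an unquoted `>`), and whether
 `value` should stay undecoded for backwards compatibility.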

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/63724>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

