[wp-trac] [WordPress Trac] #64473: Embrace WHATWG Encoding Standards

WordPress Trac noreply at wordpress.org
Sun Jan 4 07:42:37 UTC 2026


#64473: Embrace WHATWG Encoding Standards
-------------------------+--------------------
 Reporter:  dmsnell      |      Owner:  (none)
     Type:  enhancement  |     Status:  new
 Priority:  normal       |  Milestone:  7.0
Component:  Charset      |    Version:
 Severity:  normal       |   Keywords:
  Focuses:               |
-------------------------+--------------------
 Text encoding can be extremely complicated. Worse, it can draw in a wide
 array of security issues. To address this complexity, and the problems
 that arise when different systems interpret the same text differently
 (even through something as basic as using text decoders with different
 internal behaviors), the WHATWG established the
 [https://encoding.spec.whatwg.org/ Encoding Standard].

 This specification standardizes many different aspects of the text data
 flow, including, but not limited to:
  - How can the encoding for a stream of bytes be guessed?
  - When someone says their text is “1252” or “UTF7” or “UTF-8;ASCII” or
 any number of invalid or non-standard declarations, what should the system
 pick as the correct encoding declaration?
  - How should certain security-sensitive encodings be handled?
  - How exactly should certain kinds of errors be handled when decoding
 multibyte characters?
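
 As a rough illustration of the label question above, the following
 Python sketch follows the shape of the specification's "get an encoding"
 steps: trim ASCII whitespace, compare ASCII case-insensitively against a
 label table, and report failure otherwise. The `LABELS` table is a small
 hand-picked subset of the real one, and the names `get_encoding` and
 `ascii_lower` are hypothetical; this is not WordPress code.

```python
from typing import Optional

# Abbreviated, illustrative subset of the Encoding Standard's label table.
# The real table is far larger; only a few well-known labels appear here.
LABELS = {
    "utf-8": "UTF-8",
    "utf8": "UTF-8",
    "unicode-1-1-utf-8": "UTF-8",
    "ascii": "windows-1252",
    "us-ascii": "windows-1252",
    "latin1": "windows-1252",
    "iso-8859-1": "windows-1252",
    "cp1252": "windows-1252",
    "windows-1252": "windows-1252",
}

# ASCII whitespace as defined by the WHATWG specs: TAB, LF, FF, CR, SPACE.
ASCII_WHITESPACE = "\t\n\x0c\r "


def ascii_lower(text: str) -> str:
    """Lowercase only A-Z so non-ASCII characters can never fold into a match."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in text)


def get_encoding(label: str) -> Optional[str]:
    """Return the canonical encoding name for a label, or None on failure."""
    trimmed = label.strip(ASCII_WHITESPACE)
    return LABELS.get(ascii_lower(trimmed))
```

 With this lookup, ` UTF-8 ` and `Latin1` resolve to `UTF-8` and
 `windows-1252` respectively, while `1252`, `UTF7`, and `UTF-8;ASCII`
 match no label and return failure, leaving the caller to pick an explicit
 fallback instead of guessing.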

 It also strongly asserts that all systems should ideally use UTF-8 (see
 #62172).

 ----

 The specification is rather short, and adopting it would help
 considerably with the tricky parts of WordPress’ encoding handling.

 Support for it should be designed to answer the questions that developers
 have when using WordPress, touching notable areas such as:

  - Parsing HTML when an encoding is uncertain or unknown.
  - Converting text from the database to HTML.
  - Converting text when exporting to WXR.
  - Converting text when existing decoders aren’t available (polyfilling
 conversion).
  - Providing security-sensitive aids to text-handling code.
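
 To make the last two items more concrete, here is a minimal Python
 sketch of two text-handling aids: a strict UTF-8 validity check and a
 "scrub" helper that replaces invalid byte sequences with U+FFFD. The
 function names are hypothetical, and Python's built-in decoder stands in
 for a spec-conformant one, which is an assumption rather than something
 this ticket specifies.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Report whether the bytes form well-formed UTF-8."""
    try:
        # Strict mode rejects overlong forms, surrogates, and truncated sequences.
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


def scrub_utf8(raw: bytes) -> str:
    """Decode as UTF-8, replacing each invalid sequence with U+FFFD."""
    return raw.decode("utf-8", errors="replace")
```

 Code that must reject bad input outright can call `is_valid_utf8()`,
 while code that has to render something regardless can call
 `scrub_utf8()`, so the choice between failing and replacing is made
 explicitly at the call site.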

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/64473>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

