[wp-trac] [WordPress Trac] #63863: Standardize UTF-8 handling and fallbacks in 6.9

WordPress Trac noreply at wordpress.org
Thu Oct 16 20:58:46 UTC 2025


#63863: Standardize UTF-8 handling and fallbacks in 6.9
--------------------------------------+---------------------
 Reporter:  dmsnell                   |       Owner:  (none)
     Type:  enhancement               |      Status:  new
 Priority:  normal                    |   Milestone:  6.9
Component:  Charset                   |     Version:  trunk
 Severity:  normal                    |  Resolution:
 Keywords:  has-patch has-unit-tests  |     Focuses:
--------------------------------------+---------------------

Comment (by dmsnell):

 In [changeset:"60949" 60949]:
 {{{
 #!CommitTicketReference repository="" revision="60949"
 Charset: Rely on new UTF-8 pipeline for mb_strlen() fallback.

 The existing polyfill for `mb_strlen()` contains a number of issues
 leaving plenty of opportunity for improvement. Specifically, the following
 are all deficiencies: it relies on Unicode PCRE support, assumes input
 strings are valid UTF-8, splits input strings into an array of characters
 to count them (1,000 at a time, iterating until complete), and gives up
 entirely when Unicode support is missing.

 This patch provides an updated polyfill which will reliably count code
 points in a UTF-8 string, even in the presence of sequences of invalid
 bytes. It scans through the input with zero allocations. Additionally, the
 underlying fallback extends the behavior of `mb_strlen()` to provide
 character counts for substrings within a larger input without extracting
 the substring (it can count characters within a byte offset and length of
 a larger string).

 This change improves the reliability of UTF-8 string length calculations
 and removes behavioral variability based on the runtime system.

 Developed in https://github.com/WordPress/wordpress-develop/pull/9828
 Discussed in https://core.trac.wordpress.org/ticket/63863

 See #63863.
 }}}
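
 The counting approach described in the commit message can be illustrated
 with a minimal sketch. This is not the code from the patch: it is written
 in Python rather than PHP, the name `utf8_codepoint_count` and its
 `offset`/`length` parameters are hypothetical, and it assumes that every
 byte that cannot start a well-formed UTF-8 sequence counts as one
 character, which may differ from how the actual fallback treats invalid
 spans.

```python
from typing import Optional


def _valid_tail(data: bytes, i: int, size: int, end: int) -> bool:
    """Check the continuation bytes of a multi-byte sequence starting at i."""
    if i + size > end:
        return False  # sequence is truncated by the end of the span
    lead, second = data[i], data[i + 1]
    # The second byte has a narrower range for some leads, which excludes
    # overlong forms, surrogates, and code points above U+10FFFF.
    lo, hi = 0x80, 0xBF
    if lead == 0xE0:
        lo = 0xA0
    elif lead == 0xED:
        hi = 0x9F
    elif lead == 0xF0:
        lo = 0x90
    elif lead == 0xF4:
        hi = 0x8F
    if not lo <= second <= hi:
        return False
    # Any remaining bytes must be plain continuation bytes.
    return all(0x80 <= data[k] <= 0xBF for k in range(i + 2, i + size))


def utf8_codepoint_count(
    data: bytes, offset: int = 0, length: Optional[int] = None
) -> int:
    """Count characters in data[offset:offset+length] without slicing it."""
    end = len(data) if length is None else min(len(data), offset + length)
    i = max(offset, 0)
    count = 0
    while i < end:
        b = data[i]
        if b < 0x80:
            size = 1
        elif 0xC2 <= b <= 0xDF:
            size = 2
        elif 0xE0 <= b <= 0xEF:
            size = 3
        elif 0xF0 <= b <= 0xF4:
            size = 4
        else:
            size = 0  # stray continuation byte or invalid lead byte
        if size > 1 and not _valid_tail(data, i, size, end):
            size = 0
        # Assumption: an invalid byte counts as one character and the scan
        # resumes at the next byte.
        count += 1
        i += size if size else 1
    return count
```

 Under these assumptions the scan walks the bytes once, never builds an
 array of characters, and does not depend on PCRE or any other runtime
 Unicode support. Passing a byte offset and length counts the characters
 of a substring in place, for example the second word of a two-word input,
 without extracting that substring first.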

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/63863#comment:34>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform