[wp-trac] [WordPress Trac] #63863: Standardize UTF-8 handling and fallbacks in 6.9

WordPress Trac noreply at wordpress.org
Sun Aug 31 05:52:46 UTC 2025


#63863: Standardize UTF-8 handling and fallbacks in 6.9
-------------------------+---------------------
 Reporter:  dmsnell      |       Owner:  (none)
     Type:  enhancement  |      Status:  new
 Priority:  normal       |   Milestone:  6.9
Component:  Charset      |     Version:  trunk
 Severity:  normal       |  Resolution:
 Keywords:  has-patch    |     Focuses:
-------------------------+---------------------

Comment (by dmsnell):

 @tusharbharti thanks for linking that. I’ll try to think it through and
 see how to combine it. There’s definitely a limit to how much of the
 `mbstring` extension is worth polyfilling, which I’m trying to balance,
 and most of that effort is focused on UTF-8.

 ----

 Otherwise, there’s a funny thing about
 [https://github.com/WordPress/wordpress-develop/pull/9678 #9678]. It’s
 worth doing, and I considered creating a ticket like “ensure that code
 calling PCRE functions with the `u` flag relies on
 `_wp_can_use_pcre_u()`”, but as I looked around, almost everything in
 Core just assumes the flag is supported and breaks when it isn’t.

 This is part of what I mean when I say that WordPress is already broken
 in non-UTF-8 environments. I think we could have a lot of unreported or
 under-reported bugs that derive from places where the regex fails. It
 would be incredibly difficult to trace the reported behavior back to the
 regex failure.

 A good example of this might be `shortcode_parse_atts()`, where a call to
 `preg_replace()` returns `NULL` instead of a string when support is
 missing. I don’t know what to do with these functions when support is
 lacking other than tackle them one at a time, in a way specific to each
 one.
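
 To illustrate the failure mode described above, here is a minimal sketch
 of one possible per-function guard. It is not Core code and not the
 approach taken in the linked pull request: `example_can_use_pcre_u()` and
 `example_replace_invisible_spaces()` are hypothetical names, and the
 probe only assumes that `preg_match()` fails on a `u` pattern when UTF-8
 support is missing. The wrapper uses a byte-level pattern as a fallback
 and never passes `NULL` on to its caller.

```php
<?php
// Hypothetical helper (not the Core function): probes once whether the
// PCRE build accepts the `u` flag.
function example_can_use_pcre_u(): bool {
	static $supported = null;
	if ( null === $supported ) {
		// @ hides the warning raised when UTF-8 support is missing.
		$supported = ( 1 === @preg_match( '/^./u', 'a' ) );
	}
	return $supported;
}

// Hypothetical wrapper: replaces runs of U+00A0 (no-break space) and
// U+200B (zero-width space) with one ASCII space, always returning a string.
function example_replace_invisible_spaces( string $text ): string {
	$pattern = example_can_use_pcre_u()
		? '/[\x{00a0}\x{200b}]+/u'          // Code points, needs UTF-8 support.
		: '/(?:\xC2\xA0|\xE2\x80\x8B)+/';   // The same characters as UTF-8 bytes.

	$result = @preg_replace( $pattern, ' ', $text );

	// preg_replace() returns null on error (for example, when the subject
	// is not valid UTF-8 under the `u` flag); keep the input in that case.
	return null === $result ? $text : $result;
}
```

 With this guard a caller gets back a string either way: the cleaned text
 when the replacement succeeds, or the unmodified input when the regex
 engine reports an error. Whether returning the input unchanged is the
 right fallback would still need to be decided function by function.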

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/63863#comment:7>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

