[wp-trac] [WordPress Trac] #63863: Standardize UTF-8 handling and fallbacks in 6.9
WordPress Trac
noreply at wordpress.org
Sun Aug 31 05:52:46 UTC 2025
#63863: Standardize UTF-8 handling and fallbacks in 6.9
-------------------------+---------------------
Reporter: dmsnell | Owner: (none)
Type: enhancement | Status: new
Priority: normal | Milestone: 6.9
Component: Charset | Version: trunk
Severity: normal | Resolution:
Keywords: has-patch | Focuses:
-------------------------+---------------------
Comment (by dmsnell):
@tusharbharti thanks for linking that. I’ll think it through and see how
to combine it. There’s definitely a limit to how much of the `mbstring`
extension is worth polyfilling, which I’m trying to balance, and most of
that work is focused on UTF-8.
----
Otherwise, there’s a funny thing about
[https://github.com/WordPress/wordpress-develop/pull/9678 #9678]. It’s
worth doing, and I considered creating a ticket like “ensure that code
calling PCRE functions with the `u` flag relies on `_wp_can_use_pcre_u()`”,
but as I looked around, almost everything in Core just assumes the flag is
supported and breaks when it isn’t.
This is part of what I mean when I say that WordPress is already broken in
non-UTF-8 environments. I think we could have a lot of unreported or
under-reported bugs that stem from places where the regex fails, and it
would be incredibly difficult to trace the reported behavior back to the
regex failure.
A good example of this might be `shortcode_parse_atts()`, where a call to
`preg_replace()` returns `NULL` instead of a string when support is
missing. I don’t know what to do with these functions when support is
lacking other than to tackle them one at a time, each in its own way.
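
To make the failure mode concrete, here is a minimal standalone sketch of
the kind of guard each call site would need. It is not Core code:
`example_pcre_u_available()` and `example_replace_u()` are hypothetical
names, the probe only approximates what a check like
`_wp_can_use_pcre_u()` does, and returning the input unchanged on failure
is just one possible fallback that would not suit every function.

```php
<?php
/**
 * Hypothetical helper (not a Core function): probe once whether PCRE
 * accepts the `u` flag in this environment.
 */
function example_pcre_u_available(): bool {
	static $available = null;
	if ( null === $available ) {
		// Suppress the compile warning emitted when UTF-8 support is missing.
		$available = ( 1 === @preg_match( '/^./u', 'a' ) );
	}
	return $available;
}

/**
 * Hypothetical wrapper: run a `u`-flagged replacement and fall back to the
 * unmodified input whenever the call cannot produce a string.
 */
function example_replace_u( string $pattern, string $replacement, string $subject ): string {
	if ( ! example_pcre_u_available() ) {
		return $subject;
	}

	$result = preg_replace( $pattern, $replacement, $subject );

	// preg_replace() returns null on error, e.g. when $subject is not valid UTF-8.
	return null === $result ? $subject : $result;
}
```

For example, `example_replace_u( '/[\x{00a0}\x{200b}]+/u', ' ', $text )`
always hands a string back to the caller, so later string handling never
receives `NULL`. Whether silently skipping the replacement is acceptable
still has to be decided per function.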
--
Ticket URL: <https://core.trac.wordpress.org/ticket/63863#comment:7>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform