[wp-trac] [WordPress Trac] #63863: Standardize UTF-8 handling and fallbacks in 6.9
WordPress Trac
noreply at wordpress.org
Tue Sep 23 03:34:37 UTC 2025
#63863: Standardize UTF-8 handling and fallbacks in 6.9
--------------------------------------+---------------------
 Reporter:  dmsnell                   |       Owner:  (none)
     Type:  enhancement               |      Status:  new
 Priority:  normal                    |   Milestone:  6.9
Component:  Charset                   |     Version:  trunk
 Severity:  normal                    |  Resolution:
 Keywords:  has-patch has-unit-tests  |     Focuses:
--------------------------------------+---------------------
Comment (by dmsnell):
In [changeset:"60793" 60793]:
{{{
#!CommitTicketReference repository="" revision="60793"
Charset: Improve UTF-8 scrubbing ability via new UTF-8 scanning pipeline.

This is the fourth in a series of patches to modernize and standardize
UTF-8 handling.

`wp_check_invalid_utf8()` has long depended on the runtime configuration
of the system running it, which has led to hard-to-diagnose issues with
text containing invalid UTF-8. The function has also had an apparent
defect since its inception: when asked to strip invalid bytes, it returns
an empty string.

This patch removes all of the function's dependency on the system running
it. It defers to the `mbstring` extension if that’s available, falling
back to the new UTF-8 scanning pipeline otherwise.

To support this work, `wp_scrub_utf8()` is created with a proper fallback
so that the remaining logic inside of `wp_check_invalid_utf8()` can be
minimized. The defect in `wp_check_invalid_utf8()` has been fixed, but
instead of stripping the invalid bytes it now replaces them with the
Unicode replacement character (U+FFFD) for stronger security guarantees.

Developed in https://github.com/WordPress/wordpress-develop/pull/9498
Discussed in https://core.trac.wordpress.org/ticket/63837

Follow-up to: [60768].

Props askapache, chriscct7, Cyrille37, desrosj, dmsnell, helen,
jonsurrell, kitchin, miqrogroove, pbearne, shailu25.

Fixes #63837, #29717.
See #63863.
}}}
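The commit message above does not spell out why replacing invalid bytes is safer than stripping them. One commonly cited reason is that removing bytes can join the surrounding text into a token that was not present in the input. The sketch below illustrates that difference in Python, using the standard library's UTF-8 decoder. It is not the WordPress implementation: the function names `scrub_utf8`, `strip_invalid_utf8`, and `is_valid_utf8` are hypothetical, and the PHP functions named in the commit may differ in detail.

```python
# Illustrative sketch only: Python's built-in decoder stands in for the
# mbstring / scanning-pipeline code paths described in the commit message.
# All function names here are hypothetical.

def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form well-formed UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def scrub_utf8(data: bytes) -> str:
    """Replace invalid byte sequences with U+FFFD (the replacement character)."""
    return data.decode("utf-8", errors="replace")


def strip_invalid_utf8(data: bytes) -> str:
    """Drop invalid byte sequences entirely (shown only for contrast)."""
    return data.decode("utf-8", errors="ignore")


# 0xC0 can never appear in well-formed UTF-8.
payload = b"<scr\xc0ipt>"

if __name__ == "__main__":
    print(is_valid_utf8(payload))               # False
    print(repr(strip_invalid_utf8(payload)))    # '<script>'
    print(repr(scrub_utf8(payload)))            # '<scr\ufffdipt>'
```

In this example, stripping turns the input into `<script>`, a string that a filter inspecting the original bytes would not have matched. Replacing the byte with U+FFFD leaves a visible marker where it was and keeps the two halves apart.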
--
Ticket URL: <https://core.trac.wordpress.org/ticket/63863#comment:29>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform