[wp-trac] [WordPress Trac] #38044: Make seems_utf8() RFC 3629 compliant.
WordPress Trac
noreply at wordpress.org
Tue Aug 12 18:14:05 UTC 2025
#38044: Make seems_utf8() RFC 3629 compliant.
--------------------------+-----------------------------
Reporter: gitlost | Owner: dmsnell
Type: defect (bug) | Status: closed
Priority: normal | Milestone: Future Release
Component: Formatting | Version: 1.2.1
Severity: normal | Resolution: fixed
Keywords: has-patch | Focuses:
--------------------------+-----------------------------
Changes (by dmsnell):
* owner: (none) => dmsnell
* status: new => closed
* resolution: => fixed
Comment:
In [changeset:"60630" 60630]:
{{{
#!CommitTicketReference repository="" revision="60630"
Add `wp_is_valid_utf8()` for normalizing UTF-8 checks.
There are several existing mechanisms in Core to determine if a given
string contains valid UTF-8 bytes or not. These are spread out and depend
on which extensions are installed on the running system and what is set
for `blog_charset`. The `seems_utf8()` function is one of these
mechanisms.
Unfortunately, `seems_utf8()` does not properly validate UTF-8 and is
slow, and its purpose is obscured by its name and its history.
This patch deprecates `seems_utf8()` and introduces `wp_is_valid_utf8()`, a
new, spec-compliant, efficient, and focused UTF-8 validator. This new
validator defers to `mb_check_encoding()` where present, otherwise
validating with a pure-PHP implementation. This makes the spec-compliant
validator available on all systems regardless of their runtime
environment.
Developed in https://github.com/WordPress/wordpress-develop/pull/9317
Discussed in https://core.trac.wordpress.org/ticket/38044
Props dmsnell, jonsurrell, jorbin.
Fixes #38044.
}}}
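
The approach the commit message describes (defer to `mb_check_encoding()` where present, otherwise validate in pure PHP) might look roughly like the sketch below. This is an illustration only, not the code from changeset 60630: the function names `example_is_valid_utf8()` and `example_is_valid_utf8_fallback()` are hypothetical, and the fallback is a straightforward byte scanner following the RFC 3629 grammar, which may differ from how Core implements it.

```php
<?php
/**
 * Hypothetical wrapper (not the Core function): prefers mbstring,
 * otherwise falls back to a pure-PHP scan.
 */
function example_is_valid_utf8( string $bytes ): bool {
	if ( function_exists( 'mb_check_encoding' ) ) {
		return mb_check_encoding( $bytes, 'UTF-8' );
	}

	return example_is_valid_utf8_fallback( $bytes );
}

/**
 * Hypothetical pure-PHP validator following the RFC 3629 byte grammar.
 * Rejects overlong forms, UTF-16 surrogates, and values above U+10FFFF.
 */
function example_is_valid_utf8_fallback( string $bytes ): bool {
	$length = strlen( $bytes );
	$i      = 0;

	while ( $i < $length ) {
		$lead = ord( $bytes[ $i ] );

		// Single-byte (ASCII) characters need no further checks.
		if ( $lead <= 0x7F ) {
			$i++;
			continue;
		}

		// Pick the number of continuation bytes and the allowed range
		// for the first one; that range is what excludes overlong
		// encodings, surrogates, and out-of-range code points.
		if ( $lead >= 0xC2 && $lead <= 0xDF ) {
			list( $need, $min, $max ) = array( 1, 0x80, 0xBF );
		} elseif ( 0xE0 === $lead ) {
			list( $need, $min, $max ) = array( 2, 0xA0, 0xBF );
		} elseif ( 0xED === $lead ) {
			list( $need, $min, $max ) = array( 2, 0x80, 0x9F );
		} elseif ( $lead >= 0xE1 && $lead <= 0xEF ) {
			list( $need, $min, $max ) = array( 2, 0x80, 0xBF );
		} elseif ( 0xF0 === $lead ) {
			list( $need, $min, $max ) = array( 3, 0x90, 0xBF );
		} elseif ( $lead >= 0xF1 && $lead <= 0xF3 ) {
			list( $need, $min, $max ) = array( 3, 0x80, 0xBF );
		} elseif ( 0xF4 === $lead ) {
			list( $need, $min, $max ) = array( 3, 0x80, 0x8F );
		} else {
			// 0x80-0xC1 and 0xF5-0xFF can never start a character.
			return false;
		}

		// The sequence must not run past the end of the string.
		if ( $i + $need >= $length ) {
			return false;
		}

		// The first continuation byte has the restricted range.
		$next = ord( $bytes[ $i + 1 ] );
		if ( $next < $min || $next > $max ) {
			return false;
		}

		// Any remaining continuation bytes must be 0x80-0xBF.
		for ( $k = 2; $k <= $need; $k++ ) {
			$tail = ord( $bytes[ $i + $k ] );
			if ( $tail < 0x80 || $tail > 0xBF ) {
				return false;
			}
		}

		$i += $need + 1;
	}

	return true;
}
```

Under these assumptions, a caller would use `example_is_valid_utf8( $input )` the same way on every host, and only the internal code path would vary with whether the mbstring extension is loaded.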
--
Ticket URL: <https://core.trac.wordpress.org/ticket/38044#comment:22>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform