[wp-trac] [WordPress Trac] #65342: Charset: Polyfill mb_ord() and mb_chr() for UTF-8
WordPress Trac
noreply at wordpress.org
Thu May 28 06:21:28 UTC 2026
#65342: Charset: Polyfill mb_ord() and mb_chr() for UTF-8
--------------------------------------+----------------------
Reporter: dmsnell | Owner: dmsnell
Type: enhancement | Status: closed
Priority: normal | Milestone: 7.1
Component: Charset | Version:
Severity: normal | Resolution: fixed
Keywords: has-patch has-unit-tests | Focuses:
--------------------------------------+----------------------
Comment (by dmsnell):
In [changeset:"62425" 62425]:
{{{
#!CommitTicketReference repository="" revision="62425"
Charset: Update antispambot to handle multibyte characters.
In preparation for handling Unicode email addresses (non-US-ASCII
characters in the mailbox name), the `antispambot()` function needs to
be multibyte-aware so that it creates proper HTML numeric character
references and percent-encoded strings.
Previously it scanned the input email address byte by byte. With
multibyte characters this produces invalid output, because each byte of
a multibyte sequence is encoded as if it were a whole character on its
own.
This patch relies on the newly-polyfilled `mb_ord()` function and the
`_wp_scan_utf8()` function to crawl through an input email by code
point, assuming UTF-8 encoding. This ensures proper transformation.
Developed in: https://github.com/WordPress/wordpress-develop/pull/11567
Discussed in: https://core.trac.wordpress.org/ticket/31992
Props agulbra, akirk, benniledl, dmsnell, siliconforks.
See #65342.
}}}
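To illustrate the problem described in the commit message, here is a minimal sketch of byte-wise versus code-point-wise encoding. It is written in Python for brevity and is not the WordPress implementation, which is PHP and uses `mb_ord()` and `_wp_scan_utf8()`. The function names `ncr_per_byte`, `ncr_per_code_point`, and `percent_encode_code_point` are hypothetical and exist only for this example.

```python
import html
from urllib.parse import unquote


def ncr_per_byte(text: str) -> str:
    """Byte-wise approach: emit one numeric character reference per UTF-8 byte."""
    return "".join(f"&#{byte};" for byte in text.encode("utf-8"))


def ncr_per_code_point(text: str) -> str:
    """Code-point-wise approach: emit one numeric character reference per code point."""
    return "".join(f"&#{ord(char)};" for char in text)


def percent_encode_code_point(char: str) -> str:
    """Percent-encode every UTF-8 byte of a single code point together."""
    return "".join(f"%{byte:02x}" for byte in char.encode("utf-8"))


# U+00E9 (e with acute accent) is two bytes in UTF-8: 0xC3 0xA9.
e_acute = "\u00e9"

# Byte-wise references decode to two unrelated characters (U+00C3, U+00A9).
broken = ncr_per_byte(e_acute)
assert html.unescape(broken) == "\u00c3\u00a9"

# Code-point-wise references round-trip to the original character.
fixed = ncr_per_code_point(e_acute)
assert html.unescape(fixed) == e_acute

# Percent-encoding stays valid as long as the bytes of one code point
# are encoded as a unit.
assert unquote(percent_encode_code_point(e_acute)) == e_acute
```

The byte-wise variant is only wrong for non-ASCII input. For a plain ASCII address both functions return the same string, which is why the issue surfaces only once Unicode mailbox names are allowed.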
--
Ticket URL: <https://core.trac.wordpress.org/ticket/65342#comment:3>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform