[wp-trac] [WordPress Trac] #63974: .mo file loaded as UTF-8 by default - non-standard and ignoring Content-Type headers
WordPress Trac
noreply at wordpress.org
Mon Sep 15 15:09:18 UTC 2025
#63974: .mo file loaded as UTF-8 by default - non-standard and ignoring Content-
Type headers
--------------------------+------------------------------
Reporter: kkmuffme | Owner: (none)
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: I18N | Version:
Severity: normal | Resolution:
Keywords: | Focuses:
--------------------------+------------------------------
Comment (by dmsnell):
If we want to avoid all security issues, we can use the upcoming
`mb_scrub_utf8()` being prepared in #63863.
> Good point! What do you suggest?
If it validates as UTF-8 it’s probably UTF-8, even if it reports another
encoding (at least this has been the case for the top 300,000 domains I
scanned on the Internet).
If the file contains no header, we can call `mb_scrub_utf8()`. If it
contains a header and we can understand the encoding //and// validate the
data against it, then we can convert.
Otherwise a `_doing_it_wrong()` sounds great, and we can `mb_scrub_utf8()`
the data to ensure it doesn’t introduce any invalid or malicious content.
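The decision order above can be sketched roughly as follows. This is an
illustrative Python sketch, not WordPress code: `scrub_utf8()` is a
hypothetical stand-in for the proposed `mb_scrub_utf8()`, `decode_mo_text()`
is a made-up name, and `warnings.warn()` stands in for `_doing_it_wrong()`.
Checking for valid UTF-8 first, before looking at the header, is my reading
of the steps and could be ordered differently.

```python
import warnings
from typing import Optional


def scrub_utf8(data: bytes) -> str:
    # Hypothetical stand-in for the proposed mb_scrub_utf8(): every invalid
    # byte sequence becomes U+FFFD, so the result is always well-formed.
    return data.decode("utf-8", errors="replace")


def decode_mo_text(data: bytes, declared_charset: Optional[str]) -> str:
    # Bytes that validate as UTF-8 are treated as UTF-8,
    # whatever the header claims.
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        pass

    # No header: there is nothing to convert from, so scrub.
    if declared_charset is None:
        return scrub_utf8(data)

    # The header names an encoding we understand and the bytes are
    # valid in that encoding: convert.
    try:
        return data.decode(declared_charset)
    except (LookupError, UnicodeDecodeError):
        pass

    # Otherwise: report the problem and scrub.
    warnings.warn(
        f"Cannot decode translation data as {declared_charset!r}; "
        "scrubbing as UTF-8"
    )
    return scrub_utf8(data)
```

With this sketch, `decode_mo_text(b"caf\xe9", "ISO-8859-1")` is converted to
"café", while the same bytes with no header, or with an unknown charset, come
back as "caf" followed by U+FFFD.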
----
It may be helpful to avoid the term “ANSI” in the context of text encoding.
Most common encodings are US-ASCII compatible (bytes 0x00–0x7F all mean
the same thing), but the upper range (bytes 0x80–0xFF) is mutually
incompatible across the family of encodings commonly referred to as “ANSI”.
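A small Python illustration of that point, using three Windows code pages
that are often all called “ANSI” (the choice of code pages and of the byte
0xE9 is mine, picked only as an example):

```python
# Three Windows code pages that are each commonly labeled "ANSI".
CODE_PAGES = ["cp1252", "cp1251", "cp1253"]


def decode_in_each(data: bytes, encodings):
    # Decode the same bytes under every encoding and collect the results.
    return {name: data.decode(name) for name in encodings}


# The lower range 0x00-0x7F decodes identically everywhere (US-ASCII).
lower = bytes(range(0x80))
ascii_compatible = all(
    lower.decode(name) == lower.decode("ascii") for name in CODE_PAGES
)

# A single byte from the upper range means something different in each.
upper = decode_in_each(b"\xe9", CODE_PAGES)
```

Here `ascii_compatible` is true, while `upper` maps the one byte 0xE9 to
“é” in cp1252 (Western European), “й” in cp1251 (Cyrillic), and “ι” in
cp1253 (Greek).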
--
Ticket URL: <https://core.trac.wordpress.org/ticket/63974#comment:8>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform