[wp-trac] [WordPress Trac] #38044: Make seems_utf8() RFC 3629 compliant.
WordPress Trac
noreply at wordpress.org
Wed Aug 13 18:41:28 UTC 2025
#38044: Make seems_utf8() RFC 3629 compliant.
--------------------------+-----------------------------
Reporter: gitlost | Owner: dmsnell
Type: defect (bug) | Status: reopened
Priority: normal | Milestone: Future Release
Component: Formatting | Version: 1.2.1
Severity: normal | Resolution:
Keywords: has-patch | Focuses:
--------------------------+-----------------------------
Comment (by dmsnell):
> they run the function on several examples of both UTF-8 and non-UTF-8
> strings, and assert that the expected result is correct…should ideally be
> improved and expanded rather than removed altogether
they definitely look that way, but “correct” is really vague here, and
“improving” the tests may not be possible without applying new subjective
constraints to it.
for example, do we add a test to see if it properly detects invalid UTF-8?
if we do that, we would force the function to validate UTF-8, which it
absolutely does not do, and that could end up breaking code that currently
depends on it //not validating// the byte stream.
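The gap between a structural check and real validation can be sketched in Python. This is only an illustration: `looks_like_utf8` is a hypothetical lenient check written for this example, not the code of `seems_utf8()`, and `is_valid_utf8` simply leans on Python's strict decoder as a stand-in for RFC 3629 validation.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Hypothetical lenient check: only lead/continuation byte structure."""
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            need = 0            # ASCII byte
        elif lead & 0xE0 == 0xC0:
            need = 1            # two-byte sequence
        elif lead & 0xF0 == 0xE0:
            need = 2            # three-byte sequence
        elif lead & 0xF8 == 0xF0:
            need = 3            # four-byte sequence
        else:
            return False        # stray continuation byte or invalid lead
        tail = data[i + 1 : i + 1 + need]
        if len(tail) != need or any(b & 0xC0 != 0x80 for b in tail):
            return False
        i += 1 + need
    return True


def is_valid_utf8(data: bytes) -> bool:
    """Strict validation via Python's UTF-8 decoder."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


overlong_nul = b"\xc0\x80"      # overlong encoding of U+0000
surrogate = b"\xed\xa0\x80"     # encoded surrogate U+D800

for sample in (overlong_nul, surrogate):
    print(sample, looks_like_utf8(sample), is_valid_utf8(sample))
```

Both samples pass the lenient check and fail strict validation, so a test asserting rejection of either one would turn a structural check into a validator.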
we could also provide inputs from other encodings and make sure it returns
`false`, but then what if we pass in `¥100` or `ツゥ` in SHIFT-JIS? well,
it’s going to return `true` and not `false` because those bytes also form a
valid UTF-8 byte stream. so to build out the rejection tests we would have
to sort through which improper encodings we want the function to remain
wrong on, and enshrine those defects into the function.
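The SHIFT-JIS overlap can be checked at the byte level. The sketch below is in Python and rests on stated assumptions: the yen sign is taken as byte 0x5C (its JIS X 0201 position), and the kana are taken to be the half-width forms, bytes 0xC2 0xA9. The full-width `ツ` (assumed here to be bytes 0x83 0x63) is included as a contrast. `is_valid_utf8` is a helper written for this example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True when the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Assumed SHIFT-JIS byte sequences, written out by hand.
yen_100 = bytes([0x5C, 0x31, 0x30, 0x30])   # yen sign at 0x5C, then "100"
halfwidth_kana = bytes([0xC2, 0xA9])        # half-width tsu + small u
fullwidth_tsu = bytes([0x83, 0x63])         # full-width tsu (assumed bytes)

print(is_valid_utf8(yen_100))         # True: pure ASCII range
print(is_valid_utf8(halfwidth_kana))  # True: same bytes as U+00A9 in UTF-8
print(is_valid_utf8(fullwidth_tsu))   # False: 0x83 cannot start a sequence

# What a UTF-8 reader sees for the two "valid" inputs.
print(yen_100.decode("utf-8"))
print(halfwidth_kana.decode("utf-8"))
```

Under these assumptions, no byte-level test can tell the first two inputs apart from genuine UTF-8, which is why a rejection test for them would have to encode an arbitrary choice.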
if we can uncover the intended behavior of the function I think it would
be more viable to update the tests, but I haven’t been able to pinpoint
any intention other than asking if we should attempt to convert a string
//into// UTF-8, and for that purpose this isn’t adequate or sound even if
well-tested.
tough problem. thank you for raising this!
--
Ticket URL: <https://core.trac.wordpress.org/ticket/38044#comment:28>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform