[wp-trac] [WordPress Trac] #38044: Make seems_utf8() RFC 3629 compliant.

WordPress Trac noreply at wordpress.org
Wed Aug 13 18:41:28 UTC 2025


#38044: Make seems_utf8() RFC 3629 compliant.
--------------------------+-----------------------------
 Reporter:  gitlost       |       Owner:  dmsnell
     Type:  defect (bug)  |      Status:  reopened
 Priority:  normal        |   Milestone:  Future Release
Component:  Formatting    |     Version:  1.2.1
 Severity:  normal        |  Resolution:
 Keywords:  has-patch     |     Focuses:
--------------------------+-----------------------------

Comment (by dmsnell):

 > they run the function on several examples of both UTF-8 and non-UTF-8
 strings, and assert that the expected result is correct…should ideally be
 improved and expanded rather than removed altogether

 they definitely look that way, but “correct” is really vague here, and
 “improving” the tests may not be possible without applying new subjective
 constraints to it.

 for example, do we add a test to see if it properly detects invalid UTF-8?
 if we do that, we would force the function to validate UTF-8, which it
 absolutely does not do, and that could end up breaking code that currently
 depends on it //not validating// the byte stream.
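
 as a rough illustration of the gap between “looks like UTF-8” and “is
 valid UTF-8”, here is a sketch in Python. `loose_seems_utf8()` is a
 hypothetical stand-in written for this example, not the WordPress
 implementation: it only checks lead-byte patterns and continuation bytes.
 the strict check assumes Python’s built-in UTF-8 decoder as the validator.

```python
def loose_seems_utf8(data: bytes) -> bool:
    """Hypothetical lenient check: lead-byte pattern plus continuation bytes.

    It does not reject overlong forms or surrogate code points, so it is
    not a validator.
    """
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80:
            extra = 0            # single-byte (ASCII)
        elif b & 0xE0 == 0xC0:
            extra = 1            # 110xxxxx: one continuation byte expected
        elif b & 0xF0 == 0xE0:
            extra = 2            # 1110xxxx: two continuation bytes expected
        elif b & 0xF8 == 0xF0:
            extra = 3            # 11110xxx: three continuation bytes expected
        else:
            return False         # stray continuation byte or invalid lead
        for j in range(i + 1, i + 1 + extra):
            if j >= n or data[j] & 0xC0 != 0x80:
                return False     # truncated sequence or bad continuation
        i += 1 + extra
    return True


def strictly_valid_utf8(data: bytes) -> bool:
    """Strict check: delegate to Python's UTF-8 decoder."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


OVERLONG_SLASH = b"\xc0\xaf"      # overlong encoding of "/"
SURROGATE = b"\xed\xa0\x80"       # encodes U+D800, a surrogate

for sample in (OVERLONG_SLASH, SURROGATE):
    print(sample.hex(" "), loose_seems_utf8(sample), strictly_valid_utf8(sample))
```

 both samples pass the lenient check and fail the strict one, so a test
 asserting `false` for them would turn a “seems like” check into a
 validator.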

 we could also provide inputs from other encodings and make sure it returns
 `false`, but then what if we pass in `¥100` or `ツゥ` in SHIFT-JIS? well,
 it’s going to return `true` and not `false` because those bytes are also a
 valid UTF-8 byte stream. so I think that to build out the rejection tests
 we would have to sort through which improper encodings we want the
 function to remain wrong on, and enshrine those defects into the function.
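
 a small Python sketch of that ambiguity. the byte values are hard-coded
 rather than produced by a codec, and the katakana sample assumes the
 half-width forms, which Shift-JIS stores as single bytes; the full-width
 forms are included for contrast. `is_valid_utf8()` is a helper defined
 here for illustration only.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True when the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Shift-JIS places the yen sign at 0x5C, the ASCII backslash position.
YEN_100_SJIS = b"\x5c100"

# Half-width katakana tsu (0xC2) and small u (0xA9) in Shift-JIS.
# Read as UTF-8, the same two bytes are U+00A9, the copyright sign.
HALFWIDTH_TSU_U_SJIS = b"\xc2\xa9"

# Full-width tsu (0x83 0x63) and small u (0x83 0x44) in Shift-JIS.
# 0x83 cannot start a UTF-8 sequence, so this one is rejected.
FULLWIDTH_TSU_U_SJIS = b"\x83\x63\x83\x44"

for label, raw in (
    ("yen 100", YEN_100_SJIS),
    ("half-width tsu + u", HALFWIDTH_TSU_U_SJIS),
    ("full-width tsu + u", FULLWIDTH_TSU_U_SJIS),
):
    print(label, raw.hex(" "), is_valid_utf8(raw))
```

 the first two byte strings are valid UTF-8 even though they were written
 as Shift-JIS, so no byte-level check can tell which encoding was meant.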

 if we can uncover the intended behavior of the function I think it would
 be more viable to update the tests, but I haven’t been able to pinpoint
 any intention other than asking if we should attempt to convert a string
 //into// UTF-8, and for that purpose this isn’t adequate or sound even if
 well-tested.

 tough problem. thank you for raising this!

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/38044#comment:28>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

