[wp-trac] [WordPress Trac] #55117: Possible 5.9 Bug: Unknown character ( or %ef%bf%bc ) on content title
WordPress Trac
noreply at wordpress.org
Fri Jul 1 14:02:12 UTC 2022
#55117: Possible 5.9 Bug: Unknown character ( or %ef%bf%bc ) on content title
-------------------------------------------------+-------------------------
Reporter: cantuaria | Owner: audrasjb
Type: defect (bug) | Status: assigned
Priority: normal | Milestone: 6.1
Component: Permalinks | Version: 5.9
Severity: normal | Resolution:
Keywords: needs-patch has-testing-info | Focuses:
has-screenshots |
-------------------------------------------------+-------------------------
Comment (by dmsnell):
Thanks for the detailed reproducibility steps @ironprogrammer.
Unfortunately I think we need to track a different sequence of steps
because there's a difference between intentionally entering the object-
replacement character and the object-replacement character unexpectedly
appearing in a post title, which I believe is the real problem tracked in
this issue (but maybe I'm wrong).
So for all involved I think there's a conflation of a few different issues
here:
- Non-ASCII characters in a slug/URL are percent-encoded. This is
standard practice and "necessary" if we want to represent text people
enter. If my post is named "Bücher" the appropriate URL is "B%C3%BCcher".
There's another practice we don't use but could, which I think deserves
its own Trac ticket and which I would eventually love to see us use:
Punycode, where the same "Bücher" slug would become "xn--bcher-kva" but
would appear as "Bücher" in the browser URL bar.
- `[OBJ]` characters which are stored in the database are rendered on
page view. This is probably suspect enough that we should strip them out,
at least for the post title. It's debatable whether this is a problem with
WordPress or not because technically we could argue that if it's there in
the data it should be displayed (at least it has `print=yes` in its
Unicode properties).
- The `[OBJ]` character is appearing unintentionally in post titles, which
generates slugs that stand out because of the percent-encoding.
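
To make the encodings in the first point concrete, here is a minimal sketch in Python (not WordPress code). It uses only the standard library; the variable names are made up for illustration, and Python's built-in `idna` codec follows the older IDNA 2003 rules for domain labels, so treat the Punycode line as an approximation of what a slug scheme might do, not a proposal for the exact algorithm.

```python
from urllib.parse import quote

title = "Bücher"

# Percent-encoding: each UTF-8 byte of a non-ASCII character becomes %XX.
percent_encoded = quote(title)

# Punycode via the standard "idna" codec. The codec lowercases the label
# before encoding, so the result is all lowercase.
punycode = title.encode("idna").decode("ascii")

# U+FFFC OBJECT REPLACEMENT CHARACTER is the `[OBJ]` from this ticket.
# Lowercasing the hex digits gives the form quoted in the ticket title.
obj_encoded = quote("\ufffc").lower()

print(percent_encoded)  # B%C3%BCcher
print(punycode)         # xn--bcher-kva
print(obj_encoded)      # %ef%bf%bc
```

The last line shows why the unintended character is so visible in a slug: one invisible code point turns into nine characters of percent-encoding.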
I'd like to address the third point in
[https://github.com/WordPress/gutenberg/issues/38637 #38637] if we can
since it's a Gutenberg bug. The first two are decisions more for Core and
maybe more appropriate for Trac. On that point I'm going to update that
issue with some findings from working with @ironprogrammer yesterday.
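
For the second point, stripping the character from a title could look roughly like the sketch below. It is Python for illustration only, not a patch; `strip_object_replacement` is a hypothetical helper name, and whether Core should do this at all is the open question.

```python
import unicodedata

# U+FFFC, shown as `[OBJ]` in this ticket.
OBJECT_REPLACEMENT = "\ufffc"


def strip_object_replacement(title: str) -> str:
    """Return the title with every U+FFFC removed (hypothetical helper)."""
    return title.replace(OBJECT_REPLACEMENT, "")


dirty_title = "Hello\ufffc World"
clean_title = strip_object_replacement(dirty_title)

print(unicodedata.name(OBJECT_REPLACEMENT))  # OBJECT REPLACEMENT CHARACTER
print(clean_title)                           # Hello World
```

This only removes the one code point and leaves all other non-ASCII text alone, so a title like "Bücher" passes through unchanged.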
--
Ticket URL: <https://core.trac.wordpress.org/ticket/55117#comment:26>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform