[wp-trac] [WordPress Trac] #63140: Unicode Chars (Icons) in the URL are possible, but break WordPress
WordPress Trac
noreply at wordpress.org
Fri Mar 21 08:56:03 UTC 2025
#63140: Unicode Chars (Icons) in the URL are possible, but break WordPress
--------------------------+------------------------------
Reporter: Stefan M. | Owner: (none)
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: Permalinks | Version: 6.7.2
Severity: minor | Resolution:
Keywords: | Focuses:
--------------------------+------------------------------
Comment (by tusharaddweb):
Replying to [ticket:63140 Stefan M.]:
> My client used Unicode characters (icons) in the URL. WordPress doesn't
> seem to filter them.
>
> So they were saved. Immediately afterwards, the page didn't work anymore.
> Even back in draft, the page delivered a white page and not the page
> content. I did remove the icons, but the page was still broken.
>
> I needed to move the page content into a "new" page and save it to
> re-enable it. I added icons to the URL again and got the same issue.
>
> Why are Unicode icons not filtered from the URL? Can you please apply a
> filtering mechanism that allows only valid characters in the URL? Icons
> are not supposed to be in the URL, I think.
In WordPress, Unicode characters (including icons and emojis) are not
automatically filtered from URLs (post slugs) because:

1. WordPress allows Unicode in URLs for internationalization
   WordPress supports multilingual slugs to accommodate languages written
   in non-Latin scripts (e.g., Japanese, Arabic, Russian).
   Readable slugs in the site's own script matter for SEO and
   accessibility in those languages.

2. No built-in restriction on special Unicode characters
   WordPress sanitizes slugs using sanitize_title(), but that function
   removes only certain special characters and percent-encodes the
   remaining non-ASCII characters; it does not strip Unicode symbols in
   general.
   Icons and emojis therefore pass through WordPress's default filtering
   rules.

3. Some Unicode characters can break URLs
   Certain Unicode characters (like icons or control characters) may
   cause issues with browsers, servers, or plugins.
   If a theme or plugin doesn't properly handle encoded URLs, it could
   result in broken pages or white screens (as you experienced).
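
To check whether an existing slug contains such characters, something like the following sketch may help. It is plain PHP with no WordPress dependency; find_symbol_chars() is a hypothetical helper name, not a core function, and the character classes are an assumption about what counts as "valid" in a slug.

```php
<?php
// Hypothetical helper (not part of WordPress): list the characters in a
// slug that are neither letters, numbers, combining marks, dashes, nor
// underscores.
function find_symbol_chars( string $slug ): array {
    // Stored slugs are percent-encoded, so decode them back to UTF-8
    // before matching against Unicode properties.
    $decoded = rawurldecode( $slug );

    // preg_match_all() returns false on invalid UTF-8 when /u is used.
    if ( false === preg_match_all( '/[^\p{L}\p{N}\p{M}_-]/u', $decoded, $matches ) ) {
        return array();
    }

    // Report each offending character once.
    return array_values( array_unique( $matches[0] ) );
}
```

An empty array means the slug contains only characters from the allowed classes; anything else is a candidate for removal.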
Solution: Apply a custom filter
You can restrict unwanted Unicode characters in slugs by adding this
custom function to your theme's functions.php:
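function filter_unicode_from_slug( $title ) {
    // Keep letters, numbers, combining marks, whitespace, dashes, and
    // underscores; strip everything else (emoji, icons, other symbols).
    $filtered = preg_replace( '/[^\p{L}\p{N}\p{M}\s_-]+/u', '', $title );
    // preg_replace() returns null on invalid UTF-8; keep the input then.
    return null === $filtered ? $title : $filtered;
}
// Priority 9 runs before core's sanitize_title_with_dashes (priority 10),
// which percent-encodes the slug. Do not call sanitize_title() inside the
// callback: it applies this same filter and would recurse endlessly.
add_filter( 'sanitize_title', 'filter_unicode_from_slug', 9 );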
This ensures that only valid letters, numbers, dashes, and underscores
remain in URLs. You can adjust the regex pattern to allow or disallow
specific characters as needed.
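
As one way of adjusting the pattern, the sketch below makes the allowed character classes a parameter, so the same routine can be narrowed to particular scripts. It is standalone PHP for experimenting outside WordPress; strip_symbols() and its $allowed parameter are hypothetical names introduced here for illustration.

```php
<?php
// Hypothetical helper for trying out patterns; not a WordPress function.
// $allowed is inserted into a regex character class, so pass only trusted,
// hard-coded values such as '\p{L}\p{N}\p{M}' or '\p{Latin}\p{N}'.
function strip_symbols( string $text, string $allowed = '\p{L}\p{N}\p{M}' ): string {
    // Whitespace, underscores, and dashes are always kept so that later
    // slug processing can still turn spaces into dashes.
    $result = preg_replace( '/[^' . $allowed . '\s_-]+/u', '', $text );

    // preg_replace() returns null on invalid UTF-8; fall back to the input.
    return null === $result ? $text : $result;
}

// Default: keep letters and numbers from any script, drop icons.
echo strip_symbols( 'Hello 🚀 World' ), "\n";

// Narrower: keep only Latin letters and numbers.
echo strip_symbols( 'Tokyo 東京 2025', '\p{Latin}\p{N}' ), "\n";
```

Restricting to a single script is a trade-off: it removes icons but also removes legitimate text in other languages, so the default of all letters and numbers is probably the safer starting point for multilingual sites.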
--
Ticket URL: <https://core.trac.wordpress.org/ticket/63140#comment:2>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform