[wp-trac] [WordPress Trac] #63063: IDN domains are erroneously URL-encoded in the wp_sanitize_redirect() function
WordPress Trac
noreply at wordpress.org
Wed Mar 5 19:11:52 UTC 2025
#63063: IDN domains are erroneously URL-encoded in the wp_sanitize_redirect()
function
-----------------------------+-----------------------------
 Reporter:  calpeconsulting  |      Owner:  (none)
     Type:  defect (bug)     |     Status:  new
 Priority:  normal           |  Milestone:  Awaiting Review
Component:  Charset          |    Version:  6.7.2
 Severity:  minor            |   Keywords:  needs-patch
  Focuses:                   |
-----------------------------+-----------------------------
== Overview ==
There is an issue with how Internationalised Domain Names (IDNs) are
handled in the WordPress redirect system, specifically when an IDN is used
in the "WordPress Address" or "Site Address" settings. The problem occurs
when WordPress redirects a user back to the post after they submit a
comment. The domain part of the URL, which should remain in its IDN
format, is incorrectly percent-encoded (URL-encoded) by WordPress.
== WordPress and IDNs ==
WordPress fully supports IDN domains in the General site settings (under
"WordPress Address (URL)" and "Site Address (URL)"). These fields allow
users to set an IDN domain (such as simon.schönbeck.dk) for their website
without any issues.
The IDN domain should not undergo any transformation when used in URLs
within WordPress, as the domain is already properly handled and encoded
when set in the site's settings.
== Redirection Process ==
When a comment is posted, WordPress triggers a redirect to the comment's
location on the post. The redirect target is held in the $location
variable, which contains the full URL (including the site's IDN domain).
== The Problem ==
During this process, the function wp_sanitize_redirect() is called. This
function is responsible for sanitising and cleaning up the redirect URL.
=== Unexpected Behaviour ===
The wp_sanitize_redirect() function calls _wp_sanitize_utf8_in_redirect().
This function percent-encodes any non-ASCII UTF-8 characters in the URL,
including characters in the domain name (e.g., the ö in simon.schönbeck.dk
is encoded as %C3%B6).
This transformation should not occur for IDN domains. The ASCII form of an
IDN host name is Punycode (labels prefixed with xn--), not percent-encoded
UTF-8, so the percent-encoded host no longer matches the configured domain.
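
The effect described above can be reproduced with a minimal Python sketch.
It is not the WordPress implementation: percent_encode_non_ascii() is a
hypothetical stand-in that only mimics the described behaviour (every
non-ASCII character is replaced by its percent-encoded UTF-8 bytes), and
the path and fragment in the sample URL are made up for illustration.

```python
import re


def percent_encode_non_ascii(url):
    """Percent-encode every non-ASCII character as its UTF-8 bytes.

    Hypothetical helper: it treats the whole URL as one string, so the
    host name is encoded just like the path, which is the reported bug.
    """
    def encode(match):
        raw = match.group(0).encode("utf-8")
        return "".join("%{:02X}".format(byte) for byte in raw)

    return re.sub(r"[^\x00-\x7F]+", encode, url)


# Sample redirect target; the path and fragment are illustrative only.
location = "https://simon.schönbeck.dk/hello-world/#comment-1"
print(percent_encode_non_ascii(location))
# prints: https://simon.sch%C3%B6nbeck.dk/hello-world/#comment-1
```
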
== Effect ==
Because the sanitisation step percent-encodes the domain part of the URL
(for example, ö becomes %C3%B6), the host in the redirect URL no longer
matches the site's own host. wp_validate_redirect(), which compares the
redirect host against the allowed hosts, therefore rejects the URL.
Since the URL does not pass validation, the fallback URL is used instead.
By default, this fallback is the WordPress admin URL, so the user is
redirected to the admin dashboard rather than back to the post they came
from.
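
A simplified sketch of why the validation fails, written in Python for
illustration. validate_redirect() is a hypothetical model of an
allowed-host check, not the actual wp_validate_redirect() code, and the
URLs (including the wp-admin/ fallback) are assumed sample values.

```python
from urllib.parse import urlsplit


def validate_redirect(location, allowed_hosts, fallback_url):
    """Return location if its host is allowed, else the fallback URL.

    Hypothetical model: the host is compared as a plain string, so a
    percent-encoded host does not equal the configured IDN host.
    """
    host = urlsplit(location).hostname
    if host is None or host not in allowed_hosts:
        return fallback_url
    return location


allowed = {"simon.schönbeck.dk"}                # host from the site settings
fallback = "https://simon.schönbeck.dk/wp-admin/"  # assumed fallback URL

original = "https://simon.schönbeck.dk/hello-world/#comment-1"
encoded = "https://simon.sch%C3%B6nbeck.dk/hello-world/#comment-1"

print(validate_redirect(original, allowed, fallback) == original)  # True
print(validate_redirect(encoded, allowed, fallback) == fallback)   # True
```
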
This issue mainly affects guest commenters, who do not need to log in
before commenting. The comment itself is submitted successfully, but
because URL validation fails, WordPress redirects them to the admin panel
as the fallback URL. Since they are not logged in, they are then
redirected to the login page, even though no login is required to post a
comment.
== Solution ==
The IDN domain part of the URL should not be sanitised or URL-encoded for
UTF-8 characters, as it is already in a valid format. The sanitisation
process should respect the IDN format, preventing unnecessary
transformations that break the validation.
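
One possible shape of such a fix, sketched in Python rather than as a
WordPress patch. sanitize_redirect_keep_host() and its helper are
hypothetical names; the sketch only assumes that the URL can be split
into components and that the host component is passed through untouched
while the remaining components are still percent-encoded.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def _encode_non_ascii(text):
    """Percent-encode non-ASCII characters as UTF-8 bytes (helper)."""
    def encode(match):
        raw = match.group(0).encode("utf-8")
        return "".join("%{:02X}".format(byte) for byte in raw)

    return re.sub(r"[^\x00-\x7F]+", encode, text)


def sanitize_redirect_keep_host(url):
    """Encode path, query and fragment, but leave the host as it is."""
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme,
        parts.netloc,                       # IDN host is not transformed
        _encode_non_ascii(parts.path),
        _encode_non_ascii(parts.query),
        _encode_non_ascii(parts.fragment),
    ))


# Illustrative URL with non-ASCII characters in both host and path.
print(sanitize_redirect_keep_host("https://simon.schönbeck.dk/café/#comment-1"))
# prints: https://simon.schönbeck.dk/caf%C3%A9/#comment-1
```
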
== Workaround ==
Until a fix is implemented, a workaround is to enter the domain in its
Punycode (xn--) form in the "WordPress Address (URL)" and "Site Address
(URL)" settings. The host is then plain ASCII, so the sanitisation
process has nothing to encode in the domain part and the redirect URL
passes validation.
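
For reference, the Punycode form of a host can be computed with Python's
built-in "idna" codec, as in the sketch below. host_to_punycode() is a
hypothetical helper name; note that this codec follows the older IDNA
2003 rules, so results for some characters may differ from what a
registrar or browser using newer rules produces.

```python
def host_to_punycode(host):
    """Convert an IDN host name to its ASCII (Punycode) form."""
    return host.encode("idna").decode("ascii")


def host_from_punycode(host):
    """Convert an ASCII (Punycode) host name back to Unicode."""
    return host.encode("ascii").decode("idna")


print(host_to_punycode("simon.schönbeck.dk"))
# prints: simon.xn--schnbeck-p4a.dk
print(host_from_punycode("simon.xn--schnbeck-p4a.dk"))
# prints: simon.schönbeck.dk
```
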
--
Ticket URL: <https://core.trac.wordpress.org/ticket/63063>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform