[wp-trac] [WordPress Trac] #18945: bad url character encoding in Arabic post names and categories

WordPress Trac wp-trac at lists.automattic.com
Wed Nov 16 08:49:30 UTC 2011


#18945: bad url character encoding in Arabic post names and categories
-------------------------------+------------------------------
 Reporter:  walid3             |       Owner:
     Type:  defect (bug)       |      Status:  new
 Priority:  normal             |   Milestone:  Awaiting Review
Component:  General            |     Version:  3.3
 Severity:  normal             |  Resolution:
 Keywords:  reporter-feedback  |
-------------------------------+------------------------------

Comment (by SergeyBiryukov):

 Replying to [comment:3 walid3]:
 > well, the link have this strange signs almost everywhere

 Percent-encoding of non-ASCII characters (as UTF-8 octets) is specified in
 RFC 3986:

 > Non-ASCII characters must first be encoded according to UTF-8 [STD63],
 and then each octet of the corresponding UTF-8 sequence must be percent-
 encoded to be represented as URI characters.

 http://tools.ietf.org/html/rfc3986#page-21
 http://en.wikipedia.org/wiki/Percent-encoding#Current_standard

 It's the same for Cyrillic characters, for example. I don't think we can
 do anything here.
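
 To illustrate the encoding step quoted above, here is a minimal Python
 sketch (not WordPress code). The helper name `percent_encode_slug` and the
 sample words are assumptions chosen for illustration only; the sketch
 encodes the text as UTF-8 and percent-encodes each resulting octet.

```python
from urllib.parse import quote


def percent_encode_slug(slug: str) -> str:
    """Encode the slug as UTF-8, then percent-encode every octet that is
    not an unreserved URI character (hypothetical helper)."""
    # safe="" means reserved characters such as "/" are encoded as well.
    return quote(slug, safe="", encoding="utf-8")


# Sample slugs, chosen only for illustration.
arabic = percent_encode_slug("مرحبا")
cyrillic = percent_encode_slug("привет")

print(arabic)    # %D9%85%D8%B1%D8%AD%D8%A8%D8%A7
print(cyrillic)  # %D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82
```

 Each Arabic or Cyrillic letter here takes two UTF-8 octets, so every
 letter becomes two `%XX` groups, which is why the encoded links look so
 long.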

 That said, most browsers decode the URLs to display them in a human-
 readable form:

 Firefox 8.0, Chrome 15, Opera 11.52, Safari 5.1 show unencoded URLs.
 IE 8, IE 9 show encoded URLs.
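
 The decoding that browsers apply for display can be approximated with the
 following Python sketch. This is not how any particular browser implements
 it, and the `example.com` URL and the name `display_url` are assumptions
 for illustration; it only shows that the encoded and readable forms are
 the same address.

```python
from urllib.parse import unquote


def display_url(encoded_url: str) -> str:
    """Decode percent-encoded UTF-8 octets for human-readable display
    (hypothetical helper)."""
    return unquote(encoded_url, encoding="utf-8")


# Hypothetical permalink with a percent-encoded Arabic slug.
encoded = "http://example.com/%D9%85%D8%B1%D8%AD%D8%A8%D8%A7/"
readable = display_url(encoded)

print(readable)  # http://example.com/مرحبا/
```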

 See #16496 for making `$sample_permalink_html` human-readable.

 I've also checked comment feeds for posts with UTF-8 slugs, and they seem
 to work correctly.

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/18945#comment:4>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software