[wp-hackers] i18n URLs

Sam Angove sam at rephrase.net
Tue Oct 31 03:04:35 GMT 2006


On 10/31/06, Peter Westwood <peter.westwood at ftwr.co.uk> wrote:
>
> I thought true URI's in the RFC sense could only be 7bit

It's all about IRIs now (RFC 3987): <http://tools.ietf.org/html/rfc3987>


Ryan Boren wrote:
> Should we remove the octet-encoding when sanitizing slugs?  We'd still
> need to support old, encoded slugs, but new slugs can forego the
> %cf%de%... stuff.  Do all of the major browsers support unencoded UTF-8
> characters? Thoughts?  Code?

Is storing unencoded UTF-8 slugs going to create problems with MySQL?
(Especially if the goal is still to support 3.23?) I haven't
investigated; it's just a thought.

It's easier to percent-encode any UTF-8 query vars when they come in,
and keep everything as a URI in the database. Isn't that what's
already being done? The permalink functions can always decode them
back to UTF-8 for display. Support both, in other words.
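
A minimal sketch of that round trip, written in Python rather than
WordPress's PHP purely for illustration. The function names
slug_to_uri and slug_to_display are hypothetical and are not
WordPress APIs; the sketch assumes the incoming slug is valid UTF-8
text.

```python
from urllib.parse import quote, unquote


def slug_to_uri(slug: str) -> str:
    """Percent-encode a slug so the stored form is a plain ASCII URI segment.

    Hypothetical helper: quote() encodes the text as UTF-8 by default,
    and safe="" means "/" is escaped as well.
    """
    return quote(slug, safe="")


def slug_to_display(stored: str) -> str:
    """Decode a stored percent-encoded slug back to UTF-8 text for display.

    errors="strict" raises UnicodeDecodeError if the octets are not UTF-8,
    instead of silently substituting replacement characters.
    """
    return unquote(stored, encoding="utf-8", errors="strict")


stored = slug_to_uri("café")        # ASCII-only form for the database
shown = slug_to_display(stored)     # original text for the permalink
```

Decoding accepts lowercase hex digits as well, so old slugs stored in
the %c3%a9 style would still display correctly under this approach.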

There are bound to be problems with people using non-UTF-8 charsets,
though, since they'd presumably need the %encoded form. At the very
least, the browser support isn't 100% in that situation. [1]
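
To illustrate the charset problem, here is a small Python sketch
showing that the same character percent-encodes to different octets
depending on the charset. It assumes, for illustration only, that a
client on an ISO-8859-1 site encodes the slug in that charset; the
helper names are hypothetical.

```python
from urllib.parse import quote, unquote


def encode_in_charset(text: str, charset: str) -> str:
    """Percent-encode text using the octets of the given charset."""
    return quote(text, safe="", encoding=charset)


def decodes_as_utf8(encoded: str) -> bool:
    """Report whether a percent-encoded string holds valid UTF-8 octets."""
    try:
        unquote(encoded, encoding="utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


utf8_form = encode_in_charset("é", "utf-8")         # two octets
latin1_form = encode_in_charset("é", "iso-8859-1")  # one octet

# A decoder that assumes UTF-8 cannot read the ISO-8859-1 form.
utf8_ok = decodes_as_utf8(utf8_form)
latin1_ok = decodes_as_utf8(latin1_form)
```

So a stored %-encoded slug is only unambiguous if you also know which
charset produced the octets.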

[1]: My copy of Firefox (1.5.0.7) fails most of these tests:
<http://www.w3.org/2001/08/iri-test/>

