[wp-hackers] non-ascii characters at URL and parsing those chars at string level

Jason LeVan jason at codeclarified.com
Tue Sep 9 23:53:54 UTC 2014

urldecode() mixed with remove_accents() perhaps?
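
A minimal sketch of that two-step idea, written in Python rather than PHP so it runs without WordPress loaded. It assumes the scraped slug is percent-encoded UTF-8. The helper name slug_to_ascii is hypothetical, and Unicode NFKD decomposition stands in for remove_accents() here, so treat it as an approximation of that function rather than the same mapping.

```python
import unicodedata
from urllib.parse import unquote


def slug_to_ascii(slug: str) -> str:
    """Percent-decode a URL slug and strip accents (hypothetical helper)."""
    # Step 1: percent-decode, roughly what PHP's urldecode() does.
    # Note: unquote() leaves '+' alone; use unquote_plus() if '+' means space.
    decoded = unquote(slug, encoding="utf-8")

    # Step 2: split accented letters into base letter + combining mark,
    # then drop the combining marks (stand-in for remove_accents()).
    decomposed = unicodedata.normalize("NFKD", decoded)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))

    # Anything still outside ASCII has no decomposition (e.g. 'ø') and is
    # simply dropped in this sketch.
    ascii_only = stripped.encode("ascii", "ignore").decode("ascii")

    # Wikipedia-style slugs use '_' for spaces; title-case for display.
    return ascii_only.replace("_", " ").title()


print(slug_to_ascii("slobodan_milo%c5%a1evi%c4%87"))  # Slobodan Milosevic
```

The title-casing at the end is only there to match the example in the question; for names with apostrophes or mixed case it is probably better to skip it and keep the decoded casing.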



Jason LeVan

Email: jason at codeclarified.com

Twitter: @codeclarified

On Tue, Sep 9, 2014 at 6:03 PM, Haluk Karamete <halukkaramete at gmail.com> wrote:

> First off, I need to show you which non-ASCII characters I'm talking about.
> For instance, just type in 'Slobodan Milosevic' in Google Search and go to
> the first suggested wikipedia link.
> You will see that the URL contains very unusual characters that are well
> beyond the common ASCII set. I'm simply curious whether WordPress supports that.
> Though this is not a feature I particularly like (to say the least), I do
> confess that I find it quite interesting from an HTTP point of view.
> But my real question (or pain, to put it better) is this.
> Say you are scraping that data and you come across that title with those
> funny characters... and you want to create a tag out of that.
> Is there a conversion function that I can pass that string to and get back
> an ASCII-only version (every character below code point 128)?
> So I pass in 'slobodan_milo%c5%a1evi%c4%87', and I get back the good old
> 'Slobodan Milosevic'
> Does such a function exist? Or how do you deal with that situation?
> _______________________________________________
> wp-hackers mailing list
> wp-hackers at lists.automattic.com
> http://lists.automattic.com/mailman/listinfo/wp-hackers
