[wp-trac] [WordPress Trac] #12956: © and ™ not stripped from sanitize_title

WordPress Trac wp-trac at lists.automattic.com
Sat Apr 10 14:43:13 UTC 2010


#12956: © and ™ not stripped from sanitize_title
--------------------------+-------------------------------------------------
 Reporter:  thomask       |       Owner:  ryan
     Type:  defect (bug)  |      Status:  new 
 Priority:  normal        |   Milestone:  3.0 
Component:  Permalinks    |     Version:  3.0 
 Severity:  normal        |    Keywords:      
--------------------------+-------------------------------------------------
 The © and ™ symbols, and probably many others, are not stripped out
 when a title is sanitized.

 quick workaround:

 {{{
 function mc_sanitize_title( $title ) {
         return str_replace(array("%e2%84%a2", "%c2%a9"), "", $title);
 }
 add_filter('sanitize_title', 'mc_sanitize_title');
 }}}
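
 The two strings in the workaround are the lowercased, percent-encoded
 UTF-8 bytes of ™ and ©. A minimal check, with the characters spelled as
 byte escapes so that the file encoding does not matter:

```php
<?php
// Raw UTF-8 bytes of the two symbols from the ticket.
$trademark = "\xE2\x84\xA2"; // U+2122 TRADE MARK SIGN
$copyright = "\xC2\xA9";     // U+00A9 COPYRIGHT SIGN

// rawurlencode() emits uppercase hex digits; lowercase them to get the
// form that the workaround searches for in the sanitized title.
$encoded_trademark = strtolower(rawurlencode($trademark));
$encoded_copyright = strtolower(rawurlencode($copyright));

echo $encoded_trademark, "\n"; // %e2%84%a2
echo $encoded_copyright, "\n"; // %c2%a9
```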

 But I guess there should be a more robust and standard solution,
 e.g. transliterating with PHP and then removing anything but
 alphanumerics, hyphens and underscores, as in this function:


 {{{
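 function friendly_url($raw_title) {
     $url = $raw_title;
     // Collapse each run of characters that are not letters, digits or "_" into "-".
     $url = preg_replace('~[^\\pL0-9_]+~u', '-', $url);
     $url = trim($url, "-");
     // Transliterate to ASCII; the result depends on the current locale.
     $url = iconv("utf-8", "us-ascii//TRANSLIT", $url);
     $url = strtolower($url);
     // Drop whatever the transliteration left behind (quotes, "?", etc.).
     $url = preg_replace('~[^-a-z0-9_]+~', '', $url);
     return $url;
 }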
 }}}

 This should also create nice URLs even for non-Latin alphabets. (The
 locale must be set with setlocale() for the transliteration to work, but
 IMHO that already happens somewhere in WordPress core.)
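
 To make the locale dependency explicit, here is a rough sketch of the
 same steps with the locale set and restored around the iconv() call.
 The function name friendly_url_checked and the default locale
 en_US.UTF-8 are assumptions for illustration only; neither comes from
 WordPress, and the locale may not be installed on a given server. The
 fallback when iconv() fails is also an addition.

```php
<?php
// Hypothetical helper (not a WordPress function): the friendly_url() steps
// with an explicit LC_CTYPE locale and a fallback when iconv() fails.
function friendly_url_checked($raw_title, $locale = 'en_US.UTF-8') {
    // Collapse runs of anything that is not a letter, digit or "_" into "-".
    $url = preg_replace('~[^\pL0-9_]+~u', '-', $raw_title);
    if ($url === null) {
        $url = ''; // preg_replace() returns null on invalid UTF-8 input
    }
    $url = trim($url, '-');

    // Remember the current LC_CTYPE ("0" only queries it), then switch.
    // If the requested locale is missing, setlocale() returns false and
    // the current locale simply stays in effect.
    $previous = setlocale(LC_CTYPE, '0');
    setlocale(LC_CTYPE, $locale);

    // Transliterate to ASCII; "@" hides the notice some iconv builds raise.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $url);

    if ($previous !== false) {
        setlocale(LC_CTYPE, $previous); // restore the caller's locale
    }
    if ($ascii === false) {
        $ascii = $url; // fall back; the last step strips non-ASCII bytes
    }

    // Lowercase and drop everything except "-", a-z, 0-9 and "_".
    return preg_replace('~[^-a-z0-9_]+~', '', strtolower($ascii));
}

echo friendly_url_checked("Hello World \xE2\x84\xA2 2010"), "\n"; // hello-world-2010
```

 How accented or non-Latin letters come out still depends on the iconv
 implementation and the locale, so no output is shown for those here.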

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/12956>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software

