[wp-trac] [WordPress Trac] #10543: Incorrect (non-UTF-8) character handling in tag's name and slug

WordPress Trac wp-trac at lists.automattic.com
Tue Nov 17 20:38:39 UTC 2009


#10543: Incorrect (non-UTF-8) character handling in tag's name and slug
--------------------------+-------------------------------------------------
 Reporter:  sirzooro      |       Owner:  filosofo               
     Type:  defect (bug)  |      Status:  new                    
 Priority:  normal        |   Milestone:  2.9                    
Component:  Taxonomy      |     Version:  2.8.2                  
 Severity:  normal        |    Keywords:  has-patch needs-testing
--------------------------+-------------------------------------------------

Comment(by miqrogroove):

 sirzooro, my patch will truncate tags consistently, so there is less
 confusion.  I don't think WordPress has a function to "drop invalid
 characters, instead of truncation".  What you are seeing in the slug is
 that WP tries to remove "accents" from certain characters for SEO reasons,
 uses a special function to URL-encode the UTF-8 characters, and then
 deletes any bytes that are left over.  That cannot be done for tag names,
 because then everyone would have spelling errors in their tags.  Once this
 patch is applied, your Polish software with mixed encodings will probably
 not work with tags until it is upgraded too.
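
 To illustrate the difference described above, here is a rough sketch in
 Python (not WordPress's PHP, and not the code from the patch).  All
 function names are hypothetical, and the accent removal relies on Unicode
 decomposition, which is only an approximation of what WP does for slugs.

```python
import unicodedata
from urllib.parse import quote


def truncate_at_invalid(raw: bytes) -> str:
    """Tag-name handling (assumed): keep the valid UTF-8 prefix only."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # Everything before err.start decoded cleanly, so cut there.
        return raw[:err.start].decode("utf-8")


def drop_invalid(raw: bytes) -> str:
    """Slug-style handling (assumed): silently discard undecodable bytes."""
    return raw.decode("utf-8", errors="ignore")


def strip_accents(text: str) -> str:
    """Remove combining marks after decomposition (approximation only)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def make_slug(raw: bytes) -> str:
    """Drop bad bytes, strip accents, then percent-encode what is left."""
    text = strip_accents(drop_invalid(raw)).lower()
    text = "-".join(text.split())  # collapse whitespace into hyphens
    return quote(text, safe="-")   # non-ASCII characters become %XX escapes


# "café", a stray ISO-8859-2 byte (0xB3), then " tag"
mixed = b"caf\xc3\xa9 \xb3 tag"
print(repr(truncate_at_invalid(mixed)))  # 'café '
print(make_slug(mixed))                  # cafe-tag
```

 In this sketch the name is cut at the first bad byte, while the slug
 quietly loses that byte and keeps going.  Doing the same accent stripping
 on the name would change its spelling, which is the objection raised
 above.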

-- 
Ticket URL: <http://core.trac.wordpress.org/ticket/10543#comment:15>
WordPress Trac <http://core.trac.wordpress.org/>
WordPress blogging software

