[wp-trac] [WordPress Trac] #56530: Combining tilde passes `sanitize_title_with_dashes()` and so do most other diacritics

WordPress Trac noreply at wordpress.org
Thu Sep 8 01:14:04 UTC 2022


#56530: Combining tilde passes `sanitize_title_with_dashes()` and so do most other
diacritics
-------------------------------------------------+-------------------------
 Reporter:  anrghg                               |       Owner:  (none)
     Type:  defect (bug)                         |      Status:  new
 Priority:  normal                               |   Milestone:  Awaiting
                                                 |  Review
Component:  Formatting                           |     Version:
 Severity:  major                                |  Resolution:
 Keywords:  needs-dev-note needs-patch changes-  |     Focuses:
  requested                                      |
-------------------------------------------------+-------------------------

Comment (by anrghg):

 == Addenda

 `sanitize_title_with_dashes()` also removes some format controls, some
 quotation marks and other punctuation, some modifier letters, and some
 symbols, including the percent sign.

 I think that here too, the best approach is to do the full job by removing
 entire character classes **before** URL-encoding:
 {{{
 $title = preg_replace(
 '/[\p{Cf}\p{Ps}\p{Pe}\p{Pi}\p{Pf}\p{Po}\p{Sk}\p{So}\p{Lm}]/u', '', $title
 );
 }}}
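
 A minimal sketch of what that expression strips, wrapped in a helper whose
 name `strip_title_classes()` is hypothetical and not part of WordPress
 core. Dash punctuation (`\p{Pd}`) and connector punctuation (`\p{Pc}`) are
 not in the class, so hyphens and underscores survive:
 {{{#!php
 <?php
 // Hypothetical helper, for illustration only.
 function strip_title_classes( $title ) {
     $stripped = preg_replace(
         '/[\p{Cf}\p{Ps}\p{Pe}\p{Pi}\p{Pf}\p{Po}\p{Sk}\p{So}\p{Lm}]/u', '', $title
     );
     // With the /u flag, preg_replace() returns null on invalid UTF-8;
     // fall back to the unmodified input in that case.
     return null === $stripped ? $title : $stripped;
 }

 // "\u{...}" escapes need PHP 7.0 or later.
 echo strip_title_classes( "50% \u{00AB}off\u{00BB} (to-day)" ); // 50 off to-day
 }}}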

 So that removing the percent sign does not mangle percent-encoded
 sequences, URL-decoding first seems to be the way to go. Because
 `urldecode()` converts the plus sign to a space, a literal plus sign could
 be preserved:
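 {{{#!php
 <?php
 // Encode literal plus signs first so that urldecode() restores them
 // instead of converting them to spaces.
 $title = str_replace( '+', '%2B', $title );
 $title = urldecode( $title );
 }}}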
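
 Put together, the suggested order (decode first, then strip) might look
 like the sketch below. It assumes the goal is to keep literal plus signs,
 so they are percent-encoded before `urldecode()` runs; the helper name
 `decode_then_strip_title()` is hypothetical:
 {{{#!php
 <?php
 // Hypothetical helper, for illustration only; not part of WordPress core.
 function decode_then_strip_title( $title ) {
     // Protect literal plus signs from urldecode()'s plus-to-space rule.
     $title = str_replace( '+', '%2B', $title );
     $title = urldecode( $title );

     // After decoding, strip the character classes, percent sign included.
     $stripped = preg_replace(
         '/[\p{Cf}\p{Ps}\p{Pe}\p{Pi}\p{Pf}\p{Po}\p{Sk}\p{So}\p{Lm}]/u', '', $title
     );
     // preg_replace() returns null if the decoded bytes are not valid UTF-8.
     return null === $stripped ? $title : $stripped;
 }

 echo decode_then_strip_title( '100%25+caf%C3%A9' ); // 100+café
 }}}

 The plus sign is a math symbol (`\p{Sm}`), which the class list does not
 include, so it passes through the stripping step.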

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/56530#comment:1>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

