[wp-trac] [WordPress Trac] #56531: Aiming to “kill” entities, `sanitize_title_with_dashes()` happens to eat content

WordPress Trac noreply at wordpress.org
Thu Sep 8 23:17:53 UTC 2022


#56531: Aiming to “kill” entities, `sanitize_title_with_dashes()` happens to eat
content
--------------------------+------------------------------
 Reporter:  anrghg        |       Owner:  (none)
     Type:  defect (bug)  |      Status:  new
 Priority:  normal        |   Milestone:  Awaiting Review
Component:  Formatting    |     Version:
 Severity:  major         |  Resolution:
 Keywords:                |     Focuses:
--------------------------+------------------------------

Comment (by anrghg):

 As I don’t have the resources to submit patches, here is what the
 suggested rewrite could look like. It also converts the punctuation
 apostrophe to a hyphen, and the okina and the letter apostrophe to an
 underscore. That is an important enhancement, but it would require a
 separate ticket and a dev note:
 {{{#!php
 <?php
 function sanitize_title_with_dashes( $title, $raw_title = '', $context = 'display' ) {
         $title = strip_tags( $title );
         // Maintains plus sign before calling `urldecode()`.
         $title = str_replace( '%2B', '+', $title );
         // URL-decodes to avoid screwing up percent sign removal.
         $title = urldecode( $title );
         // Removes percent signs.
         $title = str_replace( '%', '', $title );
         // Decodes HTML entities.
         $title = html_entity_decode( $title );
         // Reencodes <, >, &.
         $title = htmlspecialchars( $title, ENT_NOQUOTES );
         // Converts to lowercase.
         if ( seems_utf8( $title ) && function_exists( 'mb_strtolower' ) ) {
                 $title = mb_strtolower( $title, 'UTF-8' );
         }
         $title = strtolower( $title );

         if ( 'save' === $context ) {

                 // Converts okina, letter apostrophe to underscore.
                 $title = str_replace( array( 'ʻ', 'ʼ' ), '_', $title );
                 // Converts punctuation apostrophe to hyphen-minus.
                 $title = str_replace( array( '’', '\'' ), '-', $title );
                 // Converts spaces and dashes to hyphen-minus.
                 $title = preg_replace( '/[\p{Zs}\p{Zl}\p{Zp}\x{2010}-\x{2015}\x{2212}]/u', '-', $title );
                 // Converts &, @, /, * and dots to hyphen-minus.
                 $title = str_replace( array( '&', '@', '/', '*', '·', '‧' ), '-', $title );
                 // Converts times to 'x'.
                 $title = str_replace( '×', 'x', $title );
                 // Removes format controls, punctuation, symbols and
                 // modifier letters entirely.
                 $title = preg_replace( '/[\p{Cf}\p{Ps}\p{Pe}\p{Pi}\p{Pf}\p{Po}\p{Sk}\p{So}\p{Lm}]/u', '', $title );

         }

         // Converts period to hyphen-minus.
         $title = str_replace( '.', '-', $title );

         // Collapses and trims hyphen-minus.
         $title = preg_replace( '/-+/', '-', $title );
         $title = trim( $title, '-' );

         // Percent-encodes non-ASCII.
         if ( seems_utf8( $title ) ) {
                 $title = utf8_uri_encode( $title, 200 );
         }

         // Deletes unsafe ASCII. (No more space.)
         $title = preg_replace( '/[^%a-z0-9_-]/', '', $title );

         return $title;
 }
 }}}
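
 The apostrophe, okina and dash handling of the save branch can be tried
 outside WordPress. The following is only a minimal sketch, not part of
 the proposal: `demo_map_title_punctuation()` is a hypothetical helper
 name, not a WordPress function. It covers just those steps plus the
 hyphen collapsing, and assumes UTF-8 input that is already lowercase:
 {{{#!php
 <?php
 // Hypothetical helper for illustration only; not a WordPress function.
 function demo_map_title_punctuation( $title ) {
         // Converts okina (U+02BB), letter apostrophe (U+02BC) to underscore.
         $title = str_replace( array( 'ʻ', 'ʼ' ), '_', $title );
         // Converts punctuation apostrophe (U+2019, U+0027) to hyphen-minus.
         $title = str_replace( array( '’', '\'' ), '-', $title );
         // Converts spaces and dashes to hyphen-minus.
         $title = preg_replace( '/[\p{Zs}\p{Zl}\p{Zp}\x{2010}-\x{2015}\x{2212}]/u', '-', $title );
         // Collapses and trims hyphen-minus.
         $title = preg_replace( '/-+/', '-', $title );
         return trim( $title, '-' );
 }

 echo demo_map_title_punctuation( 'hawaiʻi – big island' ), "\n"; // hawai_i-big-island
 echo demo_map_title_punctuation( 'l’été' ), "\n";                // l-été
 echo demo_map_title_punctuation( "rock 'n' roll" ), "\n";        // rock-n-roll
 }}}
 The non-ASCII letters that remain (as in `l-été`) would still be
 percent-encoded by the later `utf8_uri_encode()` step of the full
 function.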

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/56531#comment:5>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

