[wp-trac] [WordPress Trac] #57207: Consider adding the Unicode regex flag in wp_check_comment_disallowed_list

WordPress Trac noreply at wordpress.org
Mon Nov 28 16:47:35 UTC 2022


#57207: Consider adding the Unicode regex flag in wp_check_comment_disallowed_list
-------------------------+------------------------------
 Reporter:  bonjour52    |       Owner:  (none)
     Type:  enhancement  |      Status:  new
 Priority:  normal       |   Milestone:  Awaiting Review
Component:  Comments     |     Version:
 Severity:  normal       |  Resolution:
 Keywords:               |     Focuses:
-------------------------+------------------------------

Comment (by bonjour52):

 I found official documentation on what the Unicode regex flag ("u") does
 for case-insensitive matching. It is not in the PHP documentation, which is
 extremely brief on the subject of regex flags (it calls them "PCRE
 modifiers"), but in the "Perl-compatible Regular Expressions (PCRE)"
 documentation:


 {{{
 https://pcre.org/pcre.txt
 }}}


 Here is what this documentation says:

 **"If you want to use caseless matching for characters 128 and above, you
 must ensure that PCRE is compiled with Unicode property support as well as
 with UTF-8 support."**

 and also:

 **"Case-insensitive matching applies only to characters whose values are
 less than 128, unless PCRE is built with Unicode property support."**

 This means that the line:


 {{{
 $pattern = "#$word#i";
 }}}


 matches case-insensitively only for ASCII characters (characters whose
 values are less than 128), while:


 {{{
 $pattern = "#$word#iu";
 }}}


 matches case-insensitively for non-ASCII characters as well, provided PCRE
 is built with Unicode property support.
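
 To illustrate the difference, here is a minimal sketch. The word and the
 comment text are made up for the example, the file is assumed to be saved
 as UTF-8, and PHP's PCRE is assumed to be built with Unicode support (the
 bundled one normally is):


 {{{
 <?php
 // Hypothetical disallowed word and comment text, for illustration only.
 $word    = preg_quote( 'écran', '#' );
 $comment = 'Un nouvel ÉCRAN';

 // Without "u", PCRE compares byte by byte. "é" and "É" are two-byte
 // UTF-8 sequences, so their bytes are never treated as a case pair.
 $ascii_only = preg_match( "#$word#i", $comment );

 // With "u", pattern and subject are treated as UTF-8, so "é" and "É"
 // are compared as characters and case-folded.
 $unicode = preg_match( "#$word#iu", $comment );

 var_dump( $ascii_only ); // expected: int(0)
 var_dump( $unicode );    // expected: int(1)
 }}}


 One thing to keep in mind: with "u", preg_match() returns false instead of
 0 when the subject is not valid UTF-8, so the calling code may need to
 handle that case.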

-- 
Ticket URL: <https://core.trac.wordpress.org/ticket/57207#comment:1>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

