[wp-trac] [WordPress Trac] #29717: wp_check_invalid_utf8 - pcre tricks and failsafes, +mb_convert_encoding, iconv fix, performance

WordPress Trac noreply at wordpress.org
Sun Sep 21 19:37:30 UTC 2014


#29717: wp_check_invalid_utf8 - pcre tricks and failsafes, +mb_convert_encoding,
iconv fix, performance
--------------------------------+------------------------------------------
 Reporter:  askapache           |       Owner:
     Type:  enhancement         |      Status:  new
 Priority:  normal              |   Milestone:  Awaiting Review
Component:  Formatting          |     Version:  trunk
 Severity:  normal              |  Resolution:
 Keywords:  has-patch dev-      |     Focuses:  administration, performance
  feedback                      |
--------------------------------+------------------------------------------

Comment (by askapache):

 Replying to [comment:6 kitchin]:
 > Cool stuff. Comments:
 >
 > (1) I still think the old blog_charset check is clearest. No need to
 confuse people into having to look up obscure docs. Old code:
 > {{{
 > in_array( get_option( 'blog_charset' ), array( 'utf8', 'utf-8', 'UTF8',
 'UTF-8' ) )
 > }}}
 > vs. your new code
 > {{{
 > stripos( $is_utf8, 'utf' ) !== false && strpos( $is_utf8, '8' ) !==
 false
 > }}}
 >
 > (2) The WP code base never checks the result of ini_set() or @ini_set()
 but in this case it seems wise to do so. Hosts can disallow it. Most
 robust way is probably:
 > {{{
 > static $mb_convert;
 > if ( function_exists( 'mb_convert_encoding' ) ) {
 >   @ini_set( 'mbstring.substitute_character', 'none' );
 >   $mb_convert = @ini_get( 'mbstring.substitute_character' ) === 'none';
 > }
 > }}}
 >
 > I don't imagine anybody is worried about changing that ini value with
 restoring it, but it should probably be noted in the inline doc as a side
 effect.
 >
 > As for WP coding standards nits, WP wants braces on all clauses (if ...
 {}). Also, no parentheses around function_exists() at line 775.


 As the blog_charset check does only run once, I agree that the old code is
 better.  I also added your suggestion to verify that the ini value is
 correctly set to 'none' as part of the requirement for using
 mb_convert_encoding if iconv is unavailable.

 I also went ahead and added braces, and removed the parentheses from the
 function_exists statement, nice one.

--
Ticket URL: <https://core.trac.wordpress.org/ticket/29717#comment:7>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform


More information about the wp-trac mailing list