[wp-trac] [WordPress Trac] #27733: wpautop(): \s in regex destroys some UTF-8 characters

WordPress Trac noreply at wordpress.org
Wed Jun 24 23:24:16 UTC 2015


#27733: wpautop(): \s in regex destroys some UTF-8 characters
-------------------------------------------------+------------------------------
 Reporter:  tenpura                              |       Owner:
     Type:  defect (bug)                         |      Status:  new
 Priority:  normal                               |   Milestone:  Future Release
Component:  Formatting                           |     Version:  0.71
 Severity:  major                                |  Resolution:
 Keywords:  4.0-early needs-patch                |     Focuses:
  needs-unit-tests wpautop                       |
-------------------------------------------------+------------------------------
Changes (by pavelxk):

 * severity:  normal => major


Comment:

 This happened to me on 4.2.2 with the character Š (U+0160). The issue
 prevents further editing of the affected content or page, because the
 editor does not load any content that contains invalid characters: it
 shows an empty editor window without any error message. That makes it a
 major issue for me. It is also difficult to troubleshoot because it is
 locale-specific.

 I would suggest fixing this by adding the explicit UTF-8 pattern modifier
 (u) when the content is UTF-8.

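 {{{
 // Valid UTF-8: the u modifier makes PCRE match whole characters.
 if (mb_detect_encoding($pee, 'UTF-8', true) === 'UTF-8') {
   $pee = preg_replace('|(?<!<br />)\s*\n|u', "<br />\n", $pee);
 } else {
   // Any other encoding: keep the existing byte-wise pattern.
   $pee = preg_replace('|(?<!<br />)\s*\n|', "<br />\n", $pee);
 }
 }}}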
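
 A minimal sketch of why the modifier matters, assuming the failure comes
 from a locale in which \s also matches the single byte 0xA0. "Š" is encoded
 in UTF-8 as the bytes 0xC5 0xA0, so a byte-wise \s*\n can consume the second
 byte when the character sits directly before a newline. Because the real
 behavior depends on the server locale, the sketch adds \xA0 to the character
 class explicitly to emulate it. The helper name is_valid_utf8() and the
 sample string are illustrative only and are not part of wpautop().

```php
<?php
// Hypothetical helper: preg_match() with the u modifier fails (returns
// false) on a subject that is not valid UTF-8, and returns 1 for the
// empty pattern on a valid one.
function is_valid_utf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

// "Š" (U+0160) written as its two UTF-8 bytes, followed by a newline.
$pee = "\xC5\xA0\nnext line";

// Assumption: [\s\xA0] stands in for a locale where \s matches byte 0xA0.
// Without the u modifier the pattern works on bytes, so it swallows the
// trailing 0xA0 of "Š" together with the newline.
$broken = preg_replace('|(?<!<br />)[\s\xA0]*\n|', "<br />\n", $pee);

// With the u modifier \xA0 means the code point U+00A0, which cannot
// match half of "Š" (U+0160), so only the newline is replaced.
$intact = preg_replace('|(?<!<br />)[\s\xA0]*\n|u', "<br />\n", $pee);

var_dump(is_valid_utf8($broken)); // bool(false): a lone 0xC5 byte remains
var_dump(is_valid_utf8($intact)); // bool(true)
```

 Note that preg_replace() with the u modifier returns null when the subject
 is not valid UTF-8, which is why the proposal above checks the encoding
 before choosing the pattern.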

--
Ticket URL: <https://core.trac.wordpress.org/ticket/27733#comment:11>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform

