[wp-trac] [WordPress Trac] #64944: Generated Excerpts - Missing white space when stripping <br>s generated in paragraph block, verse block, etc.
WordPress Trac
noreply at wordpress.org
Wed Mar 25 23:22:00 UTC 2026
#64944: Generated Excerpts - Missing white space when stripping <br>s generated in
paragraph block, verse block, etc.
--------------------------------------+------------------------------
Reporter: addiestavlo | Owner: (none)
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: Formatting | Version:
Severity: normal | Resolution:
Keywords: has-patch has-unit-tests | Focuses:
--------------------------------------+------------------------------
Changes (by sabernhardt):
* version: 6.9 =>
* component: General => Formatting
Old description:
> When using the verse block (or similarly paragraph block and "shft +
> enter" spacing), `<br>`s are added in the block's content between lines.
> When WordPress generates an excerpt (when no custom excerpt is set),
> these `<br>`s are stripped along with other html tags. This often creates
> excerpts with missing spaces between words.
>
> Consider the common poetry formatting where multiple lines exist in a
> paragraph block to represent a stanza.
>
> On the code side of the editor this looks like:
> ```
> <!-- wp:verse -->
> <pre class="wp-block-verse">this is<br>a verse block<br>it has<br>the
> same issues</pre>
> <!-- /wp:verse -->
> ```
> or
> ```<!-- wp:paragraph -->
> <p>This is a poem<br>using shft+space<br>Inside a paragraph block<br>for
> good stanza formatting</p>
> <!-- /wp:paragraph -->```
>
> When WP generates an excerpt based off the post content this ends up as:
> "this isa verse blockit hasthe same issues"
> or
> "This is a poemusing shift+spaceInside a paragraph blockfor good stanza
> formatting."
>
> This shows up often in excerpts generated from content corresponding to
> poety, song lyrics, or other similar formats. When excerpts are used in
> any context (post previews, email subject descriptions, etc.) these
> missing white spaces obviously look horrible.
>
> ### To Reproduce (recently tested in WP Playground on 6.9):
> * Create a new post using a paragraph of verse block. For the verse
> block, standard "enter" to add new lines will repro the issue. For the
> paragraph block, "shft + enter" to create new lines within the block.
> * Do not create a custom excerpt.
> * Publish the post.
> * Run get_the_excerpt for the post.
> * Verify that there are no spaces between the last words of one line and
> first words of the next.
>
> ### How to fix?
> I am uncertain on the best approach to resolve some notes:
>
> `wp_trim_excerpt` - calls get_the_content when no excerpt text is passed
> to it. Later calls `wp_trim_words`
>
> `wp_trim_words` - calls `wp_strip_all_tags` and later creates a
> `$words_array` using preg_split on the "/[\n\r\t ]+/" pattern.
>
> `wp_strip_all_tags` - strips all the tags in a preg_replace. Later, if
> $remove_breaks is true, replaces '/[\r\n\t ]+/' patterns with spaces. In
> the current chain in this context $remove_breaks is false so this doesn't
> happen here, and the preg_split noted above in `wp_trim_words` will find
> these.
>
> One thought, if `wp_strip_all_tags` similarly considered `<br>`s in the
> $remove_breaks block AND moved this handling before the preg_replace that
> strips tags, that seems like potentially a general improvement. If the
> goal is to replace breaks with spaces, then `<br>`s should be considered
> there. However, we don't call $remove breaks in our context coming from
> `wp_trim_words` and it may not make sense to add that there.
>
> Another thought, would it make sense for `wp_trim_words` to replace
> `<br>`s with spaces before calling `wp_strip_all_tags` ? Those spaces
> would then be caught by the pattern in the preg_split creating the
> $words_array.
>
> I am attaching a diff for the latter. `<br>` tags are stripped without
> preserving spacing, causing words to concatenate (e.g., ‘thisexample’).
> This replaces `<br>` with a space before tag stripping to preserve word
> boundaries.
New description:
When using the Verse block (or, similarly, the Paragraph block with `shift +
enter` line breaks), `<br>`s are added in the block's content between lines.
When WordPress generates an excerpt (when no custom excerpt is set), these
`<br>`s are stripped along with other HTML tags. This often creates
excerpts with missing spaces between words.
Consider the common poetry formatting where multiple lines exist in a
paragraph block to represent a stanza.
On the code side of the editor this looks like:
{{{
<!-- wp:verse -->
<pre class="wp-block-verse">this is<br>a verse block<br>it has<br>the same
issues</pre>
<!-- /wp:verse -->
}}}
or
{{{
<!-- wp:paragraph -->
<p>This is a poem<br>using shift+space<br>Inside a paragraph block<br>for
good stanza formatting</p>
<!-- /wp:paragraph -->
}}}
When WP generates an excerpt based on the post content, this ends up as:
"this isa verse blockit hasthe same issues"
or
"This is a poemusing shift+spaceInside a paragraph blockfor good stanza
formatting"
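The concatenation can be reproduced outside WordPress. The following is a
minimal sketch that uses PHP's built-in `strip_tags` as a stand-in for the
tag-stripping step; the `$verse` and `$stripped` variables are only for
illustration and the block comments are omitted:

```php
<?php
// Verse block markup from the example above (block comments omitted).
$verse = '<pre class="wp-block-verse">this is<br>a verse block<br>it has<br>the same issues</pre>';

// strip_tags() removes each tag and leaves nothing in its place,
// so the words on either side of a <br> end up touching.
$stripped = strip_tags( $verse );

echo $stripped, PHP_EOL; // this isa verse blockit hasthe same issues
```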
This shows up often in excerpts generated from content corresponding to
poetry, song lyrics, or other similar formats. When excerpts are used in
any context (post previews, email subject descriptions, etc.) these
missing white spaces obviously look horrible.
=== To Reproduce (recently tested in WP Playground on 6.9):
* Create a new post using a Paragraph or Verse block. For the Verse block,
standard `enter` to add new lines will repro the issue. For the Paragraph
block, `shift + enter` to create new lines within the block.
* Do not create a custom excerpt.
* Publish the post.
* Run `get_the_excerpt` for the post.
* Verify that there are no spaces between the last word of one line and
the first word of the next.
=== How to fix?
I am uncertain of the best approach to resolve this. Some notes:
`wp_trim_excerpt` - calls `get_the_content` when no excerpt text is passed
to it. Later calls `wp_trim_words`.
`wp_trim_words` - calls `wp_strip_all_tags` and later creates a
`$words_array` using `preg_split` on the `"/[\n\r\t ]+/"` pattern.
`wp_strip_all_tags` - strips all the tags in a `preg_replace`. Later, if
`$remove_breaks` is true, it replaces `'/[\r\n\t ]+/'` matches with a single
space. In the current chain `$remove_breaks` is false, so this doesn't
happen here; the `preg_split` noted above in `wp_trim_words` splits on
that whitespace instead.
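The steps described above can be sketched in plain PHP. This is a
simplified stand-in, not the core implementation: `sketch_strip_all_tags`
and `sketch_words` are hypothetical names, and `strip_tags` is assumed here
for the tag removal. It suggests why neither the `$remove_breaks` branch nor
the `preg_split` helps: by the time they run, the `<br>` is gone and there
is no whitespace left to match.

```php
<?php
// Hypothetical stand-in for the tag-stripping step described above.
function sketch_strip_all_tags( string $text, bool $remove_breaks = false ): string {
    // Tags are removed with nothing left in their place.
    $text = strip_tags( $text );

    if ( $remove_breaks ) {
        // Collapses whitespace runs only; a former <br> left no whitespace.
        $text = preg_replace( '/[\r\n\t ]+/', ' ', $text );
    }

    return trim( $text );
}

// Hypothetical stand-in for the word-splitting step in the trimming function.
function sketch_words( string $text ): array {
    $text = sketch_strip_all_tags( $text );

    // Same pattern as quoted in the notes above.
    return preg_split( "/[\n\r\t ]+/", $text, -1, PREG_SPLIT_NO_EMPTY );
}

$paragraph = '<p>This is a poem<br>using shift+space</p>';
$words     = sketch_words( $paragraph );

print_r( $words );
```

In this sketch `poem` and `using` come back as the single word `poemusing`,
and passing `true` for `$remove_breaks` gives the same joined text.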
One thought, if `wp_strip_all_tags` similarly considered `<br>`s in the
`$remove_breaks` block AND moved this handling before the `preg_replace`
that strips tags, that seems like potentially a general improvement. If
the goal is to replace breaks with spaces, then `<br>`s should be
considered there. However, we don't pass `$remove_breaks` in our context
coming from `wp_trim_words` and it may not make sense to add that there.
Another thought: would it make sense for `wp_trim_words` to replace
`<br>`s with spaces before calling `wp_strip_all_tags`? Those spaces
would then be caught by the pattern in the `preg_split` creating the
`$words_array`.
I am attaching a diff for the latter. Currently, `<br>` tags are stripped
without preserving spacing, causing words to concatenate (e.g.,
‘thisexample’). The diff replaces each `<br>` with a space before tag
stripping to preserve word boundaries.
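As an illustration of the latter approach (not the attached diff itself),
here is a minimal sketch in plain PHP. The helper name `sketch_br_to_space`
and its regular expression are assumptions made for this example, and
`strip_tags` again stands in for the tag-stripping step:

```php
<?php
// Hypothetical helper, not the attached patch: turn each <br> variant into a space.
function sketch_br_to_space( string $text ): string {
    // Matches <br>, <br/>, <br /> and attribute-bearing forms, case-insensitively.
    // The \b keeps other tags that merely start with "br" from matching.
    return preg_replace( '/<br\b[^>]*>/i', ' ', $text );
}

$verse = '<pre class="wp-block-verse">this is<br>a verse block<br />it has<BR>the same issues</pre>';

// Replace breaks first, then strip the remaining tags.
$text = strip_tags( sketch_br_to_space( $verse ) );

// Split on whitespace with the same pattern quoted in the notes above.
$words = preg_split( "/[\n\r\t ]+/", $text, -1, PREG_SPLIT_NO_EMPTY );

echo implode( ' ', $words ), PHP_EOL; // this is a verse block it has the same issues
```

Because the split uses `[\n\r\t ]+`, a `<br>` that already has a space next
to it would not produce a double space in the rebuilt text.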
--
Ticket URL: <https://core.trac.wordpress.org/ticket/64944#comment:2>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform