[wp-trac] [WordPress Trac] #62940: wp_mail(): Address header parsing is not RFC-5322 complient and fails on quoted-string when including a "<", ">" or ", "
WordPress Trac
noreply at wordpress.org
Tue Feb 11 05:41:21 UTC 2025
#62940: wp_mail(): Address header parsing is not RFC-5322 complient and fails on
quoted-string when including a "<", ">" or ","
--------------------------+-----------------------------
Reporter: bhujagendra | Owner: (none)
Type: defect (bug) | Status: new
Priority: normal | Milestone: Awaiting Review
Component: Mail | Version: trunk
Severity: normal | Keywords:
Focuses: |
--------------------------+-----------------------------
== Problem
The following `From` field values result in an error, although both comply
with RFC 2822:
==== **Case 1: quoted printable:**
- Input: {{{From: "test<stage" <test at example.com>}}}
- Result:
- **$from_name**: `"test`
- **$from_email**: `stage" <test at example.com`
==== **Case 2: multiple from addresses:**
- Input: {{{From: test <test at example.com>, User <user at example.com>}}}
- Result:
- **$from_name**: `test`
- **$from_email**: `test at example.com, User <user at example.com`
==== **Case 3: similar problem for `to`, `cc`, `bcc`, and `reply_to`
fields! **
- Input: {{{To: test <test at example.com>, "Joe, Doe"
<joe.doe at example.com>}}}
- Result:
- **address 1**: `test <test at example.com>`
- **address 2**: `"Joe`
- **address 3**: `Doe" <joe.doe at example.com>`
== Reason
The [https://www.php.net/manual/en/function.mail.php php mail()
documentation] does not explicitly state that the `From` header needs to
comply with RFC 2822. But it does say so for the `To` header. Furthermore,
the current implementation does already support the `name-
addr`-[https://datatracker.ietf.org/doc/html/rfc2822#section-3.4 syntax]
(i.e. `[display-name] angle-addr`), which was not allowed in the preceding
[https://datatracker.ietf.org/doc/html/rfc822#section-4.4.1 RFC 822].
However, `wp_mail` does ignore the facts
- that `display-name` may include `<` (case !#1 above), and
- that the From field may actually include multiple addresses.
==== RFC 2822 Specification
- `From` field value is `mailbox-list`
[https://datatracker.ietf.org/doc/html/rfc2822#section-3.6.2 section
3.6.2]
- `mailbox-list` is a single `mailbox`, or two or more comma-separated
`mailbox, mailbox`
[https://datatracker.ietf.org/doc/html/rfc2822#section-3.4 section 3.4]
- `mailbox` is either just an `addr-spec` (e-mail address) or a `name-
addr` [https://datatracker.ietf.org/doc/html/rfc2822#section-3.4 section
3.4]
- `display-name` is a `phrase`
[https://datatracker.ietf.org/doc/html/rfc2822#section-3.4 section 3.4]
- `phrase` is a `word` and hence a `quoted-string`
[https://datatracker.ietf.org/doc/html/rfc2822#section-3.2.6 section
3.2.6]
- `quoted-string`
[https://datatracker.ietf.org/doc/html/rfc2822#section-3.2.5 (i.e. a
string in double quotes)] may include any US-ASCII prinatable character
except for the double quote (`"`) and the backslash (`\`). And as such
also the `<`.
==== Current Implementation
The parsing of the field simply splits the two parts of the `name-addr` at
the position of the first occurrence of `<`, see line 2 of the code in
question [https://github.com/WordPress/wordpress-
develop/blob/d71f29ff0899c6eb29bc037a7c521a966405cb35/src/wp-
includes/pluggable.php#L301-L319 (pluggable.php:301-319)]:
{{{#!php
<?php
case 'from':
$bracket_pos = strpos( $content, '<' );
if ( false !== $bracket_pos ) {
// Text before the bracketed email is the "From" name.
if ( $bracket_pos > 0 ) {
$from_name = substr( $content, 0, $bracket_pos );
$from_name = str_replace( '"', '', $from_name );
$from_name = trim( $from_name );
}
$from_email = substr( $content, $bracket_pos + 1 );
$from_email = str_replace( '>', '', $from_email );
$from_email = trim( $from_email );
// Avoid setting an empty $from_email.
} elseif ( '' !== trim( $content ) ) {
$from_email = trim( $content );
}
break;
}}}
Similarly, for the `to`, `cc`, `bcc`, and `reply_to` fields:
- [https://github.com/WordPress/wordpress-
develop/blob/d71f29ff0899c6eb29bc037a7c521a966405cb35/src/wp-
includes/pluggable.php#L336-L344 Splitting the field value at every comma]
(which may be quoted)
- [https://github.com/WordPress/wordpress-
develop/blob/d71f29ff0899c6eb29bc037a7c521a966405cb35/src/wp-
includes/pluggable.php#L445-L456 Ignoring possible quotes when extracting
name and address from the different values]
== Discussion
Unfortunately there is no simple solution:
- Correct parsing - according to the standard outlined in RFC 2822 - would
probably require the use of a real parser (as opposed to searching the
entire string for certain tokens).
- `PHPMailer::parseAddresses()` [https://github.com/WordPress/wordpress-
develop/blob/d71f29ff0899c6eb29bc037a7c521a966405cb35/src/wp-
includes/PHPMailer/PHPMailer.php#L1222 method] cannot be used, as it lacks
the same shortcoming, if the IMAP extension is not installed.
- [https://github.com/zbateson/mail-mime-parser/ mail-mime-parser] could
work as an alternative. But it's a whole other level of complexity and
only supports php v8+
- Even if correct parsing would be supported and `$from_name` as well as
`$from_email` correctly assigned, this would only work for
single-`mailbox` entries. Multiple from addresses are not supported by
- the the `wp_mail_from`-[https://github.com/WordPress/wordpress-
develop/blob/d71f29ff0899c6eb29bc037a7c521a966405cb35/src/wp-
includes/pluggable.php#L397 filter] and
`wp_mail_from_name`-[https://github.com/WordPress/wordpress-
develop/blob/d71f29ff0899c6eb29bc037a7c521a966405cb35/src/wp-
includes/pluggable.php#L406 filter]
- the `PHPMailer\PHPMailer\PHPMailer`'s `createHeader`
[https://github.com/WordPress/wordpress-
develop/blob/d71f29ff0899c6eb29bc037a7c521a966405cb35/src/wp-
includes/PHPMailer/PHPMailer.php#L2683 method]
=== Weak implementation
It must be noted, that the current implementation
[https://core.trac.wordpress.org/search?changeset=on&ticket=on&q=wp_mail+%3C&page=6&noquickjump=1
has repeatedly had bugs and regressions] (although it has been quiet
around this in the last years):
- #23243
- #19847
- #18463
- #17305
- #1593
=== Pragmatic proposal
- Introduce a new filter `wp_mail_from_raw` at the beginning of the `From`
field parsing:
{{{#!php
<?php
/**
* Filters the raw "From" header field before parsing it.
*
* @since 7.6.2
*
* @param string $content Form header content.
* @param string $header Full raw header.
*/
$content = apply_filters( 'wp_mail_from_raw', $content, $header );
}}}
This would allow a user (or plugin author) to mitigate the issue, if it
happens on a given site. But it would save them the effort of parsing the
headers entirely (as would be required with `wp_mail`).
- Alternatively, the filter could be called for all headers in
[https://github.com/WordPress/wordpress-develop/blob/trunk/src/wp-
includes/pluggable.php#L300 line 300]
{{{#!php
<?php
$origName = $name; // used later for the default case in line 347
$name = strtolower( $name );
/**
* Filters the raw header fields before parsing them.
*
* @since 7.6.2
*
* @param string $content Header field's content.
* @param string $name Header field's name.
* @param string $header Full raw header.
*/
$content = apply_filters( 'wp_mail_raw_header', $content, $name, $header
);
/**
* Filters the raw header fields before parsing them.
*
* @since 7.6.2
*
* @param string $content Header field's content.
* @param string $name Header field's name.
* @param string $header Full raw header.
*/
$content = apply_filters( "wp_mail_header_$name", $content, $header );
// ...
// Line 347 needs to be updated:
// $headers[ trim( $name ) ] = trim( $content );
$headers[ $origName ] = $content; // both values are already trimmed in
lines 296 & 297
}}}
- Potentially use a RFC-compliant parser to avoid future errors with
quoted address names at least in the other address fields
- Use an instance of an object implementing `ArrayAccess` and
`Stringable`. This can be used to store one or a set of addresses/names in
the `$from_email`/`wp_mail_from` and `$from_name`/`wp_mail_from_name`
values and is sent through the respective filters. By this it is basically
backwards compatible.
- Parse the `Sender` header, add a filter and default it to the first
address of a multi-address From field of the default wordpress from
address.
- Enhance the `PHPMailer::setFrom()` method to accept `$address` and
`$name` to be arrays. And therefore also the corresponting class fields
`$From` and `$FromName` (maybe through a PR to
[https://github.com/PHPMailer/PHPMailer/ PHPMailer])
Happy to help with a PR. I would just love to know what would be accepted
in there.
Thanks,
Bhujagendra
--
Ticket URL: <https://core.trac.wordpress.org/ticket/62940>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform
More information about the wp-trac
mailing list