php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #78871 mailparse_rfc822_parse_addresses incorrectly parsing a specific payload
Submitted: 2019-11-27 00:54 UTC Modified: 2020-01-26 13:58 UTC
From: ndavison85 at gmail dot com Assigned: remi (profile)
Status: Assigned Package: mailparse (PECL)
PHP Version: 7.3.12 OS: Ubuntu
Private report: No CVE-ID: None
Have you experienced this issue?
Rate the importance of this bug to you:

 [2019-11-27 00:54 UTC] ndavison85 at gmail dot com
Description:
------------
The PHP function mailparse_rfc822_parse_addresses is parsing the payload "<aaa@bbb.com>xxx@yyy.com" as a valid email address string, returning both "aaa@bbb.com" and "xxx@yyy.com" as valid addresses. I do not believe this is consistent with the RFC.

The reason I have flagged this as a security issue is because I believe this can lead to security flaws in PHP apps where by the app logic may expect a single address in the string (perhaps because the code is first stripping out commas and semi colons from the string), and then proceed to use the output of mailparse_rfc822_parse_addresses on the string to determine the email address, which may not be consistent with the way the backend email system handles the payload.

For example, I'm aware of one popular emailing service that will accept this payload and send the message to "aaa@bbb.com", but if the PHP code handles the output of mailparse_rfc822_parse_addresses and takes the last Array element from its return and uses that to determine which domain the email is being sent to, it will consider the email to be "xxx@yyy.com". In apps that might grant privileges or make access control decisions based on user's email domain, this could be a significant vulnerability.

My research across various languages and their standard ways to parse email addresses leaves mailparse_rfc822_parse_addresses as the only technique which returns a valid value (the rest either return null or raise an error when given this payload). This includes the PEAR Mail_RFC822::parseAddressList(), which returns a validation failure when given this payload.

Test script:
---------------
<?php

$addrs = mailparse_rfc822_parse_addresses("<aaa@bbb.com>xxx@yyy.com");

if ($addrs[0]['address'] === 'aaa@bbb.com') {
	echo "contains aaa@bbb.com\n";
}
if ($addrs[1]['address'] === 'xxx@yyy.com') {
	echo "contains xxx@yyy.com\n";
}

Expected result:
----------------
I would expect the return to either be empty, null, false etc, or return a single Array element with "aaa@bbb.com" as the valid address (although, I still don't believe the latter is consistent with the RFC).

Actual result:
--------------
An Array with two elements, containing both "aaa@bbb.com" and "xxx@yyy.com" as valid email addresses.

Patches

Add a Patch

Pull Requests

Add a Pull Request

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2019-11-27 00:59 UTC] ndavison85 at gmail dot com
For more detail on the research which lead me to find this issue, please see:

https://gist.github.com/ndavison/a495de6dce836dcb035f751f8e340599

This is a secret Gist of a blog post I intend to publish at some point. This Gist URL is also currently in the possession of AWS Security (in regards to the fact SES processes the payload as a valid email address) - please do not share this URL outside of the immediate PHP security group (I suspect I didn't need to say this, but just to be sure :)).
 [2019-11-27 11:13 UTC] cmb@php.net
-Package: Mail related +Package: mailparse
 [2019-11-27 11:13 UTC] cmb@php.net
<aaa@bbb.com> is interpreted as route-addr, a concept that is well
defined in RFC 822 (but already discouraged there, and dropped
from RFC 2822).  It is not quite valid as route-addr, because at
least one phrase would have to come before the "<", but since
mailparse's parsing is liberal, that's not a bug per se.

What is rather suprising to me, though, is that such route-addrs
are added as addresses without any further indication (e.g. by
having another boolean field "is_route_address" or such).
 [2019-11-27 22:32 UTC] ndavison85 at gmail dot com
Another potential security impact of how the payload is parsed by mailparse_rfc822_parse_addresses is what I touched on regarding the absence of commas - after confirming the string doesn't contain a comma, a developer may pass the returning Array directly to a mail API which accepts an Array for the 'to' addresses thinking there can only be one address, however the response from mailparse_rfc822_parse_addresses contains two when given this payload.

I may be missing something in the RFC(s) when making the assumption that no comma means it's safe to assume only one address is present.
 [2019-11-30 21:58 UTC] stas@php.net
Doesn't look like security issue. mailparse_rfc822_parse_addresses() is clearly documented as returning multiple addresses, its return type is array. Of course, for any bug, a scenario can be contrived where security function could depend on this bug not being present, but this alone is not enough to classify every bug as security issue.
 [2019-11-30 21:59 UTC] stas@php.net
-Assigned To: +Assigned To: remi
 [2019-11-30 22:29 UTC] ndavison85 at gmail dot com
I agree security scenarios can be contrived for basically any bug, but in this case I believe any instance where a developer sends a string into an email function they reasonably ensured is only 1 address, and it spits out 2, that could be particularly problematic for an app's security posture (as email is often linked to sensitive information or identity/auth(z) functionality).
 [2019-12-01 04:30 UTC] ndavison85 at gmail dot com
This is a bit unrelated to the original bug (similar threat, different payload), but I've also noticed that mailparse_rfc822_parse_addresses will terminate at a null byte (without error) which could allow an attacker to get an email on one domain but the rest of the PHP code may treat the string to be another domain. Snippet for this:

$ php -r 'print_r(mailparse_rfc822_parse_addresses("aaa@bbb.com\x00.ddd.com"));'
Array
(
    [0] => Array
        (
            [display] => aaa@bbb.com
            [address] => aaa@bbb.com
            [is_group] => 
        )

)

This wouldn't really be a problem if the output of mailparse_rfc822_parse_addresses is used for both determining the address for app logic and for the mail server, but if for some reason this was only used for one of those, then it could result in vulnerabilities.

Should I open a new bug for this?
 [2019-12-01 04:45 UTC] stas@php.net
That's definitely a separate issue, though I am not sure whether it's a bug... mailparse_rfc822_parse_addresses() does not promise you to validate strings as email addresses. So if you're using it in security-sensitive context as validator, you are probably doing it wrong. You should use whitelist filter instead. Probably something like FILTER_VALIDATE_EMAIL.
 [2020-01-26 02:12 UTC] ndavison85 at gmail dot com
Hi there,

I was looking to publish the writeup I linked to earlier which references this bug. I note that some comments here suggest this is not being treated as a security issue - does this indicate that you would be okay for me to publish the writeup?

I have confirmed from AWS that they are not treating the parsing of the <x@y.com>a@b.com payload as a valid address as a security issue so there is no need for this to remain private from my perspective.
 [2020-01-26 13:58 UTC] cmb@php.net
-Type: Security +Type: Bug
 
PHP Copyright © 2001-2020 The PHP Group
All rights reserved.
Last updated: Thu Feb 20 09:01:25 2020 UTC