Bug #78617 mb_decode_mimeheader does not follow RFC2047 correctly
Submitted: 2019-10-01 09:17 UTC
Modified: 2019-10-02 18:01 UTC
Votes: 1
Avg. Score: 3.0 ± 0.0
Reproduced: 0 of 0 (0.0%)
From: marcus at synchromedia dot co dot uk
Assigned:
Status: Verified
Package: mbstring related
PHP Version: 7.3.10
OS: any
Private report: No
CVE-ID: None
 [2019-10-01 09:17 UTC] marcus at synchromedia dot co dot uk
Description:
------------
[RFC2047 section 4.2](https://tools.ietf.org/html/rfc2047#section-4.2) describes the "Q" encoding, a way of representing 8-bit character sets in email headers; in PHP this is handled by the mbstring extension. In this encoding, a space can be written as either `=20` or `_`, the latter being preferable as it is more readable and uses fewer characters. A header encoded this way might look like this:

    X-My-Header: =?us-ascii?Q?hello_world?=

(This is a simplistic example: the header value does not actually *need* RFC2047 encoding, though encoding it is harmless.)

The mb_decode_mimeheader function does not decode this correctly, leaving the underscore undecoded. It does decode the alternative `=20` syntax correctly.

A workaround is to encode the `_` as `=20` prior to decoding, as in:

    mb_decode_mimeheader(str_replace('_', '=20', 'X-My-Header: =?us-ascii?Q?hello_world?='))

Note that this should not be applied blindly: the header may not be Q-encoded in the first place, and `str_replace` also rewrites any underscores that lie outside an encoded word.
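A more cautious form of the workaround could restrict the replacement to the text inside Q-encoded words. The following is only a minimal sketch: the helper name `q_underscores_to_eq20` is hypothetical (it is not part of mbstring), and the regular expression is a simplified approximation of the RFC2047 encoded-word syntax.

```php
<?php
// Hypothetical helper, not part of mbstring: replace "_" with "=20" only
// inside Q-encoded words (=?charset?Q?text?=), leaving the rest untouched.
function q_underscores_to_eq20(string $header): string
{
    $fixed = preg_replace_callback(
        // Simplified encoded-word pattern: charset, the letter Q (either case),
        // then encoded text containing no "?" and no whitespace.
        '/=\?([^?\s]+)\?([Qq])\?([^?\s]*)\?=/',
        static function (array $m): string {
            return '=?' . $m[1] . '?' . $m[2] . '?'
                . str_replace('_', '=20', $m[3]) . '?=';
        },
        $header
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $fixed ?? $header;
}
```

The result can then be passed on as `mb_decode_mimeheader(q_underscores_to_eq20($header))`. Underscores in the header name, in unencoded text, and in B-encoded words are left as they are.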

Test script:
---------------
echo mb_decode_mimeheader('X-My-Header: =?us-ascii?Q?hello_world?=');


Expected result:
----------------
X-My-Header: hello world

Actual result:
--------------
X-My-Header: hello_world


History

 [2019-10-01 09:33 UTC] cmb@php.net
-Status: Open
+Status: Verified
-Package: Strings related
+Package: mbstring related
 [2019-10-01 09:33 UTC] cmb@php.net
Confirmed: <https://3v4l.org/6TlQs>.
 [2019-10-02 16:52 UTC] cmb@php.net
The actual problem here is that MBString does not distinguish
between Quoted-Printable and Q encoding.
 [2019-10-02 18:01 UTC] marcus at synchromedia dot co dot uk
Seems a bit odd that it thinks it should use quoted-printable directly in a header at all. Q encoding is essentially a wrapper around quoted-printable - within the `=?charset?Q?...?=` container it's nearly the same, apart from details such as `_` standing for a space.
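The difference between the two encodings discussed above can be illustrated with a small sketch. The function name `q_decode_text` is hypothetical, and the sketch handles only the text portion of an encoded word: it ignores the charset and does not parse the `=?charset?Q?...?=` container.

```php
<?php
// Hypothetical helper: decode the text portion of a Q-encoded word.
// In Q encoding "_" stands for a space, so it is translated first; a literal
// underscore has to be written as "=5F" and is restored by the second step.
function q_decode_text(string $encodedText): string
{
    $withSpaces = str_replace('_', ' ', $encodedText);

    // The remaining "=XX" escapes follow the same rules as quoted-printable.
    return quoted_printable_decode($withSpaces);
}
```

Plain `quoted_printable_decode('hello_world')` leaves the underscore as it is, which matches the behaviour reported in this bug, whereas `q_decode_text('hello_world')` yields `hello world`.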
 