Bug #49272 Incorrect encoding in structured header field bodies
Submitted: 2009-08-16 08:36 UTC Modified: 2014-07-15 11:17 UTC
From: u235e at hotmail dot com Assigned: yohgaki (profile)
Status: Closed Package: mbstring related
PHP Version: 5.2.10 OS: *
Private report: No CVE-ID: None


 [2009-08-16 08:36 UTC] u235e at hotmail dot com
Function: mb_encode_mimeheader

When constructing a structured header field such as From or To, only _words_ within phrases, or ctext within comments, may be replaced by encoded-words; in particular, encoded-words must not appear inside "quoted strings" (RFC 2047, section 5).
This function does not take that into account. Worse, it can make the field invalid, because it greedily encodes everything starting at the first WSP-delimited token that contains non-ASCII characters.

As it stands, this function is only useful for unstructured header fields, which in most common cases means only the Subject field.
Judging from the manual and the example given there, I take it that this is not its only intended use.

The example below applies equally to (comments), not just "quoted strings". Technically, the quotes should probably not be part of the encoded-word (they are "lexically invisible"), whereas the parentheses of a comment must stay in place so that the text is still recognized as a comment.

As a side note: I don't understand why digits are encoded and why spaces are not encoded as underscores in the "Q" scheme. Doesn't that defeat the purpose of this scheme?
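
To illustrate the side note, here is a minimal sketch of what "Q" encoding of a phrase could look like under the restricted character set of RFC 2047 section 5 (3): letters, digits and "!", "*", "+", "-", "/" stay literal, a space becomes an underscore, and everything else becomes =XX. The helper name q_encode_phrase() is hypothetical and not part of mbstring, and the sketch does not split output that exceeds the 75-character limit for an encoded-word.

```php
<?php
// Hypothetical helper (not an mbstring function): Q-encode a run of bytes
// for use as an encoded-word inside a phrase, per RFC 2047 section 5 (3).
// Assumption: $bytes is already in the charset named by $charset.
function q_encode_phrase(string $bytes, string $charset): string
{
    $out = '';
    for ($i = 0, $n = strlen($bytes); $i < $n; $i++) {
        $ch = $bytes[$i];
        if ($ch === ' ') {
            // A space is represented by an underscore in the "Q" scheme.
            $out .= '_';
        } elseif (preg_match('/^[A-Za-z0-9!*+\/-]$/D', $ch)) {
            // Letters, digits and a few safe symbols stay readable.
            $out .= $ch;
        } else {
            // Everything else, including "=", "?", "_" and 8-bit bytes,
            // is written as "=" followed by two uppercase hex digits.
            $out .= sprintf('=%02X', ord($ch));
        }
    }
    return '=?' . $charset . '?Q?' . $out . '?=';
}

echo q_encode_phrase("Der M\xFCller", 'ISO-8859-1'), "\n";
// prints: =?ISO-8859-1?Q?Der_M=FCller?=
```

With this mapping, digits and ASCII letters remain legible in the raw header, which is the stated purpose of the "Q" scheme.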

Reproduce code:
$name = "Peter \"Der M\xFCller\""; // German: Peter "Der Müller"
// valid RFC 2822 display-name
$mbox = "peter.mueller";
$doma = "";
$addr = mb_encode_mimeheader($name, "ISO-8859-1", "Q") . " <" . $mbox . "@" . $doma . ">";
echo $addr;

Expected result:
should be:
Peter =?ISO-8859-1?Q?=22Der=20M=FCller=22?= <>
or very greedy:
=?ISO-8859-1?Q?Peter=20=22Der=20M=FCller=22?= <>
or maybe even with the quotes stripped:
Peter =?ISO-8859-1?Q?Der=20M=FCller?= <>

Actual result:
Peter "Der =?ISO-8859-1?Q?M=FCller=22?= <>

Which is not a valid RFC 2822 name-addr (display-name) any more!


 [2009-08-16 08:45 UTC] u235e at hotmail dot com
Disregard the parts about stripping quotes.
 [2009-08-16 08:53 UTC] u235e at hotmail dot com
Oh, and the display-name in the example is technically not valid because of the non-ASCII character, of course, but that's the point of this function, right? ;-)
 [2010-02-22 06:39 UTC]
mb_encode_mimeheader() is meant to encode individual words, or a header that is known to consist only of such words; it is not meant to encode an entire structured header. You need to parse the header yourself beforehand.
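
A minimal sketch of that approach, assuming the caller already has the display name and the addr-spec as separate values, so that only the phrase is ever encoded. The helper format_mailbox() and the address peter@example.org are hypothetical placeholders. The sketch uses a hand-built "B" encoded-word via base64_encode() rather than mb_encode_mimeheader(), and it does not split encoded-words longer than 75 characters.

```php
<?php
// Hypothetical helper (not part of mbstring): build "display-name <addr-spec>"
// and encode only the display name, never the address or the angle brackets.
// Assumption: $displayName carries no surrounding quotes and is already in $charset.
function format_mailbox(string $displayName, string $addrSpec, string $charset): string
{
    if (preg_match('/[^\x20-\x7E]/', $displayName)) {
        // Bytes outside printable ASCII: emit the whole phrase as one
        // encoded-word, outside of any quoted-string.
        $phrase = '=?' . $charset . '?B?' . base64_encode($displayName) . '?=';
    } else {
        // Pure printable ASCII: a quoted-string is enough; escape " and \.
        $phrase = '"' . addcslashes($displayName, "\"\\") . '"';
    }
    return $phrase . ' <' . $addrSpec . '>';
}

// Non-ASCII name: only the phrase becomes an encoded-word.
echo format_mailbox("Peter Der M\xFCller", 'peter@example.org', 'ISO-8859-1'), "\n";

// ASCII name with quotes: no encoded-word needed.
echo format_mailbox('Peter "Pete" Miller', 'peter@example.org', 'ISO-8859-1'), "\n";
// prints: "Peter \"Pete\" Miller" <peter@example.org>
```

Encoding the whole phrase as a single encoded-word sidesteps the question of which individual words need encoding, at the cost of making the raw header less readable.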
 [2014-07-15 11:17 UTC]
-Status: Open
+Status: Closed
-Assigned To:
+Assigned To: yohgaki
 [2014-07-15 11:17 UTC]
Cannot see the report text, probably due to malformed encoding data. Please file a new bug report if you still have this issue.