Bug #44570 mb_encode_mimeheader incorrectly folds iso-2022 text
Submitted: 2008-03-30 12:28 UTC Modified: 2008-07-31 16:24 UTC
From: tokul at users dot sourceforge dot net Assigned:
Status: Not a bug Package: mbstring related
PHP Version: 5.2CVS-2008-03-30 (snap) OS: Linux Debian Etch
Private report: No CVE-ID: None

 [2008-03-30 12:28 UTC] tokul at users dot sourceforge dot net
Description:
------------
mb_encode_mimeheader() incorrectly folds ISO-2022 text. RFC 2047, section 3, second paragraph, says that each encoded-word of a folded string must switch back to ASCII before it ends. The code just splits the ISO-2022 string without adding the appropriate escape sequences.

The code works as expected only when mbstring.internal_encoding is set to the correct value. This dependency of mb_encode_mimeheader() is not documented and catches users who are not used to the function's quirks.

Reproduce code:
---------------
/** See http://etext.lib.virginia.edu/japanese/hyakunin/frames/hyakuframes.html */
$string = "\xe7\xa7\x8b\xe3\x81\xae\xe7\x94\xb0\xe3\x81\xae\x20"
 ."\xe3\x81\x8b\xe3\x82\x8a\xe3\x81\xbb\xe3\x81\xae\xe5\xba\xb5"
 ."\xe3\x81\xae\x20\xe8\x8b\xab\xe3\x82\x92\xe3\x81\x82"
 ."\xe3\x82\x89\xe3\x81\xbf\x20\x20\xe3\x82\x8f\xe3\x81\x8c"
 ."\xe8\xa1\xa3\xe6\x89\x8b\xe3\x81\xaf\x20\xe9\x9c\xb2"
 ."\xe3\x81\xab\xe3\x81\xac\xe3\x82\x8c\xe3\x81\xa4\xe3\x81\xa4";

$string_iso2022 = mb_convert_encoding($string,'iso-2022-jp','utf-8');
echo mb_encode_mimeheader($string_iso2022,'iso-2022-jp','b');

Expected result:
----------------
=?ISO-2022-JP?B?GyRCPSkkTkVEJE4bKEIgGyRCJCskaiRbJE4wQyROGyhCIBskQkZRGyhC?=
=?ISO-2022-JP?B?GyRCJHIkIiRpJF8bKEIgIBskQiRvJCwwYTxqJE8bKEIgGyRCTyobKEI=?=
=?ISO-2022-JP?B?GyRCJEskTCRsJEQkRBsoQg==?=

Actual result:
--------------
=?ISO-2022-JP?B?GyRCPSkkTkVEJE4bKEIgGyRCJCskaiRbJE4wQyROGyhCIBskQkZRJHIk?=
 =?ISO-2022-JP?B?IiRpJF8bKEIgIBskQiRvJCwwYTxqJE8bKEIgGyRCTyokSyRMJGwkRCRE?=
 =?ISO-2022-JP?B?GyhC?=
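
The difference between the two results can be checked mechanically. The following is a minimal sketch, not part of mbstring: the helper name encoded_words_end_in_ascii() is hypothetical, and it only tests the RFC 2047 requirement quoted above, namely that the last escape sequence in each encoded-word is ESC ( B (the switch back to ASCII). It does not validate the text in any other way.

```php
<?php
// Hypothetical helper, not an mbstring function: returns true when every
// ISO-2022-JP "B" encoded-word in $header is back in ASCII mode at its end.
function encoded_words_end_in_ascii(string $header): bool
{
    // Capture the base64 payload of each =?ISO-2022-JP?B?...?= encoded-word.
    preg_match_all('/=\?ISO-2022-JP\?B\?([A-Za-z0-9+\/=]+)\?=/i', $header, $matches);
    foreach ($matches[1] as $payload) {
        $raw = base64_decode($payload, true);
        if ($raw === false) {
            return false; // malformed base64
        }
        $lastEscape = strrpos($raw, "\x1b");
        if ($lastEscape === false) {
            continue; // no escape sequences in this word, nothing to check
        }
        // ESC ( B selects ASCII; any other final escape leaves another mode active.
        if (substr($raw, $lastEscape, 3) !== "\x1b(B") {
            return false;
        }
    }
    return true;
}

// The two headers quoted in this report.
$expected = implode("\n", [
    '=?ISO-2022-JP?B?GyRCPSkkTkVEJE4bKEIgGyRCJCskaiRbJE4wQyROGyhCIBskQkZRGyhC?=',
    '=?ISO-2022-JP?B?GyRCJHIkIiRpJF8bKEIgIBskQiRvJCwwYTxqJE8bKEIgGyRCTyobKEI=?=',
    '=?ISO-2022-JP?B?GyRCJEskTCRsJEQkRBsoQg==?=',
]);
$actual = implode("\n ", [
    '=?ISO-2022-JP?B?GyRCPSkkTkVEJE4bKEIgGyRCJCskaiRbJE4wQyROGyhCIBskQkZRJHIk?=',
    '=?ISO-2022-JP?B?IiRpJF8bKEIgIBskQiRvJCwwYTxqJE8bKEIgGyRCTyokSyRMJGwkRCRE?=',
    '=?ISO-2022-JP?B?GyhC?=',
]);

var_dump(encoded_words_end_in_ascii($expected)); // bool(true)
var_dump(encoded_words_end_in_ascii($actual));   // bool(false)
```

The first encoded-word of the actual result ends while the two-byte JIS X 0208 set is still selected, which is what the check reports.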

 [2008-03-30 12:45 UTC] tokul at users dot sourceforge dot net
If mbstring.internal_encoding is set, the code does not need the mb_convert_encoding() call.
 [2008-07-31 16:24 UTC] moriyoshi@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

That's exactly why mb_encode_mimeheader() converts the given string and encodes it to MIME encoding at the same time. You may have been confused by the fact that mb_encode_mimeheader(), unlike other mbstring functions, does not take an argument that specifies the encoding of the supplied string. Make sure that the internal encoding is set to the correct value before calling it.

<?php
$string = "\xe7\xa7\x8b\xe3\x81\xae\xe7\x94\xb0\xe3\x81\xae\x20"
."\xe3\x81\x8b\xe3\x82\x8a\xe3\x81\xbb\xe3\x81\xae\xe5\xba\xb5"
."\xe3\x81\xae\x20\xe8\x8b\xab\xe3\x82\x92\xe3\x81\x82"
."\xe3\x82\x89\xe3\x81\xbf\x20\x20\xe3\x82\x8f\xe3\x81\x8c"
."\xe8\xa1\xa3\xe6\x89\x8b\xe3\x81\xaf\x20\xe9\x9c\xb2"
."\xe3\x81\xab\xe3\x81\xac\xe3\x82\x8c\xe3\x81\xa4\xe3\x81\xa4";
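// Declare the encoding of $string; mb_encode_mimeheader() takes it from here.
mb_internal_encoding("utf-8");
// No manual conversion: the function converts to iso-2022-jp itself.
//$string = mb_convert_encoding($string,'iso-2022-jp','utf-8');
echo mb_encode_mimeheader($string,'iso-2022-jp','b');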

?>
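
Because the input encoding comes from a global setting, one way to keep the call from affecting other code is to set the internal encoding only around it. This is a sketch under that assumption: encode_mime_header_from() is a hypothetical wrapper, not an mbstring function, and it requires the mbstring extension.

```php
<?php
// Hypothetical wrapper, not an mbstring function: encodes $text, which is in
// $sourceEncoding, as a MIME header in $charset using "B" (base64) encoding.
function encode_mime_header_from(string $text, string $sourceEncoding, string $charset): string
{
    $previous = mb_internal_encoding();     // remember the caller's setting
    mb_internal_encoding($sourceEncoding);  // tell mbstring what $text is encoded in
    try {
        return mb_encode_mimeheader($text, $charset, 'B');
    } finally {
        mb_internal_encoding($previous);    // restore the setting in every case
    }
}

// First words of the poem from the report, as UTF-8 bytes.
$utf8 = "\xe7\xa7\x8b\xe3\x81\xae\xe7\x94\xb0\xe3\x81\xae";
echo encode_mime_header_from($utf8, 'UTF-8', 'ISO-2022-JP'), "\n";
```

The caller passes the source encoding explicitly, which is the argument that mb_encode_mimeheader() itself lacks.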

 
PHP Copyright © 2001-2022 The PHP Group
All rights reserved.
Last updated: Thu Dec 01 16:04:17 2022 UTC