Doc Bug #63266 imap_utf8 may not convert to UTF-8
Submitted: 2012-10-12 04:43 UTC
Modified: 2020-10-08 14:05 UTC
Votes: 4
Avg. Score: 4.0 ± 1.7
Reproduced: 3 of 4 (75.0%)
Same Version: 2 (66.7%)
Same OS: 1 (33.3%)
From: Development at JivanAmara dot net
Assigned: cmb
Status: Closed
Package: IMAP related
PHP Version: Irrelevant
OS: Linux (Ubuntu / Redhat)
Private report: No
CVE-ID: None
 [2012-10-12 04:43 UTC] Development at JivanAmara dot net
Description:
------------
The MIME 'From' header "=?gb18030?B?zfW+/L3c?=" is not converted to UTF-8.
Instead, the text is returned still encoded as gb18030.

In versions < 5.4.0, I expect a warning about the unsupported character set
and an empty string as the return value.

In versions >= 5.4.0, I expect the text to be correctly converted to UTF-8.

The behavior has been checked with versions 5.2.9, 5.3.?, and 5.4.6.


Test script:
---------------
<?php
$from = '=?gb18030?B?zfW+/L3c?=';

$from_imap_utf8 = imap_utf8($from);
echo "imap_utf8: $from_imap_utf8", "\n";

$from_obj = imap_mime_header_decode($from);
$charset = $from_obj[0]->charset;
$text = $from_obj[0]->text;
$text_utf8 = mb_convert_encoding($text, 'UTF-8', $charset);
echo "decode/convert: $text_utf8", "\n";

echo "Correct conversion: ", $from_imap_utf8 == $text_utf8 ? "Yes" : "No", "\n";
?>


Expected result:
----------------
For versions >= 5.4.0 I expect the script to output:
 imap_utf8: 王军杰
 decode/convert: 王军杰
 Correct conversion: Yes

For versions < 5.4.0 I expect the script to output:
 PHP Warning:  imap_utf8(): Illegal character encoding specified in <script name> 
on line 4
 imap_utf8: 
 PHP Warning:  mb_convert_encoding(): Illegal character encoding specified in 
<script name> on line 10
 decode/convert: 
 Correct conversion: Yes


Actual result:
--------------
For versions >= 5.4.0 I see:
 imap_utf8: �����
 decode/convert: 王军杰
 Correct conversion: No

For versions < 5.4.0 I see:
 imap_utf8: �����
 PHP Warning:  mb_convert_encoding(): Illegal character encoding specified in 
/home/netcloud/jivan.amara/encoding.php on line 20
 decode/convert: 
 Correct conversion: No
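
The replacement characters in the imap_utf8 output above suggest that the returned bytes are not valid UTF-8. A minimal sketch of how a caller might detect that case follows. The helper name is_valid_utf8 is hypothetical, not part of PHP or the IMAP extension, and the check relies on preg_match() with the "u" modifier rejecting subjects that are not valid UTF-8. It only catches output that happens to be invalid UTF-8; bytes in another charset that are also well-formed UTF-8 would pass unnoticed.

```php
<?php
// Hypothetical helper for illustration; not part of PHP or ext/imap.
function is_valid_utf8(string $s): bool
{
    // With the "u" modifier, preg_match() returns false (not 0)
    // when the subject string is not valid UTF-8.
    return preg_match('//u', $s) === 1;
}

// The raw gb18030 bytes carried by the header in this report.
$raw = base64_decode('zfW+/L3c');

var_dump(is_valid_utf8($raw));
```

If the check fails, the string returned by imap_utf8() should probably be treated as unconverted rather than displayed as-is.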


 [2020-10-08 14:05 UTC] cmb@php.net
-Summary: imap_utf8 returns bad utf8 string
+Summary: imap_utf8 may not convert to UTF-8
-Status: Open
+Status: Verified
-Type: Bug
+Type: Documentation Problem
-Assigned To:
+Assigned To: cmb
 [2020-10-08 14:05 UTC] cmb@php.net
PHP's imap_utf8() is just a thin wrapper over libc-client's
utf8_mime2text() function.  If the given charset is not supported
by libc-client, utf8_mime2text does not fail, but returns the
string unmodified.  To me that looks like a bug in libc-client,
but that's out of scope for PHP anyway.

Thus, there is nothing we can do (besides re-implementing that
function based on mbstring or iconv, which is not going to happen).
Therefore I'm changing to doc problem.
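
For callers who need a conversion that fails visibly, a userland workaround along the lines of the reporter's test script may help: decode the header with imap_mime_header_decode() and convert each part explicitly. The following is only a sketch under assumptions. The function names convert_to_utf8 and mime_header_to_utf8 are hypothetical, iconv is assumed to be available and to know the charset in question, and returning null on failure is just one possible policy.

```php
<?php
// Sketch only: both function names are hypothetical, not part of ext/imap.

/** Convert $text from $charset to UTF-8 via iconv; null if that fails. */
function convert_to_utf8(string $text, string $charset): ?string
{
    // iconv() returns false (and raises a diagnostic, silenced here) for
    // unknown charsets or for bytes that are invalid in $charset.
    $converted = @iconv($charset, 'UTF-8', $text);
    return $converted === false ? null : $converted;
}

/** Decode a MIME header and convert every part to UTF-8; null on failure. */
function mime_header_to_utf8(string $header): ?string
{
    $parts = imap_mime_header_decode($header);
    if ($parts === false) {
        return null;
    }

    $out = '';
    foreach ($parts as $part) {
        if (strcasecmp($part->charset, 'default') === 0) {
            // Unencoded text; passed through unchanged (assumed to be ASCII).
            $out .= $part->text;
            continue;
        }
        $converted = convert_to_utf8($part->text, $part->charset);
        if ($converted === null) {
            // Fail visibly instead of returning unconverted bytes.
            return null;
        }
        $out .= $converted;
    }
    return $out;
}
```

With the header from this report, the call would be mime_header_to_utf8('=?gb18030?B?zfW+/L3c?='). Unlike imap_utf8(), this sketch returns null when the charset cannot be converted, so the caller can decide how to handle it.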
 [2020-10-08 14:36 UTC] phpdocbot@php.net
Automatic comment on behalf of cmb
Revision: http://git.php.net/?p=doc/en.git;a=commit;h=67acb98daee3519f0dd843fe7835bcd636de931c
Log: Fix #63266: imap_utf8 may not convert to UTF-8
 [2020-10-08 14:36 UTC] phpdocbot@php.net
-Status: Verified
+Status: Closed
 