PHP :: Bug #15630 :: imap_utf7_decode appears to be broken

Bug #15630

imap_utf7_decode appears to be broken

Submitted:	2002-02-19 15:56 UTC
Modified:	2003-02-04 16:45 UTC

Votes:	15
Avg. Score:	4.6 ± 0.7
Reproduced:	12 of 13 (92.3%)
Same Version:	2 (16.7%)
Same OS:	2 (16.7%)

From:	robert dot marchand at umontreal dot ca
Assigned:
Status:	No Feedback
Package:	IMAP related
PHP Version:	4.2.2
OS:	SGI Irix 6.5
Private report:
CVE-ID:	None

[2002-02-19 15:56 UTC] robert dot marchand at umontreal dot ca

Hi,
  
   When trying to use the IMP webmail client with an Exchange 2000 server, folders
with accents in their names are mangled.  It appears that the Exchange IMAP server
sends modified UTF-7 names, as described in RFC 2060.  IMP calls the
imap_utf7_decode function to decode them.  That should work, but it doesn't.

Here's a sample:
What Exchange sends: "&AMk-l&AOk-ments envoy&AOk-s"
What it means: Éléments envoyés.
What IMP shows: <nothing>
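
For readers unfamiliar with the encoding: RFC 2060 modified UTF-7 writes each non-ASCII run as UTF-16BE code units in a modified BASE64 alphabet (',' instead of '/', no '=' padding), wrapped in '&' and '-'. The sketch below shows how a single code point such as U+00C9 (É) becomes "&AMk-". The helper name mutf7_encode_bmp is hypothetical and not part of PHP or c-client, and it handles only one BMP code point per shift sequence.

```c
#include <assert.h>
#include <stddef.h>

/* Modified BASE64 alphabet from RFC 2060: ',' replaces '/', no '=' padding. */
static const char MB64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+,";

/* Hypothetical helper (illustration only): writes "&xxx-" for one BMP code
 * point into out, which must hold at least 6 bytes, and returns the number
 * of characters written (not counting the terminating NUL). */
size_t mutf7_encode_bmp(unsigned int cp, char *out)
{
    unsigned char hi, lo;
    size_t n = 0;

    assert(out != NULL && cp <= 0xFFFF);
    hi = (unsigned char)(cp >> 8);    /* UTF-16BE: high byte first */
    lo = (unsigned char)(cp & 0xFF);

    out[n++] = '&';
    out[n++] = MB64[hi >> 2];                        /* bits 15..10 */
    out[n++] = MB64[((hi & 0x03) << 4) | (lo >> 4)]; /* bits 9..4 */
    out[n++] = MB64[(lo & 0x0F) << 2];               /* bits 3..0, zero-padded */
    out[n++] = '-';
    out[n] = '\0';
    return n;
}
```

Calling mutf7_encode_bmp(0x00C9, buf) yields "&AMk-" and mutf7_encode_bmp(0x00E9, buf) yields "&AOk-", matching the folder name Exchange sends.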

My setting is:
SGI Irix 6.5
Imap-2001a
PHP 4.1.1
IMP 2.2.7

I have set up a small script that shows the problem:
<?php

error_reporting(63);

$folder = '&AMk-l&AOk-ments envoy&AOk-s';
$plain = "Bo\xEEte de r\xE9ception"; // "Boîte de réception" as ISO-8859-1 bytes
$unicode = mb_convert_encoding($plain, "UNICODE", "ISO-8859-1");
$br = "<br>";

echo "<html><head><title>test UTF7</title></head><body>";

echo "folder (modified UTF-7): ", $folder, $br;
echo "plain (Latin1): ", $plain, $br, $br;

echo "<strong>mb_convert_encoding test</strong>", $br;
$test = mb_convert_encoding($folder, "auto", "UTF7-IMAP");
echo "  folder decoded: ", $test, $br;
$test = mb_convert_encoding($test, "UTF7-IMAP", "ISO-8859-1");
echo "  encoded again: ", $test, $br;
$test = mb_convert_encoding($test, "auto", "UTF7-IMAP");
echo "  decoded again: ", $test, $br, $br;

$test = mb_convert_encoding($plain, "UTF7-IMAP", "ISO-8859-1");
echo "  plain encoded: ", $test, $br;
$test = mb_convert_encoding($test, "auto", "UTF7-IMAP");
echo "  decoded: ", $test, $br, $br;

echo "<strong>imap_utf7_decode test</strong>", $br;

$test = imap_utf7_decode($folder);
echo "folder decoded: ", $test, $br;
$test = imap_utf7_encode($test);
echo "  encoded again: ", $test, $br;
$test = imap_utf7_decode($test);
echo "  decoded again: ", $test, $br, $br;

$test = imap_utf7_encode($plain);
echo "  plain encoded: ", $test, $br;
//$test = imap_utf7_encode($unicode);
//echo "unicode encoded: ", $test, $br;
$test = imap_utf7_decode($test);
echo "  decoded: ", $test, $br;

echo "</body></html>";

?>

And here is the output:
folder (modified UTF-7): &AMk-l&AOk-ments envoy&AOk-s
plain (Latin1): Boîte de réception

mb_convert_encoding test
folder decoded: Éléments envoyés
encoded again: &AMk-l&AOk-ments envoy&AOk-s
decoded again: Éléments envoyés

plain encoded: Bo&AO4-te de r&AOk-ception
decoded: Boîte de réception

imap_utf7_decode test
folder decoded: ?l?ments envoy?s
encoded again: &A8l-l&A+p-ments envoy&AfM-s
decoded again: ?l?ments envoy?s

plain encoded: Bo&7s-te de r&6g-ception
decoded: Bo?te de r?ception

--------------

As you can see, I've found a workaround using the mb_convert_encoding function.
Is imap_utf7_* really broken, or what?

Thanks.

[2002-06-27 00:15 UTC] sniper@php.net

Thank you for taking the time to report a problem with PHP.
Unfortunately your version of PHP is too old -- the problem
might already be fixed. Please download a new PHP
version from http://www.php.net/downloads.php

If you are able to reproduce the bug with one of the latest
versions of PHP, please change the PHP version on this bug report
to the version you tested and change the status back to "Open".
Again, thank you for your continued support of PHP.

[2002-07-29 10:33 UTC] gamid at isayev dot net

I have the same problem with imap_utf7_decode() in PHP 4.2.2.
The script 'test_utf7.php3' is attached to this posting to illustrate the problem.

According to the PHP documentation, imap_utf7_decode() returns "the decoded 8bit data", but the documentation says nothing about the encoding of that "8bit data". When I try to decode a folder named 'test&AN9ZJw-', imap_utf7_decode() returns the following string:

0x74, 0x65, 0x73, 0x74, 0x00, 0xDF, 0x59, 0x27

It looks like a UTF-16 (UCS-2) string with the leading 0x00 byte missing for the ASCII characters. If I'm right and imap_utf7_decode() is meant to return a UTF-16 string, this string should be represented as:

0x00, 0x74, 0x00, 0x65, 0x00, 0x73, 0x00, 0x74, 0x00, 0xDF, 0x59, 0x27
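
The byte layout above can be reproduced with a small standalone decoder. This is only a sketch of the behaviour being proposed (every character emitted as two UTF-16BE bytes), not the code in php_imap.c; the names unb64 and mutf7_to_utf16be are made up for the illustration, and the function returns 0 both for an empty input and for errors.

```c
#include <assert.h>
#include <stddef.h>

/* Value of a modified-BASE64 digit (RFC 2060 alphabet), or -1 if invalid. */
static int unb64(unsigned char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == ',') return 63;
    return -1;
}

/* Hypothetical helper (illustration only): decodes modified UTF-7 into
 * UTF-16BE bytes. Returns the number of bytes written, or 0 on malformed
 * input or insufficient space. */
size_t mutf7_to_utf16be(const char *in, unsigned char *out, size_t cap)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    while (*in) {
        if (*in != '&') {                 /* literal ASCII: pad to 16 bits */
            if (n + 2 > cap) return 0;
            out[n++] = 0x00;
            out[n++] = (unsigned char)*in++;
        } else if (in[1] == '-') {        /* "&-" stands for a literal '&' */
            if (n + 2 > cap) return 0;
            out[n++] = 0x00;
            out[n++] = '&';
            in += 2;
        } else {                          /* "&...-" BASE64 section */
            unsigned int acc = 0;
            int bits = 0;
            for (in++; *in && *in != '-'; in++) {
                int v = unb64((unsigned char)*in);
                if (v < 0) return 0;
                acc = (acc << 6) | (unsigned int)v;
                bits += 6;
                if (bits >= 8) {          /* emit each completed octet */
                    if (n + 1 > cap) return 0;
                    bits -= 8;
                    out[n++] = (unsigned char)((acc >> bits) & 0xFF);
                }
            }
            if (*in != '-') return 0;     /* unterminated section */
            in++;
        }
    }
    return n;
}
```

For 'test&AN9ZJw-' this produces the twelve bytes listed above: four zero-padded ASCII characters followed by 0x00 0xDF 0x59 0x27.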

To fix this problem I wrote a patch for ext/imap/php_imap.c and attached it to this posting.

Best regards,
Gamid Isayev

--- test_utf7.php3 ------------------------------------
<HTML>
<HEAD>
<TITLE>Test UTF7</TITLE>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=utf-8">
</HEAD>
<BODY>
<?
$folder = 'test&AN9ZJw-';
echo "folder (modified UTF-7): $folder<BR><BR>\n";

echo "<strong>mb_convert_encoding test</strong><BR>\n";
$test = $folder;
$test = mb_convert_encoding($test, "UTF-8", "UTF7-IMAP");
echo "  folder decoded: [$test]<BR>\n";
$test = mb_convert_encoding($test, "UTF7-IMAP", "UTF-8");
echo "encoded again: [", $test, "]<BR>\n";
$test = mb_convert_encoding($test, "UTF-8", "UTF7-IMAP");
echo "decoded again: [", $test, "]<BR><BR>\n";

echo "<strong>imap_utf7_decode test</strong><BR>\n";
$test = $folder;
$test = imap_utf7_decode($test);
echo "folder decoded: [", $test, "]<BR>\n";
$test = imap_utf7_encode($test);
echo "encoded again: [", $test, "]<BR>\n";
$test = imap_utf7_decode($test);
echo "decoded again: [", $test, "]<BR><BR>\n";
?>
</BODY>
</HTML>
--- end of test_utf7.php3 -----------------------------

--- ext/imap/php_imap.c -------------------------------

--- php_imap.c  Fri Jul 26 17:25:10 2002
+++ php_imap.c  Fri Jul 26 17:26:28 2002
@@ -2215,7 +2215,7 @@
                                php_error(E_WARNING, "imap_utf7_decode: Invalid modified UTF-7 character: `%c'", *inp);
                                RETURN_FALSE;
                        } else if (*inp != '&') {
-                               outlen++;
+                               outlen += 2;
                        } else if (inp + 1 == endp) {
                                php_error(E_WARNING, "imap_utf7_decode: Unexpected end of string");
                                RETURN_FALSE;
@@ -2272,8 +2272,11 @@
                        if (*inp == '&' && inp[1] != '-') {
                                state = ST_DECODE0;
                        }
-                       else if ((*outp++ = *inp) == '&') {
-                               inp++;
+                       else {
+                               *outp++ = 0x00;
+                               if ((*outp++ = *inp) == '&') {
+                                       inp++;
+                               }
                        }
                }
                else if (*inp == '-') {

--- end of ext/imap/php_imap.c ------------------------

[2002-07-29 15:51 UTC] gamid at isayev dot net

This is the updated patch for 'ext/imap/php_imap.c':

--- php_imap.c  Mon Jul 29 15:17:45 2002
+++ php_imap.c  Mon Jul 29 15:18:27 2002
@@ -2215,14 +2215,14 @@
                                php_error(E_WARNING, "imap_utf7_decode: Invalid modified UTF-7 character: `%c'", *inp);
                                RETURN_FALSE;
                        } else if (*inp != '&') {
-                               outlen++;
+                               outlen += 2;
                        } else if (inp + 1 == endp) {
                                php_error(E_WARNING, "imap_utf7_decode: Unexpected end of string");
                                RETURN_FALSE;
                        } else if (inp[1] != '-') {
                                state = ST_DECODE0;
                        } else {
-                               outlen++;
+                               outlen += 2;
                                inp++;
                        }
                } else if (*inp == '-') {
@@ -2272,8 +2272,11 @@
                        if (*inp == '&' && inp[1] != '-') {
                                state = ST_DECODE0;
                        }
-                       else if ((*outp++ = *inp) == '&') {
-                               inp++;
+                       else {
+                               *outp++ = 0x00;
+                               if ((*outp++ = *inp) == '&') {
+                                       inp++;
+                               }
                        }
                }
                else if (*inp == '-') {

[2002-08-06 19:02 UTC] gamid at isayev dot net

JFYI:

cvs -d :pserver:cvsread@cvs.php.net:/repository co php4
cd php4/ext/imap/
cvs update -r 1.112.2.1 php_imap.c
patch php_imap.c _file_with_my_patch_
cvs update -A php_imap.c
cvs ci php_imap.c

As result:

--- php_imap.c  5 Aug 2002 21:53:09 -0000       1.134
+++ php_imap.c  6 Aug 2002 23:00:31 -0000
@@ -2077,14 +2077,14 @@
                                php_error(E_WARNING, "%s(): Invalid modified UTF-7 character: `%c'", get_active_function_name(TSRMLS_C), *inp);
                                RETURN_FALSE;
                        } else if (*inp != '&') {
-                               outlen++;
+                               outlen += 2;
                        } else if (inp + 1 == endp) {
                                php_error(E_WARNING, "%s(): Unexpected end of string", get_active_function_name(TSRMLS_C));
                                RETURN_FALSE;
                        } else if (inp[1] != '-') {
                                state = ST_DECODE0;
                        } else {
-                               outlen++;
+                               outlen += 2;
                                inp++;
                        }
                } else if (*inp == '-') {
@@ -2134,8 +2134,11 @@
                        if (*inp == '&' && inp[1] != '-') {
                                state = ST_DECODE0;
                        }
-                       else if ((*outp++ = *inp) == '&') {
-                               inp++;
+                       else {
+                               *outp++ = 0x00;
+                               if ((*outp++ = *inp) == '&') {
+                                       inp++;
+                               }
                        }
                }
                else if (*inp == '-') {

[2002-08-07 11:11 UTC] robert dot marchand at umontreal dot ca

Hi,

 here is what I obtain with the patch applied (php-4.2.2):

---- output from Netscape 4.7: ---------
folder (modified UTF-7): test&AN9ZJw-

mb_convert_encoding test
folder decoded: [test??]
encoded again: [test&AN9ZJw-]
decoded again: [test??]

imap_utf7_decode test
folder decoded: [
encoded again: [&AA-t&AA-e&AA-s&AA-t&ANE-Y$]
decoded again: [

-----------------------------

Maybe there is something wrong with my installation here, but even so, I'm not sure adding zero bytes is the right solution.  In my view it will break every application that currently uses these functions.  At the very least, this patch should only take effect when --enable-mbstring is used.

Thanks.

[2002-08-07 11:30 UTC] gamid at isayev dot net

1) The patched imap_utf7_decode() returns a UTF-16 encoded string. So, to display the string properly, you should set the charset to 'UTF-16' or convert the UTF-16 string into your preferred charset.
2) The current imap_utf7_encode() "converts 8bit data to modified UTF-7 text". The question is: what does "8bit data" mean?

Right now I'm working on a patch for imap_utf7_encode() to add support for UTF-16 input. It will allow a correct round trip: UTF-7 -> imap_utf7_decode() -> imap_utf7_encode() -> UTF-7.

[2002-08-08 15:05 UTC] gamid at netilla dot com

Robert,

The following patch fixes both imap_utf7_encode() and imap_utf7_decode() to work with UTF-16.

PS: this patch is for PHP 4.2.2, the patch for CVS is posted in the php.dev

Gamid Isayev

--- php_imap.c  Wed Aug  7 15:45:53 2002
+++ php_imap.c  Thu Aug  8 14:24:16 2002
@@ -2215,14 +2215,14 @@
                                php_error(E_WARNING, "imap_utf7_decode: Invalid modified UTF-7 character: `%c'", *inp);
                                RETURN_FALSE;
                        } else if (*inp != '&') {
-                               outlen++;
+                               outlen += 2;
                        } else if (inp + 1 == endp) {
                                php_error(E_WARNING, "imap_utf7_decode: Unexpected end of string");
                                RETURN_FALSE;
                        } else if (inp[1] != '-') {
                                state = ST_DECODE0;
                        } else {
-                               outlen++;
+                               outlen += 2;
                                inp++;
                        }
                } else if (*inp == '-') {
@@ -2272,8 +2272,11 @@
                        if (*inp == '&' && inp[1] != '-') {
                                state = ST_DECODE0;
                        }
-                       else if ((*outp++ = *inp) == '&') {
-                               inp++;
+                       else {
+                               *outp++ = 0x00;
+                               if ((*outp++ = *inp) == '&') {
+                                       inp++;
+                               }
                        }
                }
                else if (*inp == '-') {
@@ -2349,29 +2352,42 @@
        outlen = 0;
        state = ST_NORMAL;
        endp = (inp = in) + inlen;
-       while (inp < endp) {
+       while (inp < endp || state != ST_NORMAL) {
                if (state == ST_NORMAL) {
-                       if (SPECIAL(*inp)) {
+                       if (*inp == 0x00 && *(inp+1) < 0x80) {
+                               /* ASCII character */
+                               outlen++;               // for ASCII char
+                               if (*(inp+1) == '&')
+                                       outlen++;       // for '-'
+                               inp += 2;
+                       } else {
+                               /* begin encoding */
                                state = ST_ENCODE0;
-                               outlen++;
-                       } else if (*inp++ == '&') {
+                               outlen++;       // for '&'
+                       }
+               } else if (inp == endp || (*inp == 0x00 && *(inp+1) < 0x80)) {
+                       /* flush overflow and terminate region */
+                       if (state != ST_ENCODE0) {
                                outlen++;
                        }
-                       outlen++;
-               } else if (!SPECIAL(*inp)) {
+                       outlen++;       // for '-'
                        state = ST_NORMAL;
                } else {
-                       /* ST_ENCODE0 -> ST_ENCODE1     - two chars
-                        * ST_ENCODE1 -> ST_ENCODE2     - one char
-                        * ST_ENCODE2 -> ST_ENCODE0     - one char
-                        */
-                       if (state == ST_ENCODE2) {
-                               state = ST_ENCODE0;
-                       }
-                       else if (state++ == ST_ENCODE0) {
-                               outlen++;
+                       switch (state) {
+                               case ST_ENCODE0:
+                                       outlen++;
+                                       state = ST_ENCODE1;
+                                       break;
+                               case ST_ENCODE1:
+                                       outlen++;
+                                       state = ST_ENCODE2;
+                                       break;
+                               case ST_ENCODE2:
+                                       outlen += 2;
+                                       state = ST_ENCODE0;
+                               case ST_NORMAL:
+                                       break;
                        }
-                       outlen++;
                        inp++;
                }
        }
@@ -2388,14 +2404,17 @@
        endp = (inp = in) + inlen;
        while (inp < endp || state != ST_NORMAL) {
                if (state == ST_NORMAL) {
-                       if (SPECIAL(*inp)) {
+                       if (*inp == 0x00 && *(inp+1) < 0x80) {
+                               /* ASCII character */
+                               inp++;
+                               if ((*outp++ = *inp++) == '&')
+                                       *outp++ = '-';
+                       } else {
                                /* begin encoding */
                                *outp++ = '&';
                                state = ST_ENCODE0;
-                       } else if ((*outp++ = *inp++) == '&') {
-                               *outp++ = '-';
                        }
-               } else if (inp == endp || !SPECIAL(*inp)) {
+               } else if (inp == endp || (*inp == 0x00 && *(inp+1) < 0x80)) {
                        /* flush overflow and terminate region */
                        if (state != ST_ENCODE0) {
                                *outp++ = B64(*outp);

[2002-08-08 16:49 UTC] kalowsky@php.net

Since I have no way to test this, can anyone else confirm or deny that this patch works?  I'd rather not commit blindly.

[2002-08-08 17:23 UTC] robert dot marchand at umontreal dot ca

Hi,

   This will not work without changing current applications.
As it is now, 8-bit data is expected from imap_utf7_decode.  The problem is that these functions try to encode and decode without knowing the charset used.  It should really be:

imap_utf7_utf8_decode
imap_utf7_utf16_decode (patched version)
imap_utf8_utf7_encode
imap_utf16_utf7_encode (patched version)

Thanks.

[2002-08-09 10:57 UTC] gamid at isayev dot net

Robert Marchand wrote:
> this will not work without changing current applications.

Right now it does not work at all for non-ASCII characters.
Example:
For the IMAP folder name "test&WSc-" ("test" + a Chinese character), the current imap_utf7_decode() returns "testY'".
For the IMAP folder name "testY'", the current imap_utf7_decode() also returns "testY'".
So, what will you do in this case?

> The problem is that these function try to encode and decode without
> knowing the charset used.

1) imap_utf7_decode() does not need to know the charset of the input string, because the input string is encoded in modified UTF-7.
2) If you specify a charset for the imap_utf7_decode() output string, what will you do when an IMAP folder name has characters from different charsets (example: "test&BCQA31kn-" - ASCII, Russian, German, Chinese)?

> As it is now, 8 bit is expected from imap_utf7_decode.
<...skipped...>
> It should really be:
> imap_utf7_utf8_decode
> imap_utf7_utf16_decode (patched version)
> imap_utf8_utf7_encode
> imap_utf16_utf7_encode (patched version)

I think you are confusing "8 bit" and UTF-8.
A UTF-8 encoded character is "8 bit" only for ASCII characters. A non-ASCII character takes two or more bytes in UTF-8. So, imap_utf7_decode() != imap_utf7_utf8_decode().

Gamid Isayev

[2002-08-12 13:57 UTC] robert dot marchand at umontreal dot ca

Hi,

  You're right about the "utf8" thing; I meant "8bit".  As for the rest, I cannot change the software I use (Horde/IMP) because I did not write it.  I can assure you it will break if you go ahead with your changes.  That is the general problem.

Now it seems I also have a specific problem here on my SGI platform.  Here is what I get with your patch:

folder (modified UTF-7): test&AN9ZJw-

mb_convert_encoding test
folder decoded: [testß大]
(hexa: 74 65 73 74 c3 9f e5 a4 a7 )
encoded again: [test&AN9ZJw-]
decoded again: [testß大]
(hexa: 74 65 73 74 c3 9f e5 a4 a7 )

imap_utf7_decode test
folder decoded: [
(hexa: 0 74 0 65 0 73 0 74 0 d0 59 24 )
encoded again: [test&ANBZJA-]
decoded again: [
(hexa: 0 74 0 65 0 73 0 74 5 d9 59 24 )

Here is another sample:
folder (modified UTF-7): &AMk-l&AOk-ments envoy&AOk-s

mb_convert_encoding test
folder decoded: [Éléments envoyés]
(hexa: c3 89 6c c3 a9 6d 65 6e 74 73 20 65 6e 76 6f 79 c3 a9 73 )
encoded again: [&AMk-l&AOk-ments envoy&AOk-s]
decoded again: [Éléments envoyés]
(hexa: c3 89 6c c3 a9 6d 65 6e 74 73 20 65 6e 76 6f 79 c3 a9 73 )

imap_utf7_decode test
folder decoded: [ ?
(hexa: 6 d3 0 6c 0 e0 0 6d 0 65 0 6e 0 74 0 73 0 20 0 65 0 6e 0 76 0 6f 0 79 0 e0 0 73 )
encoded again: [&BPp-l&A,g-ments envoy&Afa-s]
decoded again: [ ?
(hexa: 4 fb 0 6c 0 fb 0 6d 0 65 0 6e 0 74 0 73 0 20 0 65 0 6e 0 76 0 6f 0 79 0 fc 0 73 )

Here is the PHP test page to generate this output:

<HTML>
 <HEAD>
 <TITLE>Test UTF7</TITLE>
 <META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=utf-16">
 </HEAD>
 <BODY>
 <?

 function hexstr($s)
 {
	echo "(hexa: ";
	 for ($i=0;$i<strlen($s);$i++) {
		echo dechex(ord($s[$i])), " ";
 	}
	 echo ")<br>";
 }

 //$folder = 'test&AN9ZJw-';
 $folder = '&AMk-l&AOk-ments envoy&AOk-s';
 echo "folder (modified UTF-7): $folder<BR><BR>\n";
 echo "<strong>mb_convert_encoding test</strong><BR>\n";
 $test = $folder;
 $test = mb_convert_encoding($test, "UTF-8", "UTF7-IMAP");
 echo "  folder decoded: [$test]<BR>\n";
 hexstr($test);
 $test = mb_convert_encoding($test, "UTF7-IMAP", "UTF-8");
 echo "encoded again: [", $test, "]<BR>\n";
 $test = mb_convert_encoding($test, "UTF-8", "UTF7-IMAP");
 echo "decoded again: [", $test, "]<BR>\n";
 hexstr($test);

 echo "<BR><strong>imap_utf7_decode test</strong><BR>\n";
 $test = $folder;
 $test = imap_utf7_decode($test);
 echo "folder decoded: [", $test, "]<BR>\n";
 hexstr($test);
 $test = imap_utf7_encode($test);
 echo "encoded again: [", $test, "]<BR>\n";
 $test = imap_utf7_decode($test);
 echo "decoded again: [", $test, "]<BR>\n";
 hexstr($test);
 ?>
 </BODY>
 </HTML>


I am on a 64-bit platform. Could this be related to a wrapping shift?  There is definitely something wrong here.

Thanks.

[2002-08-12 15:47 UTC] robert dot marchand at umontreal dot ca

Hi,
  
   This is to confirm that there is an SGI-specific issue: I tested the new patch on a Red Hat Linux 7.2 system with PHP 4.2.2 compiled manually.  Here is the output from the test:

folder (modified UTF-7): &AMk-l&AOk-ments envoy&AOk-s

mb_convert_encoding test
folder decoded: [Éléments envoyés]
(hexa: c3 89 6c c3 a9 6d 65 6e 74 73 20 65 6e 76 6f 79 c3 a9 73 )
encoded again: [&AMk-l&AOk-ments envoy&AOk-s]
decoded again: [Éléments envoyés]
(hexa: c3 89 6c c3 a9 6d 65 6e 74 73 20 65 6e 76 6f 79 c3 a9 73 )

imap_utf7_decode test
folder decoded: [
(hexa: 0 c9 0 6c 0 e9 0 6d 0 65 0 6e 0 74 0 73 0 20 0 65 0 6e 0 76 0 6f 0 79 0 e9 0 73 )
encoded again: [&AMk-l&AOk-ments envoy&AOk-s]
decoded again: [
(hexa: 0 c9 0 6c 0 e9 0 6d 0 65 0 6e 0 74 0 73 0 20 0 65 0 6e 0 76 0 6f 0 79 0 e9 0 73 )

I have also tried an SGI O2 box (it is 32-bit) and it behaves the same as all the other SGI boxes I have tried.

I'll take a look when I have a moment.

Thanks.

[2002-08-13 13:29 UTC] robert dot marchand at umontreal dot ca

Hi,

    I've found the culprit behind the SGI problem.  It is related to combining the auto-increment operator with a compound assignment.  This doesn't work on SGI:

*outp++ |= outp[1] >> 2;

Here is my patch, which corrects both functions on SGI with the SGI compiler (MIPSpro):


--- php_imap.c.nowarn   Tue Jul 30 10:04:24 2002
+++ php_imap.c  Tue Aug 13 11:44:50 2002
@@ -2187,6 +2187,7 @@
        zval **arg;
        const unsigned char *in, *inp, *endp;
        unsigned char *out, *outp;
+       unsigned char c;
        int inlen, outlen;
        enum {
                ST_NORMAL,      /* printable text */
@@ -2289,13 +2290,15 @@
                                break;
                        case ST_DECODE1:
                                outp[1] = UNB64(*inp);
-                               *outp++ |= outp[1] >> 4;
+                               c = outp[1] >> 4;
+                               *outp++ |= c;
                                *outp <<= 4;
                                state = ST_DECODE2;
                                break;
                        case ST_DECODE2:
                                outp[1] = UNB64(*inp);
-                               *outp++ |= outp[1] >> 2;
+                               c = outp[1] >> 2;
+                               *outp++ |= c;
                                *outp <<= 6;
                                state = ST_DECODE3;
                                break;
@@ -2329,6 +2332,7 @@
        zval **arg;
        const unsigned char *in, *inp, *endp;
        unsigned char *out, *outp;
+       unsigned char c;
        int inlen, outlen;
        enum {
                ST_NORMAL,      /* printable text */
@@ -2399,7 +2403,8 @@
                } else if (inp == endp || !SPECIAL(*inp)) {
                        /* flush overflow and terminate region */
                        if (state != ST_ENCODE0) {
-                               *outp++ = B64(*outp);
+                               c = B64(*outp);
+                               *outp++ = c;
                        }
                        *outp++ = '-';
                        state = ST_NORMAL;
@@ -2412,12 +2417,14 @@
                                        state = ST_ENCODE1;
                                        break;
                                case ST_ENCODE1:
-                                       *outp++ = B64(*outp | *inp >> 4);
+                                       c = B64(*outp | *inp >> 4);
+                                       *outp++ = c;
                                        *outp = *inp++ << 2;
                                        state = ST_ENCODE2;
                                        break;
                                case ST_ENCODE2:
-                                       *outp++ = B64(*outp | *inp >> 6);
+                                       c = B64(*outp | *inp >> 6);
+                                       *outp++ = c;
                                        *outp++ = B64(*inp++);
                                        state = ST_ENCODE0;
                                case ST_NORMAL:

This patch was applied to the original php_imap.c (4.2.2) but it can also be applied to the new version from Gamid Isayev.

Thanks.

[2002-08-14 16:16 UTC] spc at sgi dot com

The patch added by Robert Marchand corrects a deficiency
in the original code.  The statement

 *outp++ |= outp[1] >> 2;

in the ext/imap/php_imap.c file is not well-defined standard C: it
modifies outp and reads it again with no sequence point in between.
This means that the result is *compiler*-dependent, and
no compiler can be considered "wrong".

The C standard does not specify the order in which the operands of
an assignment operator are evaluated.  In
other words, it is equally correct for a compiler to produce
code equivalent to

 temp = outp[1] >> 2;
 *outp++ |= temp;

or to produce code equivalent to

 temp = outp;
 outp++;
 *temp |= outp[1] >> 2; 

Since these are not equivalent bits of code, the *source
code* is wrong, not the compiler.

Robert's patch suggested on 13 August corrects this non-standard source code.

[2002-08-14 16:56 UTC] kalowsky@php.net

I've committed the SGI compiler patch to the CVS head.  I have yet to see any real conclusion on the utf7_decode() issue, though, and I would really prefer not to break BC.  If everyone this bug affects can reach some kind of agreement, I'm open to it.

[2002-09-05 15:50 UTC] gamid at isayev dot net

Are you going to fix this bug in PHP 4.2.3?

[2002-09-09 15:22 UTC] robert dot marchand at umontreal dot ca

Hi,

The fix for the SGI compiler has not made it into 4.2.3. According to the CVS repository, php_imap.c 1.112.2.3 has it, but version 1.112.2.4 does not.

Bye.

[2002-09-09 16:00 UTC] kalowsky@php.net

re-applied patch for SGI.

[2002-09-09 17:33 UTC] sniper@php.net

fixed -> closed.

[2002-09-09 22:51 UTC] kalowsky@php.net

All I did was re-commit the SGI compiler changes.  I did not do any of the work for the utf7_decode/encode function as this bug states.

[2002-11-14 09:50 UTC] thomas dot jarosch at intra2net dot de

Hi,

Since it's now been two months since the last comment on this bug, will Gamid Isayev's patches be applied or not?

People really have problems with this bug; for example, most webmail applications won't work properly. German, Austrian, Finnish and Swedish users, to name just a few, have umlauts in their folder names.

Best regards,
Thomas

[2002-11-14 10:28 UTC] kalowsky@php.net

From my understanding, Gamid's patches break BC.  This is typically a bad thing, and repeated requests for users to voice an opinion have not resulted in any feedback.  At this point I feel that the IMAP maintainer should decide whether this goes in or not, but he has not answered any emails either.  So until I hear back from the IMAP maintainer, I don't see this being applied.

[2002-11-14 11:01 UTC] gamid at isayev dot net

OK, let's fix the imap_utf7_encode()/imap_utf7_decode() functions to work with UTF-8. In that case they stay BC when an IMAP folder name contains only ASCII characters, and they will also support international characters.

Gamid Isayev

[2002-11-21 04:20 UTC] thomas dot jarosch at intra2net dot de

Has the IMAP maintainer responded?

As the patch produces and consumes UTF-16BE,
would it be "legal" to call another PHP function
(iconv) within PHP itself, or is it not allowed to depend
on another "PHP" function?

One could easily add an iconv call at the end of the functions to get the output format right, or even add an optional parameter to the imap_utf7_decode function to specify the desired output encoding, with "UTF-8" as the default.
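
To make the suggested post-processing step concrete without depending on iconv, here is a minimal hand-written sketch of a UTF-16BE to UTF-8 conversion. The function name utf16be_to_utf8 is hypothetical; a real change to php_imap.c would more likely call an existing conversion routine, and this sketch simply returns 0 on any error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (illustration only): converts UTF-16BE bytes to UTF-8.
 * Returns the number of bytes written, or 0 on odd input length, unpaired
 * surrogates or insufficient space. */
size_t utf16be_to_utf8(const unsigned char *in, size_t inlen,
                       unsigned char *out, size_t cap)
{
    size_t i = 0, n = 0;

    assert(in != NULL && out != NULL);
    if (inlen % 2 != 0) return 0;
    while (i < inlen) {
        unsigned long cp = ((unsigned long)in[i] << 8) | in[i + 1];
        i += 2;
        if (cp >= 0xD800 && cp <= 0xDBFF) {          /* high surrogate */
            unsigned long lo;
            if (i + 2 > inlen) return 0;
            lo = ((unsigned long)in[i] << 8) | in[i + 1];
            if (lo < 0xDC00 || lo > 0xDFFF) return 0;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            i += 2;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {   /* stray low surrogate */
            return 0;
        }
        if (cp < 0x80) {                             /* 1-byte form */
            if (n + 1 > cap) return 0;
            out[n++] = (unsigned char)cp;
        } else if (cp < 0x800) {                     /* 2-byte form */
            if (n + 2 > cap) return 0;
            out[n++] = (unsigned char)(0xC0 | (cp >> 6));
            out[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {                   /* 3-byte form */
            if (n + 3 > cap) return 0;
            out[n++] = (unsigned char)(0xE0 | (cp >> 12));
            out[n++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {                                     /* 4-byte form */
            if (n + 4 > cap) return 0;
            out[n++] = (unsigned char)(0xF0 | (cp >> 18));
            out[n++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[n++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[n++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    return n;
}
```

Fed the twelve UTF-16BE bytes for 'test&AN9ZJw-' (00 74 00 65 00 73 00 74 00 df 59 27), it produces 74 65 73 74 c3 9f e5 a4 a7, the same bytes mb_convert_encoding returned in the hex dumps earlier in this thread.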

[2003-01-20 13:37 UTC] iliaa@php.net

Please try using this CVS snapshot:

  http://snaps.php.net/php4-STABLE-latest.tar.gz
 
For Windows:
 
  http://snaps.php.net/win32/php4-win32-STABLE-latest.zip

Seems to work properly with the latest CVS.

[2003-02-04 16:45 UTC] sniper@php.net

No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Open". Thank you.


Copyright © 2001-2025 The PHP Group All rights reserved.	Last updated: Sat Nov 08 00:00:02 2025 UTC