Bug #10489 quoted_printable_decode() & imap_qprint() decodes control codes.
Submitted: 2001-04-25 09:51 UTC Modified: 2001-05-05 04:40 UTC
From: slaygon at NOSPAMcensor dot net Assigned:
Status: Closed Package: IMAP related
PHP Version: 4.0.4pl1 OS: Linux 2.4.3
Private report: No CVE-ID: None
 [2001-04-25 09:51 UTC] slaygon at NOSPAMcensor dot net
quoted_printable_decode() and imap_qprint() decode control codes as well. This, I assume, is not supposed to happen.

Also, these functions happily translate anything with two characters after the = sign (like =EV in =EVIL), which they shouldn't.

Example:

$foo="This is =00=01 =E4 =2B \"=20\" =21 =ev =FE string.";
echo "Quoting string \"".$foo."\"\n";
echo "quoted_printable_decode: \"".quoted_printable_decode($foo)."\"\n";
echo "preg_replace:            \"".preg_replace("/\=([2-9A-Fa-f])([0-9A-Fa-f])/e", "''.chr(hexdec('\\1\\2')).''", $foo)."\"\n";

Would result in:
Quoting string "This is =00=01 =E4 =2B "=20" =21 =ev =FE string."
quoted_printable_decode: "This is  ? + " " ! ? ? string."
preg_replace:            "This is =00=01 ? + " " ! =ev ? string."


Am I wrong in assuming that quoted_printable_decode() and imap_qprint() should act as my preg_replace line above does?
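
For readers on current PHP: the /e modifier used in the preg_replace line above was removed in PHP 7, so that line no longer runs as written. A minimal sketch of the same filter with preg_replace_callback() follows. The function name qp_decode_printable_only() is hypothetical and only illustrates the reporter's pattern; it is not how PHP itself decodes.

```php
<?php
// Hypothetical helper, not part of PHP: decodes only =XX escapes whose
// first hex digit is 2-F, mirroring the pattern from the report above.
function qp_decode_printable_only(string $input): string
{
    return preg_replace_callback(
        '/=([2-9A-Fa-f])([0-9A-Fa-f])/',
        static function (array $m): string {
            // $m[1] and $m[2] are the two captured hex digits.
            return chr(hexdec($m[1] . $m[2]));
        },
        $input
    );
}

$foo = 'This is =00=01 =E4 =2B "=20" =21 =ev =FE string.';
echo qp_decode_printable_only($foo), "\n";
```

As in the original pattern, =00 and =01 are left alone because their first digit is below 2, and =ev is left alone because "v" is not a hex digit.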

 [2001-04-29 06:40 UTC] jmoore@php.net
reclassify
 [2001-05-05 04:40 UTC] vlad@php.net
Not really a bug. Intended behaviour.

Your $foo is not a valid quoted-printable encoded string (can you get it with imap_8bit()? If you can, reopen this bug).

RFC 2045 section 6.7, parts 1 & 2, says that an equal sign by itself (ASCII 0x3D) cannot be represented literally, but only as the encoded "=3D" (hence the '=' at the end of a line is not an equal sign, but a soft line break).
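
A short illustration of both points, assuming a PHP version that ships quoted_printable_encode() (5.3 or later); the sample strings are made up for the example:

```php
<?php
// A literal "=" is written by the encoder as the hex-octet "=3D".
$encoded = quoted_printable_encode('a=b');
echo $encoded, "\n";

// Decoding turns "=3D" back into a literal "=".
$roundTrip = quoted_printable_decode($encoded);
echo $roundTrip, "\n";

// An "=" directly before CRLF is a soft line break: it disappears on decode.
$joined = quoted_printable_decode("foo=\r\nbar");
echo $joined, "\n";
```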

Reading below, in note (2):
>An "=" followed by a character that is neither a
>hexadecimal digit (including "abcdef") nor the CR
>character of a CRLF pair is illegal.  This case can be
>the result of US-ASCII text having been included in a
>quoted-printable part of a message without itself
>having been subjected to quoted-printable encoding.  A
>reasonable approach by a robust implementation might be
>to include the "=" character and the following
>character in the decoded data without any
>transformation and, if possible, indicate to the user
>that proper decoding was not possible at this point in
>the data.

Which means that not decoding them at all *might* be a good way to handle that, but then it would also be nice to let the user know of the error, which we can't (using imap_qprint() at least).
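
A rough sketch of what such a "robust" decoder could look like in userland PHP. The function qp_decode_reporting() and its $errors parameter are hypothetical and exist only for this illustration; this is neither what quoted_printable_decode() does nor what c_client does. It only treats "=" followed by CRLF as a soft break, which is stricter than a real decoder needs to be.

```php
<?php
// Hypothetical helper: decodes quoted-printable text and records the byte
// offsets of "=" signs that start neither a hex-octet nor a soft line break.
function qp_decode_reporting(string $input, array &$errors = []): string
{
    $errors = [];
    $out = '';
    $len = strlen($input);
    for ($i = 0; $i < $len; $i++) {
        if ($input[$i] !== '=') {
            $out .= $input[$i];
            continue;
        }
        $pair = (string) substr($input, $i + 1, 2);
        if (strlen($pair) === 2 && ctype_xdigit($pair)) {
            $out .= chr(hexdec($pair)); // well-formed hex-octet
            $i += 2;
        } elseif ($pair === "\r\n") {
            $i += 2;                    // soft line break: drop "=" and CRLF
        } else {
            $errors[] = $i;             // illegal "=": keep it untransformed
            $out .= '=';
        }
    }
    return $out;
}

$errors = [];
echo qp_decode_reporting('a=3Db =ev c=', $errors), "\n";
print_r($errors);
```

The caller can inspect $errors to tell the user that proper decoding was not possible at those offsets, which is the part the RFC note asks for and imap_qprint() cannot provide.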


Finally, the formal rule for acceptable characters (end of section 6.7 in that RFC) says:


>safe-char := <any octet with decimal value of 33 through
>             60 inclusive, and 62 through 126>
>             ; Characters not listed as "mail-safe" in
>             ; RFC 2049 are also not recommended.
>
>hex-octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F")
>             ; Octet must be used for characters > 127, =,
>             ; SPACEs or TABs at the ends of lines, and is
>             ; recommended for any character not listed in
>             ; RFC 2049 as "mail-safe".


Notice that a hex-octet *must* be used for the '=' sign. In short, you can create a regular expression that handles malformed encoded data (yours is not quite correct), but that is not the way PHP should behave. On top of that, imap_qprint() invokes a function in c_client, so if you want to change that behaviour, you need to change c_client.
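
To check a string against the hex-octet rule quoted above before decoding it, something along these lines could be used. It is only a sketch: qp_find_invalid_escapes() is a hypothetical name, and the pattern covers just the "=" handling (two uppercase hex digits, or a soft break before CRLF), not the rest of the section 6.7 grammar such as line length or safe-char ranges.

```php
<?php
// Hypothetical helper: returns the byte offsets of every "=" that is not
// followed by two uppercase hex digits (a hex-octet) or by CRLF (soft break).
function qp_find_invalid_escapes(string $input): array
{
    preg_match_all('/=(?![0-9A-F]{2}|\r\n)/', $input, $m, PREG_OFFSET_CAPTURE);
    // Each match is [matched text, offset]; keep only the offsets.
    return array_column($m[0], 1);
}

$foo = 'This is =00=01 =E4 =2B "=20" =21 =ev =FE string.';
print_r(qp_find_invalid_escapes($foo));
```

For the $foo from the report this flags only the "=ev" sequence; "=00" and "=01" are syntactically valid hex-octets, which is why decoding them is the intended behaviour.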

Vlad


 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Tue Apr 23 09:01:27 2024 UTC