[2005-12-15 01:28 UTC] tony2001@php.net
Description:
------------
Hello,

when imap_8bit reaches the maximum line length, it splits the line regardless of whether the last triplet completes a character or is only the first byte of a multibyte character. I consider this behaviour a little problematic.

It does not directly violate RFC 2045, but reading it carefully:

WARNING TO IMPLEMENTORS: If binary data is encoded in quoted-printable, care must be taken to encode CR and LF characters as "=0D" and "=0A", respectively. In particular, a CRLF sequence in binary data should be encoded as "=0D=0A". Otherwise, if CRLF were represented as a hard line break, it might be incorrectly decoded on platforms with different line break conventions.

The "spirit" of RFC 2045 is that sequences with an exact meaning (indivisible sequences) should be encoded together.

The only problem is how the function should know that it has to respect a specific encoding instead of working with separate 8-bit chars. A locale? A parameter?

P.S. The string in the code should be encoded in UTF-8; it is taken from a real-world problem. It was on PHP 4.4.0 (I chose the closest pick), but it behaves the same on 5.0.4 too.

P.P.S. Sorry for my English :-)

Reproduce code:
---------------
<?php
echo (imap_8bit ("SN: Upozornění na novou akci - Vánoční Jablkoň"));
?>

Expected result:
----------------
SN: Upozorn=C4=9Bn=C3=AD na novou akci - V=C3=A1no=C4=8Dn=C3=AD Jablko=
=C5=88

Actual result:
--------------
SN: Upozorn=C4=9Bn=C3=AD na novou akci - V=C3=A1no=C4=8Dn=C3=AD Jablko=C5=
=88
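
A possible userland workaround for the wrapping described above, as a minimal sketch only. The function name qp_encode_utf8_safe and its $maxContent parameter are hypothetical and not part of PHP or the imap extension. The sketch assumes valid UTF-8 input on a single logical line (CR and LF are encoded as =0D and =0A rather than passed through), a limit of 75 characters before the soft line break "=", and PHP 7.0 or later for the type declarations. It encodes each character as a unit and breaks the line before a character that would not fit, instead of between its bytes.

```php
<?php
/**
 * Hypothetical helper, not part of PHP or the imap extension.
 * Quoted-printable encodes a UTF-8 string and places soft line
 * breaks only between complete characters.
 */
function qp_encode_utf8_safe(string $text, int $maxContent = 75): string
{
    // Split into whole UTF-8 characters; the /u modifier needs valid UTF-8.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    $lines = [];
    $line = '';
    $last = count($chars) - 1;

    foreach ($chars as $i => $char) {
        // Build the encoded form of the whole character first.
        $token = '';
        foreach (str_split($char) as $byte) {
            $ord = ord($byte);
            // Printable ASCII except "=" stays literal; a space stays
            // literal unless it is the very last character of the input.
            $literal = ($ord >= 33 && $ord <= 126 && $byte !== '=')
                || ($byte === ' ' && $i !== $last);
            $token .= $literal ? $byte : sprintf('=%02X', $ord);
        }

        // Soft-break before the character if it would not fit as a unit.
        if ($line !== '' && strlen($line) + strlen($token) > $maxContent) {
            $lines[] = $line . '=';
            $line = '';
        }
        $line .= $token;
    }
    $lines[] = $line;

    return implode("\r\n", $lines);
}

// The source file must be saved as UTF-8 for this literal to be UTF-8.
echo qp_encode_utf8_safe("SN: Upozornění na novou akci - Vánoční Jablkoň"), "\n";
```

For the string from the reproduce code this yields the wrapping given under "Expected result": the first line ends in "Jablko=" and the second line is "=C5=88". It does not answer the question of how imap_8bit itself should learn the encoding; the sketch simply hard-codes UTF-8.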