[2008-06-19 07:53 UTC] yar_helg at mail dot ru
Description:
------------
When operating on binary data with substr() while mb_internal_encoding is set to UTF-8 (and no function overloading is enabled), substr() behaves incorrectly: the call returns the wrong number of bytes.

P.S. emptyfile.xls used in the example is an empty MS Excel 2003 file. It can be downloaded at http://an-best.ru/empty_file.xls (13.5 Kbytes).

P.P.S. The IDENTIFIER_OLE constant is taken from the Spreadsheet_excel_reader class.

Reproduce code:
---------------
<?php
echo "function overload = ".ini_get('mbstring.func_overload')."<br />\n";

// Uncomment this for demonstration of wrong behaviour
//mb_internal_encoding('UTF-8');
echo "MB_INTERNAL_ENCODING =".mb_internal_encoding()."<br />\n";

define('IDENTIFIER_OLE', pack("CCCCCCCC",0xd0,0xcf,0x11,0xe0,0xa1,0xb1,0x1a,0xe1));

$data = file_get_contents($_SERVER['DOCUMENT_ROOT'].'/substr_bug/emptyfile.xls');

echo "Data length = ".strlen($data)."<br />\n";
echo "First 8 symbols ==>".var_export(substr($data,0,8),1)."<== <br />\n";
echo "Compare result (substr(\$data,0,8)==IDENTIFIER_OLE) - ".var_export(substr($data,0,8)==IDENTIFIER_OLE,1)."<br />\n";
echo "Substring length (substr(\$data,0,8)) - ".strlen(substr($data,0,8))."<br />\n";
?>

Expected result:
----------------
function overload = 0
MB_INTERNAL_ENCODING =ISO-8859-1
Data length = 13824
First 8 symbols ==>'поЮ║╠А'<==
Compare result (substr($data,0,8)==IDENTIFIER_OLE) - true
Substring length (substr($data,0,8)) - 8

Actual result:
--------------
// This result can be seen if mb_internal_encoding is set to UTF-8
function overload = 0
MB_INTERNAL_ENCODING =UTF-8
Data length = 13824
First 8 symbols ==>'поЮ║╠А' . "\0" . '' . "\0" . '' . "\0" . '' . "\0" . '' . "\0" . ''<==
Compare result (substr($data,0,8)==IDENTIFIER_OLE) - false
Substring length (substr($data,0,8)) - 13
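The report depends on an external .xls file. As a minimal sketch, the same check can be run on an in-memory string, together with one defensive way to slice by bytes whatever the mbstring settings are. The helper name byte_substr() is hypothetical and not part of the report; the sketch assumes PHP 7 or later, and the zero padding only stands in for the real file contents. It does not explain the reported behaviour, it only gives a byte-exact baseline to compare against.

```php
<?php
// Hypothetical helper (not from the original report): slice strictly by bytes.
function byte_substr(string $data, int $start, int $length): string
{
    // The '8bit' encoding makes mbstring treat each byte as one character,
    // so the result does not depend on mb_internal_encoding().
    if (function_exists('mb_substr')) {
        return mb_substr($data, $start, $length, '8bit');
    }
    // Without mbstring there is nothing that could overload substr().
    return substr($data, $start, $length);
}

// Same 8-byte OLE signature as IDENTIFIER_OLE in the report.
$signature = pack('C8', 0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1);

// Assumption: stand-in for the .xls contents (signature plus zero padding).
$data = $signature . str_repeat("\0", 16);

// Reproduce the setting from the report when mbstring is available.
if (function_exists('mb_internal_encoding')) {
    mb_internal_encoding('UTF-8');
}

$head = byte_substr($data, 0, 8);

echo bin2hex($head), "\n";        // d0cf11e0a1b11ae1
echo strlen($head), "\n";         // 8
var_dump($head === $signature);   // bool(true)
```

If plain substr($data, 0, 8) gives a different result than byte_substr($data, 0, 8) on the same installation, that points at substr() being overloaded or patched in that build.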
Same problem here. With mb_internal_encoding('UTF-8') set, substr($string, 0, 2) randomly returns either 2 bytes or 4. The string is in UTF-8.

Linux 2.6.24-gentoo-r4 #2 SMP
PHP 5.2.6-pl2-gentoo
mbstring.func_overload = 0
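For reference, the 2-byte versus 4-byte difference described in this comment matches the difference between byte-based and character-based slicing of Cyrillic text. The sketch below is only an illustration under the assumption that mbstring is loaded and PHP is 7 or later; the sample string is made up for the example and written with byte escapes so that the source file encoding does not matter.

```php
<?php
// "при" in UTF-8: each Cyrillic letter takes two bytes.
$string = "\xd0\xbf\xd1\x80\xd0\xb8";

// Byte-based: documented substr() behaviour, 2 bytes = 1 letter.
$bytes = substr($string, 0, 2);

// Character-based: mb_substr() with an explicit encoding, 2 letters = 4 bytes.
$chars = mb_substr($string, 0, 2, 'UTF-8');

echo strlen($bytes), ' ', bin2hex($bytes), "\n";   // 2 d0bf
echo strlen($chars), ' ', bin2hex($chars), "\n";   // 4 d0bfd180
```

With mbstring.func_overload = 0, substr() is expected to give the first result every time; getting the second one from substr() would mean the call is being handled with multibyte semantics.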