  [2008-02-07 18:31 UTC] sergey89 at gmail dot com
 Description:
------------
substr() doesn't work correctly with binary strings on FreeBSD 6.3, PHP 5.2.5. I have a binary file. When I try to cut off part of the data, I get an incorrect result.
-----
mbstring.func_overload = 0
Reproduce code:
---------------
<?php
$data = file_get_contents('data');
print md5($data) . ' | ';
print md5(substr($data, 0, -88));
?>
Expected result:
----------------
45e26dc33aad8e93f3f45c8d5100feb0 | 03d900cc2ba7276fb3bb3f1939303e3b
Actual result:
--------------
45e26dc33aad8e93f3f45c8d5100feb0 | d1cea9d93cb48b2d897595f5e96ba352
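
One way to narrow this down is to compare substr() against a slice built from plain string offsets, which always address single bytes and are not affected by mbstring function overloading. The following is only a sketch: slice_by_offsets() is a hypothetical helper, and the in-memory $data is a stand-in for the 'data' file, which is not available here. It also assumes the problem is related to mbstring overloading, which the report does not confirm.

```php
<?php
// Hypothetical helper: build a slice from byte offsets. String offsets
// always address single bytes, whatever mbstring is configured to do.
function slice_by_offsets($data, $count)
{
    $out = '';
    for ($i = 0; $i < $count; $i++) {
        $out .= $data[$i];
    }
    return $out;
}

// Stand-in for the 1024-byte 'data' file: every byte value, four times over.
$data = '';
for ($i = 0; $i < 1024; $i++) {
    $data .= chr($i % 256);
}

// Count bytes via unpack() so the count does not rely on strlen().
$keep = count(unpack('C*', $data)) - 88;

$viaSubstr  = substr($data, 0, -88);
$viaOffsets = slice_by_offsets($data, $keep);

print 'func_overload = ' . var_export(ini_get('mbstring.func_overload'), true) . "\n";
print 'substr matches byte slice: ' . var_export($viaSubstr === $viaOffsets, true) . "\n";
```

If the last line prints false under the web server but true under php-cli, the two environments are most likely running with different mbstring settings.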
I generated the data with a simple PHP script:

<?php
file_put_contents('data', '');
for ($i = 0; $i < 1024; $i++) {
    file_put_contents('data', chr(rand(0, 255)), FILE_APPEND);
}

However, in php-cli substr works correctly.

I see something similar on PHP 5.2.6 on Linux. Here is a test script:
===========================
<?php
echo "OS is ".PHP_OS."<br />\n";
echo "PHP is ".phpversion()."<br />\n";
echo "function overload = ".ini_get('mbstring.func_overload')."<br />\n";
//mb_internal_encoding('UTF-8');
echo "MB_INTERNAL_ENCODING =".mb_internal_encoding()."<br />\n";
define('IDENTIFIER_OLE', pack("CCCCCCCC",0xd0,0xcf,0x11,0xe0,0xa1,0xb1,0x1a,0xe1));
$data = file_get_contents($_SERVER['DOCUMENT_ROOT'].'/substr_bug/empty file.xls');
echo "Data length = ".strlen($data)."<br />\n";
echo "First 8 symbols ==>".var_export(substr($data,0,8),1)."<== <br />\n";
echo "Compare result (substr(\$data,0,8)==IDENTIFIER_OLE) - ".var_export(substr($data,0,8)==IDENTIFIER_OLE,1)."<br />\n";
echo "Substring length (substr(\$data,0,8)) - ".strlen(substr($data,0,8))."<br />\n";
?>

Output:
=======
OS is Linux
PHP is 5.2.6
function overload = 0
MB_INTERNAL_ENCODING =ISO-8859-1
Data length = 13824
First 8 symbols ==>'поЮ║╠А'<==
Compare result (substr($data,0,8)==IDENTIFIER_OLE) - true
Substring length (substr($data,0,8)) - 8

But if you uncomment the line with mb_internal_encoding('UTF-8'); the output changes as follows (look at the file size, the result of substr, and the length of the substr result).

Output with mb_internal_encoding=UTF-8:
=======================================
OS is Linux
PHP is 5.2.6
function overload = 0
MB_INTERNAL_ENCODING =UTF-8
Data length = 13824
First 8 symbols ==>'поЮ║╠А' . "\0" . '' . "\0" . '' . "\0" . '' . "\0" . '' . "\0" . ''<==
Compare result (substr($data,0,8)==IDENTIFIER_OLE) - false
Substring length (substr($data,0,8)) - 13

mbstring.func_overload is set to 0 in the .htaccess file in the current directory. "empty file.xls" is an empty MS Excel 2003 file. It can be downloaded from http://an-best.ru/empty_file.xls
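
If substr() really is being treated as a multibyte function here (an assumption, since ini_get() reports 0), a possible workaround is to request byte semantics explicitly. The sketch below uses a hypothetical helper, byte_substr(), that calls mb_substr() with the '8bit' encoding when mbstring is loaded and falls back to substr() otherwise. The $data string is a stand-in for the .xls contents, not the real file.

```php
<?php
// Hypothetical helper: return $length bytes starting at byte $start.
function byte_substr($data, $start, $length)
{
    if (function_exists('mb_substr')) {
        // With the '8bit' encoding, mb_substr() counts bytes, not characters.
        return mb_substr($data, $start, $length, '8bit');
    }
    return substr($data, $start, $length);
}

// Same 8-byte OLE signature as IDENTIFIER_OLE in the test script.
$signature = pack('CCCCCCCC', 0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1);

// Stand-in for the file contents: the signature followed by padding bytes.
$data = $signature . str_repeat("\0", 16);

$head = byte_substr($data, 0, 8);
print 'Signature matches: ' . var_export($head === $signature, true) . "\n";
print 'Hex: ' . bin2hex($head) . "\n";
```

Printing the slice with bin2hex() rather than var_export() also makes the comparison readable regardless of the page encoding.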