Bug #45311 substr works incorrectly with binary data
Submitted: 2008-06-19 07:53 UTC Modified: 2008-09-09 13:29 UTC
Votes:3
Avg. Score:4.3 ± 0.9
Reproduced:2 of 2 (100.0%)
Same Version:1 (50.0%)
Same OS:1 (50.0%)
From: yar_helg at mail dot ru Assigned:
Status: Closed Package: mbstring related
PHP Version: 5.2.6 OS: *
Private report: No CVE-ID: None
 [2008-06-19 07:53 UTC] yar_helg at mail dot ru
Description:
------------
When substr is used on binary data while mb_internal_encoding is set to UTF-8 (and no function overloading is enabled), substr misbehaves: the call returns the wrong number of bytes.

P.S. The emptyfile.xls used in the example is an empty MS Excel 2003 file. It can be downloaded from http://an-best.ru/empty_file.xls (13.5 Kbytes)

P.P.S. The IDENTIFIER_OLE constant is taken from the Spreadsheet_excel_reader class.

Reproduce code:
---------------
<?php

echo "function overload = ".ini_get('mbstring.func_overload')."<br />\n";

// Uncomment this for demonstration of wrong behaviour
//mb_internal_encoding('UTF-8');


echo "MB_INTERNAL_ENCODING =".mb_internal_encoding()."<br />\n";

define('IDENTIFIER_OLE',
pack("CCCCCCCC",0xd0,0xcf,0x11,0xe0,0xa1,0xb1,0x1a,0xe1));

$data = file_get_contents($_SERVER['DOCUMENT_ROOT'].'/substr_bug/emptyfile.xls');

echo "Data length = ".strlen($data)."<br />\n";
echo "First 8 symbols  ==>".var_export(substr($data,0,8),1)."<== <br />\n";
echo "Compare result (substr(\$data,0,8)==IDENTIFIER_OLE) - ".var_export(substr($data,0,8)==IDENTIFIER_OLE,1)."<br />\n";
echo "Substring length (substr(\$data,0,8)) - ".strlen(substr($data,0,8))."<br />\n";

?>

Expected result:
----------------
function overload = 0
MB_INTERNAL_ENCODING =ISO-8859-1
Data length = 13824
First 8 symbols ==>'поЮ║╠А'<==
Compare result (substr($data,0,8)==IDENTIFIER_OLE) - true
Substring length (substr($data,0,8)) - 8


Actual result:
--------------
// This result can be seen if mb_internal_encoding is set to UTF-8

function overload = 0
MB_INTERNAL_ENCODING =UTF-8
Data length = 13824
First 8 symbols ==>'поЮ║╠А' . "\0" . '' . "\0" . '' . "\0" . '' . "\0" . '' . "\0" . ''<==
Compare result (substr($data,0,8)==IDENTIFIER_OLE) - false
Substring length (substr($data,0,8)) - 13
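
The same check can be sketched without the external file. This is a minimal reproduction attempt, not code from the original report: the helper name first_bytes and the NUL padding that stands in for the rest of the .xls file are assumptions made for illustration. On a build that does not have the problem, it should report a length of 8 and a matching identifier.

```php
<?php
// Hypothetical helper (not part of the original report): returns the first
// $n bytes of $data together with their byte length.
function first_bytes($data, $n)
{
    $head = substr($data, 0, $n);
    return array($head, strlen($head));
}

// The OLE identifier from the report, followed by NUL padding that stands in
// for the rest of the file (an assumption; the real file is not needed here).
$identifier = pack("CCCCCCCC", 0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1);
$data = $identifier . str_repeat("\0", 504);

// Same setting as in the report; skipped if mbstring is not available.
if (function_exists('mb_internal_encoding')) {
    mb_internal_encoding('UTF-8');
}

list($head, $length) = first_bytes($data, 8);
echo "Substring length: " . $length . "\n";
echo "Matches identifier: " . var_export($head === $identifier, true) . "\n";
echo "Hex: " . bin2hex($head) . "\n";
```

Printing the bytes with bin2hex avoids the terminal or browser reinterpreting the binary data in some character set, which is what makes the var_export output above hard to read.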


History

 [2008-07-31 17:14 UTC] bugs dot php dot net at jeka dot ru
Exactly the same problem here.
With mb_internal_encoding('UTF-8') set,

substr($string, 0, 2) randomly returns sometimes 2 bytes and sometimes 4.
The string is in UTF-8.


Linux 2.6.24-gentoo-r4 #2 SMP
PHP 5.2.6-pl2-gentoo 
mbstring.func_overload = 0
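
For reference, a 2-byte versus 4-byte result on a UTF-8 string is the difference between counting bytes and counting characters. The sketch below only illustrates that difference; the thread does not confirm that this is what happened in the reported case, and the sample string is an assumption.

```php
<?php
// "привет" written as explicit UTF-8 bytes so the example does not depend on
// the encoding of this source file. Each Cyrillic letter takes 2 bytes.
$string = "\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82";

// substr() counts bytes: 2 bytes is a single Cyrillic letter.
$byteSlice = substr($string, 0, 2);
echo "substr:    " . strlen($byteSlice) . " bytes\n";

// mb_substr() counts characters: 2 characters occupy 4 bytes in UTF-8.
if (function_exists('mb_substr')) {
    $charSlice = mb_substr($string, 0, 2, 'UTF-8');
    echo "mb_substr: " . strlen($charSlice) . " bytes\n";
}
```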
 [2008-08-01 22:25 UTC] moriyoshi@php.net
Please try using this CVS snapshot:

  http://snaps.php.net/php5.2-latest.tar.gz
 
For Windows (zip):
 
  http://snaps.php.net/win32/php5.2-win32-latest.zip

For Windows (installer):

  http://snaps.php.net/win32/php5.2-win32-installer-latest.msi

I could not reproduce the problem. Do you have any php_value or php_admin_value directives in your httpd.conf or in .htaccess?
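
One way to see what the running configuration actually resolves to is to print the relevant mbstring settings from a script served by the same virtual host. This is a rough sketch, not something posted in the thread; the helper name mbstring_settings is hypothetical, and the choice of directives to print is an assumption about what matters here.

```php
<?php
// Hypothetical helper (not from the original thread): collects the mbstring
// settings that the question above is asking about.
function mbstring_settings()
{
    $settings = array(
        'mbstring loaded' => extension_loaded('mbstring'),
        // ini_get() returns false when a directive does not exist in this build.
        'mbstring.func_overload' => ini_get('mbstring.func_overload'),
        'mbstring.internal_encoding' => ini_get('mbstring.internal_encoding'),
    );
    if (function_exists('mb_internal_encoding')) {
        // Effective value at runtime, after any php_value overrides.
        $settings['mb_internal_encoding()'] = mb_internal_encoding();
    }
    return $settings;
}

foreach (mbstring_settings() as $name => $value) {
    echo $name . " = " . var_export($value, true) . "\n";
}
```

Running it both from the web server and from the command line can show whether httpd.conf or .htaccess changes any of these values.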


 [2008-08-09 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
 [2008-09-09 13:29 UTC] yar_helg at mail dot ru
This bug is not reproducible in the 5.2.7-dev version.
Thank you.
 