Bug #31324 function strlen miscounts
Submitted: 2004-12-28 18:36 UTC Modified: 2004-12-30 15:32 UTC
From: phpbug at tore dot cc Assigned:
Status: Not a bug Package: Strings related
PHP Version: 5.0.2 OS: Solaris 10
Private report: No CVE-ID: None
 [2004-12-28 18:36 UTC] phpbug at tore dot cc
Description:
------------
I have discovered that a large number of PHP scripts use 'strlen' to count the length of arbitrary strings.
strlen cannot cope with arbitrary strings!

Code like this will only show garbage values:
$a = fread($binaryfile,$count);
print strlen($a) . " bytes were read from the binary file";


Reproduce code:
---------------
<?php
$s='';
print "<html>\n";
for($i=142;$i<256;$i++){
        $s.=pack("C",$i);
        $result = count_chars($s, 0);
        $l=0;
        for ($j=0; $j < 256; $j++) {
                $l+=$result[$j];
        }
        print "<br>The variable s has the length: " .
                strlen($s) . " (actual length) " . $l . "</br>\n";
}
print "</html>\n";
?>

Expected result:
----------------
The variable s has the length: 1 (actual length) 1
The variable s has the length: 1 (actual length) 2
The variable s has the length: 2 (actual length) 3
The variable s has the length: 3 (actual length) 4
The variable s has the length: 4 (actual length) 5
The variable s has the length: 5 (actual length) 6
The variable s has the length: 6 (actual length) 7
The variable s has the length: 7 (actual length) 8
The variable s has the length: 8 (actual length) 9
The variable s has the length: 9 (actual length) 10
The variable s has the length: 10 (actual length) 11
The variable s has the length: 11 (actual length) 12
.
.
.
The variable s has the length: 64 (actual length) 111
The variable s has the length: 65 (actual length) 112
The variable s has the length: 65 (actual length) 113
The variable s has the length: 66 (actual length) 114

Actual result:
--------------
The variable s has the length: 1 (actual length) 1
The variable s has the length: 2 (actual length) 2
The variable s has the length: 3 (actual length) 3
The variable s has the length: 4 (actual length) 4
The variable s has the length: 5 (actual length) 5
The variable s has the length: 6 (actual length) 6
The variable s has the length: 7 (actual length) 7
The variable s has the length: 8 (actual length) 8
The variable s has the length: 9 (actual length) 9
The variable s has the length: 10 (actual length) 10
The variable s has the length: 11 (actual length) 11
The variable s has the length: 12 (actual length) 12
.
.
.
The variable s has the length: 111 (actual length) 111
The variable s has the length: 112 (actual length) 112
The variable s has the length: 113 (actual length) 113
The variable s has the length: 114 (actual length) 114

 [2004-12-28 18:38 UTC] phpbug at tore dot cc
My configuration:
./configure  --with-apxs=/opt/csw/apache/bin/apxs --with-mod_charset --with-openssl=/opt/csw --enable-ftp --with-imap-ssl --with-jpeg=/usr/csw/lib --with-mysql=/opt/sfw/mysql --with-pgsql --enable-mbstr-enc-trans --enable-mbstring --enable-mbregex --with-tiff=/usr/sfc/lib --enable-inline-optimization --with-imap=/mnt/slask/src/imap-2002 --with-java=/usr/j2se --with-ming=/usr/local --with-orc8=/opt/oracle
--with-gettext --with-xml --with-dom --with-zlib --enable-cli --enable-zend-multibyte --with-mod_charset --enable-gd-jis-conv --disable-libxml
 [2004-12-28 19:19 UTC] derick@php.net
I don't understand what your script is doing, please come up with a trivial script instead of those loops.

 [2004-12-28 21:57 UTC] phpbug at tore dot cc
How about:
<?php
$s=pack("C",143);
$result = count_chars($s, 0);
print "The variable s has the length: " .
           strlen($s) . " (actual length) " . $result[143] .  
           "\n";
?>
Gives printout:
The variable s has the length: 0 (actual length) 1

The 'pack' call creates a binary string with a length of 1 char.
'strlen' should recognize it as a string of length 1. As you can see below, char 143 is not the only one for which 'strlen' miscalculates the string length.
 [2004-12-28 23:07 UTC] derick@php.net
Please try using this CVS snapshot:

  http://snaps.php.net/php5-STABLE-latest.tar.gz
 
For Windows:
 
  http://snaps.php.net/win32/php5.0-win32-latest.zip

In 4.3.10RC2, 5.0.3-dev, and 5.1.0-dev alike it prints as expected:
The variable s has the length: 1 (actual length) 1
So please try a snapshot.
 [2004-12-28 23:40 UTC] phpbug at tore dot cc
Sorry, my mistake!
It seems to work on one char long strings.
I made a better script (still avoiding loops...).

<?php
$s=pack("C*",0x8e,0x8f);
$t=pack("C*",0x8e);
$u=pack("C*",0x8e,0x8f,0x90);
print "The variable s has the length: " . strlen($s) . "\n";
print "The variable t has the length: " . strlen($t) . "\n";
print "The variable u has the length: " . strlen($u) . "\n";
?>


Result:
The variable s has the length: 1 
The variable t has the length: 1 
The variable u has the length: 2

The result should be:
The variable s has the length: 2 
The variable t has the length: 1 
The variable u has the length: 3

Since s is 2 chars, t is 1 char and u is 3 chars long.

Let's try another test with 7-bit chars.

<?php
$s=pack("C*",0x30,0x31);
$t=pack("C*",0x30);
$u=pack("C*",0x30,0x31,0x32);
print "The variable s has the length: " . strlen($s) . "\n";
print "The variable t has the length: " . strlen($t) . "\n";
print "The variable u has the length: " . strlen($u) . "\n";
?>

Result:
The variable s has the length: 2 
The variable t has the length: 1 
The variable u has the length: 3

As expected. No problem with low values on the chars.
 [2004-12-28 23:59 UTC] derick@php.net
It still works fine for me on Linux. Which compiler did you use to compile PHP (gcc or sun workshop), and what kind of platform do you use? intel (32/64), sparc...
 [2004-12-29 07:34 UTC] phpbug at tore dot cc
I have used gcc2 and gcc3 to compile.
The machine is a sparc.
The problem occurs on solaris 9 and solaris 10.
If you suspect that this is a Solaris problem, can you extract the code for 'strlen' in PHP? I could run some tests in C.
 [2004-12-29 19:13 UTC] phpbug at tore dot cc
I might add that I tried php-5.0.3 on an intel-based solaris 10 machine and there were no problems with the strlen.
 [2004-12-29 21:42 UTC] iliaa@php.net
Sounds like either a compiler or libc bug. I cannot replicate this on my systems either.
 [2004-12-29 23:22 UTC] phpbug at tore dot cc
Are 64bit processors not supported by php?
Which machine architecture is 'my system'?

The php is compiled with gcc2 AND gcc3. With solaris 9 AND solaris 10. The same trouble occurs.
So I need to have the code for 'strlen' in php extracted before I can figure out where the source for the trouble is.

A simple C program using string.h's strlen causes no trouble at all.
 [2004-12-30 10:55 UTC] derick@php.net
It's impossible to extract that code from PHP, and as we don't have access to a solaris machine we can not really debug this.
 [2004-12-30 14:33 UTC] phpbug at tore dot cc
I have tracked down the trouble.
When multibyte string support is compiled into PHP, the internal function 'strlen' is aliased to the function 'mb_strlen'.
mb_strlen counts the number of characters in the binary string, depending on which default encoding is used.

I have set the default charset to UTF-8 and SJIS (Japanese) on my sparc machine. That is why I only see it there.

Please remove the aliasing between strlen and mb_strlen!
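
To illustrate why a character-counting function can report fewer units than there are bytes, here is a minimal sketch in PHP. It assumes a simplified Shift_JIS-style rule (a byte in 0x81-0x9F or 0xE0-0xFC starts a two-byte character) and a hypothetical function name, sjis_like_length. It is an illustrative model only, not mbstring's actual implementation, and it assumes strlen() itself is not overloaded where it runs.

```php
<?php
/**
 * Count characters under a simplified Shift_JIS-style rule.
 * Illustrative model only (an assumption, not mbstring's code):
 * a byte in 0x81-0x9F or 0xE0-0xFC starts a two-byte character,
 * every other byte is a character by itself.
 */
function sjis_like_length(string $bytes): int
{
    $count = 0;
    $n = strlen($bytes); // byte count; assumes strlen() is not overloaded here
    for ($i = 0; $i < $n; $count++) {
        $b = ord($bytes[$i]);
        $isLead = ($b >= 0x81 && $b <= 0x9F) || ($b >= 0xE0 && $b <= 0xFC);
        $i += $isLead ? 2 : 1; // a lead byte also consumes the byte after it
    }
    return $count;
}

// The byte sequences used in the test scripts earlier in this report.
foreach ([[0x8e, 0x8f], [0x8e], [0x8e, 0x8f, 0x90], [0x30, 0x31, 0x32]] as $codes) {
    $s = pack("C*", ...$codes);
    printf("%d bytes, %d characters\n", strlen($s), sjis_like_length($s));
}
```

Under this model the three high-byte strings count as 1, 1 and 2 characters, which matches the lengths reported for s, t and u, while the 7-bit string keeps one character per byte.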
 [2004-12-30 14:44 UTC] rasmus@php.net
Did you read the documentation?

; overload(replace) single byte functions by mbstring functions.
; mail(), ereg(), etc are overloaded by mb_send_mail(), mb_ereg(),
; etc. Possible values are 0,1,2,4 or combination of them.
; For example, 7 for overload everything.
; 0: No overload
; 1: Overload mail() function
; 2: Overload str*() functions
; 4: Overload ereg*() functions
;mbstring.func_overload = 0
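
The setting is a bitmask, so a value such as 7 or 6 also switches on the str*() overload. A minimal sketch of decoding it follows; the constant names and the helper overloads_string_functions are hypothetical, chosen for illustration. Note that the directive was removed in PHP 8.0, where ini_get() returns false for it.

```php
<?php
// Bit values as listed in the php.ini comment quoted above (names are hypothetical).
const OVERLOAD_MAIL = 1;
const OVERLOAD_STRING = 2;
const OVERLOAD_REGEX = 4;

/** Hypothetical helper: does this func_overload value replace the str*() functions? */
function overloads_string_functions(int $funcOverload): bool
{
    return ($funcOverload & OVERLOAD_STRING) === OVERLOAD_STRING;
}

// ini_get() returns false when the directive does not exist; (int) false is 0.
$current = (int) ini_get('mbstring.func_overload');
echo overloads_string_functions($current)
    ? "strlen() is overloaded by mb_strlen()\n"
    : "strlen() counts bytes\n";
```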
 [2004-12-30 14:53 UTC] phpbug at tore dot cc
I don't count 'php.ini' as documentation. Anyway, thanks for your time.
 [2004-12-30 15:32 UTC] rasmus@php.net
There is no bug here, hence the bogus status.

Here is the reference in the docs linked from php.net/mbstring:

http://php.net/manual/en/ref.mbstring.php#mbstring.overload

Does that count as documentation?
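
For scripts that must count bytes even where the str*() overload is enabled, one common workaround is to ask mb_strlen() for the '8bit' encoding, which treats every byte as one character. The sketch below uses a hypothetical helper name, byte_length, and falls back to strlen() when the mbstring extension is not loaded.

```php
<?php
/**
 * Hypothetical helper: length in bytes regardless of mbstring.func_overload.
 * With the '8bit' encoding, mb_strlen() treats every byte as one character.
 */
function byte_length(string $data): int
{
    return function_exists('mb_strlen')
        ? mb_strlen($data, '8bit')
        : strlen($data); // no mbstring loaded, so strlen() cannot be overloaded
}

$s = pack("C*", 0x8e, 0x8f, 0x90);
print "The variable s has the length: " . byte_length($s) . "\n";
```

For the three-byte string above this reports a length of 3, independent of the configured internal encoding.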
 