Bug #37661 str_split not working in utf8 environment
Submitted: 2006-06-01 08:19 UTC Modified: 2006-06-01 08:45 UTC
Votes:4
Avg. Score:4.8 ± 0.4
Reproduced:4 of 4 (100.0%)
Same Version:2 (50.0%)
Same OS:1 (25.0%)
From: frank at cleverbridge dot com Assigned:
Status: Wont fix Package: mbstring related
PHP Version: 5.1.4 OS: Linux 2.6.12-1.1381_FC3 #1
Private report: No CVE-ID: None
 [2006-06-01 08:19 UTC] frank at cleverbridge dot com
Description:
------------
php.ini:

...
mbstring.func_overload=6;
mbstring.internal_encoding=UTF-8;
mbstring.http_input = auto;
mbstring.detect_order = ISO-8859-1,UTF-8;
mbstring.encoding_translation = On;
...

The function str_split() does not work correctly with characters longer than 1 byte.

In my test script the Japanese character is submitted to PHP by a web form. To keep the reproduce code as simple as possible, I just copied the character into the code.


Reproduce code:
---------------
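<?php
// Source file saved as UTF-8; the character is 3 bytes long.
$foo = '入';

// strlen() is overloaded by mbstring.func_overload, so it counts characters.
print "StrLength: ".strlen($foo)."\n";

// str_split() is not overloaded and splits byte by byte.
$I = str_split($foo);

print "array size: ".count($I)."\n";
print_r($I);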

Expected result:
----------------
StrLength: 1
array size: 1

Array
(
    [0] => 入
)


Actual result:
--------------
StrLength: 1
array size: 3

Array
(
    [0] => �
    [1] => �
    [2] => �
)


History

 [2006-06-01 08:45 UTC] tony2001@php.net
Well, that's easy:
mbstring doesn't overload str_split() and never did.
So you have to wait for PHP 6 to get proper Unicode support.
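
Since str_split() is not overloaded, one possible workaround is to split on code points with PCRE instead. The following is only a sketch: utf8_str_split() is a hypothetical helper name, not a PHP function, and it assumes the input is valid UTF-8 and that PCRE was built with UTF-8 support.

```php
<?php
/**
 * Hypothetical helper (not part of PHP): split a UTF-8 string into
 * an array of characters (code points).
 */
function utf8_str_split($string)
{
    // With the /u modifier the subject is treated as UTF-8, so the empty
    // pattern matches between code points rather than between bytes.
    $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false if the subject is not valid UTF-8.
    return $chars === false ? array() : $chars;
}

// The character from the report, written as explicit UTF-8 bytes.
$foo = "\xE5\x85\xA5";
$chars = utf8_str_split($foo);

echo "array size: " . count($chars) . "\n";
```

This splits on code points, not on user-perceived characters, so a base letter followed by a combining mark still comes back as two elements. PHP 7.4 and later also provide mb_str_split() in the mbstring extension, which covers the same case.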
 