Bug #69776 mb_strlen() produces different results from previous PHP versions
Submitted: 2015-06-08 19:47 UTC
Modified: 2015-06-21 04:31 UTC
From: martijn at vanderlee dot com
Assigned:
Status: Not a bug
Package: mbstring related
PHP Version: 5.6.9
OS: Any
Private report: No
CVE-ID: None
 [2015-06-08 19:47 UTC] martijn at vanderlee dot com
Description:
------------
mb_strlen() on PHP 5.6 sometimes returns different values than the exact same call on previous versions of PHP (tested with 5.3, 5.4 and 5.5).

For the given test script, PHP 5.6 returns "17" whereas all other versions return "31".

It seems 5.6 is correct. However, the mb_ereg_search_* functions still return match lengths as in PHP 5.5 and earlier, i.e. they total 31 for the entire string.

Regardless of which version is right, all multibyte functions should at least produce consistent results within the same version of PHP.

Test script:
---------------
echo mb_strlen('από το Άξιον Εστί');


 [2015-06-08 20:14 UTC] requinix@php.net
-Status: Open +Status: Feedback
 [2015-06-08 20:14 UTC] requinix@php.net
Thank you for this bug report. To properly diagnose the problem, we
need a short but complete example script to be able to reproduce
this bug ourselves. 

A proper reproducing script starts with <?php and ends with ?>,
is max. 10-20 lines long and does not require any external 
resources such as databases, etc. If the script requires a 
database to demonstrate the issue, please make sure it creates 
all necessary tables, stored procedures etc.

Please avoid embedding huge scripts into the report.

The default character encoding changed with PHP 5.6. You were/are relying on its default being (apparently) UTF-8 when that was not the case, thus the incorrect value of 31. Now the default is UTF-8 and you're getting 17.
https://wiki.php.net/rfc/default_encoding
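
To illustrate the point about encodings, here is a minimal sketch that passes the encoding to mb_strlen() explicitly instead of relying on the default. It assumes the script file itself is saved as UTF-8; the variable names are only illustrative. Interpreted as UTF-8 the string has 17 characters, while a single-byte encoding such as ISO-8859-1 counts each of its 31 bytes as one character.

```php
<?php
// Assumption: this file is saved as UTF-8, so the literal below is UTF-8 bytes.
$text = 'από το Άξιον Εστί';

// Interpret the bytes as UTF-8: each Greek letter is one character.
$asUtf8 = mb_strlen($text, 'UTF-8');

// Interpret the same bytes as a single-byte encoding: every byte counts.
$asLatin1 = mb_strlen($text, 'ISO-8859-1');

// Raw byte count, independent of any mbstring setting.
$bytes = strlen($text);

echo $asUtf8, "\n";   // 17
echo $asLatin1, "\n"; // 31
echo $bytes, "\n";    // 31
```

Passing the encoding on each call keeps the result independent of the PHP version and of any ini settings.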

As to the bug you're reporting, when I use mb_ereg_search_* I get 17 characters.
http://3v4l.org/JWP3O
 [2015-06-21 04:22 UTC] php-bugs at lists dot php dot net
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Re-Opened". Thank you.
 [2015-06-21 04:31 UTC] yohgaki@php.net
-Status: No Feedback +Status: Not a bug
 [2015-06-21 04:31 UTC] yohgaki@php.net
You get the wrong result because the correct character encoding has not been set. Set the correct character encoding and you will get consistent results.
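
As a rough sketch of what setting the encoding could look like, the snippet below sets both the mbstring internal encoding and the regex encoding to UTF-8 before calling any multibyte function. It assumes the script is saved as UTF-8 and counts matches of the pattern `.` as a stand-in for the reporter's mb_ereg_search_* usage, which is not shown in the report.

```php
<?php
// Assumption: this file is saved as UTF-8.
// Set the encodings explicitly rather than relying on version-specific defaults.
mb_internal_encoding('UTF-8');
mb_regex_encoding('UTF-8');

$text = 'από το Άξιον Εστί';

// Uses the internal encoding set above.
$length = mb_strlen($text);

// Walk the string one character at a time with the mb_ereg_search_* API.
mb_ereg_search_init($text, '.');
$matches = 0;
while (mb_ereg_search()) {
    $matches++;
}

echo $length, "\n";
echo $matches, "\n"; // expected to equal $length once both encodings agree
```

With both encodings set to the same value, the character count and the number of single-character matches should agree.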
 