Doc Bug #78190 mb_split now return false with ISO strings
Submitted: 2019-06-20 15:09 UTC Modified: 2020-08-16 09:09 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:1 (100.0%)
From: mvachette at adequasys dot com Assigned:
Status: Re-Opened Package: mbstring related
PHP Version: 7.2.19 OS: Windows 10
Private report: No CVE-ID: None
 [2019-06-20 15:09 UTC] mvachette at adequasys dot com
Description:
------------
It seems that something changed in 7.2.18 when using mb_split on an ISO-encoded string.

On at least 7.2.16 and 7.2.17, calling this function on an ISO-encoded string containing accented characters works fine.

In a new environment running 7.2.18, the same code now returns false (new, undocumented behaviour).



Test script:
---------------
var_dump(mb_split('-', 'e-a'));
//works fine on any version

var_dump(mb_split('-', 'é-a'));
// works if the string is UTF-8 encoded
// returns false with an ISO-encoded string on PHP 7.2.18


 [2019-06-20 15:15 UTC] sjon@php.net
-Status: Open +Status: Feedback -Assigned To: +Assigned To: sjon
 [2019-06-20 15:15 UTC] sjon@php.net
I cannot reproduce this, it seems to work pretty consistently: https://3v4l.org/QqYjC
 [2019-06-20 15:21 UTC] nikic@php.net
-Status: Feedback +Status: Not a bug
 [2019-06-20 15:21 UTC] nikic@php.net
The default encoding is UTF-8, use mb_regex_encoding() if you are working with something else.
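As a minimal sketch of that suggestion: the snippet below assumes the reporter's "ISO" means ISO-8859-1 and writes the accented character as an explicit byte, so the result does not depend on how the script file itself is saved. The variable names are only illustrative.

```php
<?php
// "\xE9" is the single byte for "é" in ISO-8859-1 (an assumption about the
// reporter's data); on its own it is not a valid UTF-8 sequence.
$iso = "\xE9-a";

$previous = mb_regex_encoding();   // remember the current regex encoding
mb_regex_encoding('ISO-8859-1');   // declare how the subject bytes are encoded
$parts = mb_split('-', $iso);      // two elements: "\xE9" and "a"
mb_regex_encoding($previous);      // restore the previous setting

var_dump($parts);
```

Restoring the previous value keeps the change local, which matters when the call sits inside a library. Converting the input to UTF-8 first with mb_convert_encoding($iso, 'UTF-8', 'ISO-8859-1') would be an alternative if the rest of the application works in UTF-8.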
 [2019-06-20 15:30 UTC] mvachette at adequasys dot com
OK, I managed to clear the error using mb_regex_encoding, thanks for the suggestion.

But is there any documentation available about this change of behaviour? For information, I found this buried in the Smarty library. I can ensure correct behaviour of mb_split in my app, but I may not be the only one affected.
 [2019-06-20 15:47 UTC] nikic@php.net
-Status: Not a bug +Status: Re-Opened -Type: Bug +Type: Documentation Problem
 [2019-06-20 15:47 UTC] nikic@php.net
Right, this should probably be documented. This change was backported in https://github.com/php/php-src/commit/0ecac37c40a27ffbd59f34b5920735ee0b7f994c#diff-3b307ca282b8a468c86220f768eea0d7 which is in 7.1.28, 7.2.18 and 7.3.5. For other mb_regex functions apart from mb_split and mb_ereg_match this was already checked previously, though I'm not sure since when.
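Until this is documented, callers can make the failure explicit rather than letting a false value propagate. The helper below is hypothetical (split_checked is not part of mbstring); it is a sketch that validates the subject against the current regex encoding with mb_check_encoding before calling mb_split.

```php
<?php
/**
 * Hypothetical helper, not an mbstring function: split only after confirming
 * that the subject is valid in the current regex encoding.
 */
function split_checked(string $pattern, string $subject): array
{
    $encoding = mb_regex_encoding();
    if (!mb_check_encoding($subject, $encoding)) {
        // Input that fails this check is the case described in this report.
        throw new InvalidArgumentException("Subject is not valid $encoding");
    }

    $parts = mb_split($pattern, $subject);
    if ($parts === false) {
        throw new RuntimeException('mb_split failed');
    }

    return $parts;
}
```

With the regex encoding set to UTF-8, split_checked('-', "\xE9-a") throws instead of returning false, which makes the mismatch easier to spot when the call is buried in third-party code.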
 [2020-08-16 09:09 UTC] sjon@php.net
-Assigned To: sjon +Assigned To:
 [2022-12-27 11:20 UTC] Isaac1866Williams at gmail dot com
Hello,

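When preg_split fails because of invalid input, it does not return an array; an invalid pattern makes it return false.
Example:

$string = array("sdasds"); // Invalid subject: an array, not a string
$array = preg_split('/[\s,]+/', $string);
var_dump($array); // NULL with a warning on PHP 7; PHP 8 throws a TypeError instead

$string = "sdasds";
$array = preg_split('j', $string); // Invalid pattern: the delimiter must not be alphanumeric
var_dump($array); // false, with a warning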
