Doc Bug #78190 mb_split now return false with ISO strings
Submitted: 2019-06-20 15:09 UTC Modified: 2020-08-16 09:09 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:1 (100.0%)
From: mvachette at adequasys dot com Assigned:
Status: Re-Opened Package: mbstring related
PHP Version: 7.2.19 OS: Windows 10
Private report: No CVE-ID: None
 [2019-06-20 15:09 UTC] mvachette at adequasys dot com
Description:
------------
It seems that something changed in 7.2.18 when using mb_split on an ISO-encoded string.

On at least 7.2.16 and 7.2.17, calling this function on an ISO-encoded string containing accented characters works fine.

In a new environment running 7.2.18, the same code now returns false (new, undocumented behaviour).



Test script:
---------------
var_dump(mb_split('-', 'e-a'));
// works fine on any version

var_dump(mb_split('-', 'é-a'));
// works if the string is UTF-8 encoded
// returns false with an ISO-encoded string on PHP 7.2.18


 [2019-06-20 15:15 UTC] sjon@php.net
-Status: Open +Status: Feedback -Assigned To: +Assigned To: sjon
 [2019-06-20 15:15 UTC] sjon@php.net
I cannot reproduce this, it seems to work pretty consistently: https://3v4l.org/QqYjC
 [2019-06-20 15:21 UTC] nikic@php.net
-Status: Feedback +Status: Not a bug
 [2019-06-20 15:21 UTC] nikic@php.net
The default encoding is UTF-8, use mb_regex_encoding() if you are working with something else.
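
A minimal sketch of that suggestion, assuming the subject string is ISO-8859-1 (the report only says "ISO", so the exact encoding name is an assumption). The byte "\xE9" stands in for "é" so the example does not depend on how the source file is saved:

```php
<?php
// Assumption: the input is ISO-8859-1; "\xE9" is "é" in that encoding.
$latin1 = "\xE9-a";

// Tell the mbregex functions (mb_split, mb_ereg, ...) which encoding
// the pattern and subject use. This setting is global to the request.
mb_regex_encoding('ISO-8859-1');

// With the regex encoding matching the data, the subject is valid
// and mb_split can split it on "-".
$parts = mb_split('-', $latin1);
var_dump($parts);
```

Because mb_regex_encoding() changes a request-wide setting, code that mixes encodings may want to read the previous value with mb_regex_encoding() and restore it afterwards.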
 [2019-06-20 15:30 UTC] mvachette at adequasys dot com
OK, I was able to clear the error using mb_regex_encoding; thanks for the suggestion.

But is there any documentation available about this change of behaviour? For information, I found this buried in the Smarty library. I can ensure correct behaviour of mb_split in my app, but I may not be the only one affected.
 [2019-06-20 15:47 UTC] nikic@php.net
-Status: Not a bug +Status: Re-Opened -Type: Bug +Type: Documentation Problem
 [2019-06-20 15:47 UTC] nikic@php.net
Right, this should probably be documented. This change was backported in https://github.com/php/php-src/commit/0ecac37c40a27ffbd59f34b5920735ee0b7f994c#diff-3b307ca282b8a468c86220f768eea0d7 which is in 7.1.28, 7.2.18 and 7.3.5. For other mb_regex functions apart from mb_split and mb_ereg_match this was already checked previously, though I'm not sure since when.
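
Since mb_split now rejects a subject that is invalid in the current regex encoding, calling code may want to handle the false return explicitly. The following is only a sketch: split_assuming_latin1() is a hypothetical helper, not part of mbstring, and it assumes that any input which is not valid UTF-8 is ISO-8859-1.

```php
<?php
// Hypothetical helper (not part of mbstring): split a string with mb_split,
// converting to UTF-8 first when the input is not valid UTF-8.
function split_assuming_latin1(string $pattern, string $subject): array
{
    if (!mb_check_encoding($subject, 'UTF-8')) {
        // Assumption: non-UTF-8 input is ISO-8859-1.
        $subject = mb_convert_encoding($subject, 'UTF-8', 'ISO-8859-1');
    }

    // Remember the current regex encoding so it can be restored afterwards.
    $previous = mb_regex_encoding();
    mb_regex_encoding('UTF-8');
    $parts = mb_split($pattern, $subject);
    mb_regex_encoding($previous);

    if ($parts === false) {
        // mb_split signals failure by returning false.
        throw new RuntimeException('mb_split failed for the given input');
    }
    return $parts;
}

$fromLatin1 = split_assuming_latin1('-', "\xE9-a");    // ISO-8859-1 bytes
$fromUtf8   = split_assuming_latin1('-', "\u{E9}-a");  // UTF-8 bytes
var_dump($fromLatin1 === $fromUtf8);
```

Both calls return the parts as UTF-8 strings, so the rest of the application only has to deal with one encoding.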
 [2020-08-16 09:09 UTC] sjon@php.net
-Assigned To: sjon +Assigned To:
 [2022-12-27 11:20 UTC] Isaac1866Williams at gmail dot com
Hello,

If preg_split fails because of invalid input, it returns false.
Example:

$string = array("sdasds"); // Invalid string
$array=preg_split('/[\s,]+/',$string);
var_dump($array); //false

$string = "sdasds";
$array=preg_split('j',$string); // Invalid pattern
var_dump($array); //false

Last updated: Mon Nov 25 10:01:32 2024 UTC