Bug #76127 preg_split does not raise an error on invalid UTF-8
Submitted: 2018-03-21 13:36 UTC
Modified: 2018-03-21 15:32 UTC
From: graefrath at femu dot rwth-aachen dot de
Assigned:
Status: Closed
Package: PCRE related
PHP Version: 5.6.34
OS:
Private report: No
CVE-ID: None
 [2018-03-21 13:36 UTC] graefrath at femu dot rwth-aachen dot de
Description:
------------
preg_match and preg_replace fail when the u modifier is specified and the subject is invalid UTF-8: preg_match returns false, preg_replace returns null, and preg_last_error() reports PREG_BAD_UTF8_ERROR. However, preg_split does not behave consistently in this regard.

Test script:
---------------
<?php
var_dump(preg_split("/a/u", "a\xff"));

Expected result:
----------------
preg_split should return false and set the last error PREG_BAD_UTF8_ERROR just like the other preg_ functions.

Actual result:
--------------
preg_split returns an array with the subject string.
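
The inconsistency described above can be observed side by side. The following is only an illustrative sketch: the variable names are made up for this example, and the value printed for preg_split depends on whether the PHP version in use still has this bug.

```php
<?php
// Illustrative comparison of preg_match() and preg_split() on the same input.
// "\xff" can never occur in well-formed UTF-8, so the subject is invalid.
$subject = "a\xff";

$match = preg_match("/a/u", $subject);
$matchError = preg_last_error();   // captured right after the call

$split = preg_split("/a/u", $subject);
$splitError = preg_last_error();   // captured right after the call

var_dump($match);                               // false: preg_match reports the failure
var_dump($matchError === PREG_BAD_UTF8_ERROR);
var_dump($split);                               // an array on affected versions
var_dump($splitError === PREG_BAD_UTF8_ERROR);
```

On an affected version the last call still shows that the error code was recorded, even though the return value of preg_split does not reflect it.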

 [2018-03-21 15:32 UTC] cmb@php.net
-Status: Open
+Status: Verified
 [2018-03-21 15:32 UTC] cmb@php.net
Confirmed: <https://3v4l.org/1WYQE>.

The problem is that although the error is detected[1],
php_pcre_split_impl() happily proceeds, while
php_pcre_match_impl() checks for an error at the end and returns
FALSE in that case[2].

[1] <https://github.com/php/php-src/blob/PHP-7.2.4/ext/pcre/php_pcre.c#L2369>
[2] <https://github.com/php/php-src/blob/PHP-7.2.4/ext/pcre/php_pcre.c#L1134-L1139>
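
Since the error code is recorded even where the return value does not reflect it, user code can guard against the problem by checking preg_last_error() after every preg_split call. A minimal sketch follows, assuming a hypothetical helper named safe_preg_split (it is not part of PHP); scalar type hints are left out so it also runs on PHP 5.6.

```php
<?php
// Hypothetical helper, not a PHP built-in: returns false whenever PCRE
// recorded an error, regardless of what preg_split() itself returned.
function safe_preg_split($pattern, $subject, $limit = -1, $flags = 0)
{
    $parts = preg_split($pattern, $subject, $limit, $flags);

    // On affected versions $parts may be an array even though matching
    // failed, so the recorded error code is checked as well.
    if ($parts === false || preg_last_error() !== PREG_NO_ERROR) {
        return false;
    }

    return $parts;
}

var_dump(safe_preg_split("/a/u", "a\xff"));   // invalid UTF-8 subject
var_dump(safe_preg_split("/a/u", "banana"));  // valid subject, split normally
```

The helper is meant to behave the same way whether or not the underlying preg_split already returns false on invalid input, so it should be safe to keep once the bug is fixed.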
 [2019-03-19 12:59 UTC] nikic@php.net
Automatic comment on behalf of nikita.ppv@gmail.com
Revision: http://git.php.net/?p=php-src.git;a=commit;h=661bce47aebdc67bda1616e1b6979803765173a6
Log: Fixed bug #76127
 [2019-03-19 12:59 UTC] nikic@php.net
-Status: Verified
+Status: Closed
 