Bug #79363 \p{L} doesn't work alongside \p{Arabic} in a character class
Submitted: 2020-03-10 13:37 UTC Modified: 2021-01-30 14:47 UTC
From: halaeiv at gmail dot com Assigned: cmb (profile)
Status: Closed Package: PCRE related
PHP Version: 7.4.14 OS: Ubuntu 18.04
Private report: No CVE-ID: None
 [2020-03-10 13:37 UTC] halaeiv at gmail dot com
I ran the following code in both PHP 7.2.28 and PHP 7.4.3. The output of PHP 7.4.3 is wrong. Basically, adding \p{Arabic} after \p{L} seems to cause \p{L} to match uppercase letters and spaces instead of letters.

Test script:
$str = 'lower UPPER';
echo(preg_replace('/[\p{L}\p{Arabic}]/', '0', $str));
echo "\n";
echo(preg_replace('/[^\p{L}\p{Arabic}]/', '0', $str));
echo "\n";

Expected result:
00000 00000

Actual result:
00000 UPPER


 [2020-03-11 08:04 UTC] halaeiv at gmail dot com
There is a related bug report in pcre:
The bug's status is new, but it is mentioned that it was fixed in 10.34.
I hope the next PHP version will fix it. Some additional tests could be added to check for this.
Note: the regex works correctly when the JIT is disabled with (*NO_JIT).
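For anyone who needs a workaround in the meantime, here is a minimal sketch of two ways to keep the JIT out of the picture. The first uses the (*NO_JIT) prefix mentioned above. The second relies on the pcre.jit INI setting, which is not mentioned in this report and is only my assumption of an equivalent switch. The variable names are just for illustration.

```php
<?php
// Sketch of two ways to bypass the PCRE JIT for the pattern from this report.
$str = 'lower UPPER';

// 1. Per pattern: put the (*NO_JIT) verb at the very start of the pattern.
$perPattern = preg_replace('/(*NO_JIT)[\p{L}\p{Arabic}]/', '0', $str);

// 2. Per request (assumption, not from the report): switch off the pcre.jit
//    INI setting before the pattern is first used.
ini_set('pcre.jit', '0');
$perRequest = preg_replace('/[\p{L}\p{Arabic}]/', '0', $str);

echo $perPattern, "\n"; // 00000 00000
echo $perRequest, "\n"; // 00000 00000
```

Both calls should replace every letter and leave the space alone, which is the expected result from the original report.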
 [2020-03-11 08:49 UTC]
-Status: Open +Status: Not a bug -Assigned To: +Assigned To: cmb
 [2020-03-11 08:49 UTC]
Indeed, this is not a bug in PHP, but rather in libpcre.  In
libpcre 10.34, the behavior has been improved, but is still not
quite right (with JIT enabled):

lower UPPER
 [2020-06-30 11:06 UTC] halaeiv at gmail dot com
PCRE 10.35 has been released with the bug fixed, but it is still not used in the PHP 7.4 branch. So I believe this issue is now a PHP bug and should be reopened. Is there any plan to fix it?
 [2020-06-30 12:49 UTC]
-Status: Not a bug +Status: Re-Opened
 [2020-06-30 12:49 UTC]
The following pull request has been associated:

Patch Name: Update to PCRE2 10.35
On GitHub:
 [2020-06-30 14:22 UTC]
-Status: Re-Opened +Status: Suspended -Assigned To: cmb +Assigned To:
 [2020-06-30 14:22 UTC]
The master branch has just been updated to PCRE2 10.35, so that
will be available as of PHP 8.0.0alpha2.

Updating for PHP 7.4 will be considered at the earliest in several
weeks; for the time being I'm suspending this ticket.
 [2020-09-17 12:10 UTC]
-Status: Suspended +Status: Closed -Assigned To: +Assigned To: cmb
 [2021-01-30 12:45 UTC] halaeiv at gmail dot com
-Operating System: Ubuntu +Operating System: Ubuntu 18.04 -PHP Version: 7.4.3 +PHP Version: 7.4.14
 [2021-01-30 12:45 UTC] halaeiv at gmail dot com
Hello again.
I have tested the following in two builds of PHP 7.4.14, one from the Ubuntu 18.04 repository and another one built with phpbrew. Unfortunately the results are different. I think the phpbrew result is correct.

Test script:
echo(preg_replace('/[^\p{L}\p{Arabic}]/', '0', '<->'));

Expected result (phpbrew):

Actual result (ubuntu):

Should this be reopened, or does it need to be reported elsewhere?
 [2021-01-30 12:58 UTC]
What's the value of PCRE_VERSION in both environments?
 [2021-01-30 13:24 UTC] halaeiv at gmail dot com
PCRE_VERSION in both is the same:
10.35 2020-05-09
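In case it helps others compare environments, this is roughly the kind of check I mean. It is only a sketch: the 10.35 threshold comes from the earlier comments in this report, the variable names are made up for illustration, and the last line simply runs the test pattern from above so the two builds can be compared side by side.

```php
<?php
// PCRE_VERSION looks like "10.35 2020-05-09": version number, then release date.
[$pcreVersion] = explode(' ', PCRE_VERSION);

echo 'PHP:      ', PHP_VERSION, "\n";
echo 'PCRE:     ', PCRE_VERSION, "\n";
echo 'pcre.jit: ', var_export(ini_get('pcre.jit'), true), "\n";

// Earlier comments in this report say the original bug was fixed in PCRE2 10.35.
$atLeast1035 = version_compare($pcreVersion, '10.35', '>=');
echo 'PCRE >= 10.35: ', $atLeast1035 ? 'yes' : 'no', "\n";

// Same version string does not guarantee same behaviour, so also run the
// failing pattern itself and compare the output between builds.
$result = preg_replace('/[^\p{L}\p{Arabic}]/', '0', '<->');
echo 'Result:   ', var_export($result, true), "\n";
```

Since none of the three characters in '<->' is a letter, the negated class should match each of them on a build without the bug.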
 [2021-01-30 13:52 UTC]
Thanks for the swift reply!  It seems that Ubuntu didn't backport
the fix for bug #79846[1].

[1] <>
 [2021-01-30 14:15 UTC] halaeiv at gmail dot com
So is it going to be fixed in 7.4.15? If it isn't related to php-src, do you know where it should be reported? I have no idea how their build process works, or why they would use a different version of PCRE than the one they should.

Thanks for your help.
 [2021-01-30 14:47 UTC] halaeiv at gmail dot com
I sent an issue to