Bug #80118 Enabled pcre.jit gives different matches results than disabled pcre.jit
Submitted: 2020-09-17 17:26 UTC Modified: 2020-09-19 10:45 UTC
From: zhiyangleecn at gmail dot com Assigned: cmb (profile)
Status: Closed Package: PCRE related
PHP Version: 7.4.10 OS: Any
Private report: No CVE-ID: None

 [2020-09-17 17:26 UTC] zhiyangleecn at gmail dot com
Description:
------------
The regular expression ~[^\p{Han}\p{Z}]~u should match any single character that is not in the list below:
 - Any character in the Han script.
 - Any Unicode separator (space, line, or paragraph separator).

However, when pcre.jit is enabled, this regular expression matches a space character, which appears to be an incorrect match.
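
To make the intended semantics concrete, here is a minimal sketch (not part of the original report) that evaluates the pattern against a few sample characters with the JIT switched off, since the report shows the non-JIT path returning the expected result. The helper name `matchesClass` and the sample characters are chosen for illustration only.

```php
<?php
// Turn the PCRE JIT off before any pattern is compiled, so the
// non-JIT matcher decides the result (illustrative setup).
ini_set('pcre.jit', '0');

// Hypothetical helper: true if $char contains a character that is
// neither a Han character nor a Unicode separator.
function matchesClass(string $char): bool
{
    return preg_match('~[^\p{Han}\p{Z}]~u', $char) === 1;
}

$samples = [
    'latin letter'      => 'a',
    'ASCII space'       => ' ',
    'ideographic space' => "\u{3000}",
    'Han character'     => "\u{4E2D}",
];

foreach ($samples as $label => $char) {
    printf("%-18s %s\n", $label, matchesClass($char) ? 'match' : 'no match');
}
```

Only the Latin letter should be reported as a match; both spaces fall under \p{Z} and U+4E2D falls under \p{Han}.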

pcre:

PCRE (Perl Compatible Regular Expressions) Support => enabled
PCRE Library Version => 10.34 2019-11-21
PCRE Unicode Version => 12.1.0
PCRE JIT Support => enabled
PCRE JIT Target => x86 64bit (little endian + unaligned)

Directive => Local Value => Master Value
pcre.backtrack_limit => 1000000 => 1000000
pcre.jit => 1 => 1
pcre.recursion_limit => 100000 => 100000



Test script:
---------------
pcre_jit_off.php:
<?php
ini_set('pcre.jit', 0);
preg_match('~[^\p{Han}\p{Z}]~u', '     ', $matches);
var_dump($matches);

pcre_jit_on.php:
<?php
ini_set('pcre.jit', 1);
preg_match('~[^\p{Han}\p{Z}]~u', '     ', $matches);
var_dump($matches);

Expected result:
----------------
pcre_jit_off.php: 
array(0) {
}

pcre_jit_on.php:
array(0) {
}

Actual result:
--------------
pcre_jit_off.php:
array(0) {
}

pcre_jit_on.php:
array(1) {
  [0]=>
  string(1) " "
}
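
To compare both settings with one file, the two test scripts can be folded into a single script that is run once per setting. The sketch below is one possible way to do that and is not part of the original report: the file name `repro.php` and the helper `runRepro` are hypothetical, and the setting is passed with the CLI's `-d` option instead of `ini_set`.

```php
<?php
// repro.php (hypothetical file name). Run once per setting:
//   php -d pcre.jit=0 repro.php
//   php -d pcre.jit=1 repro.php

// Hypothetical helper: runs the match from the report and returns
// the environment details next to the result.
function runRepro(): array
{
    preg_match('~[^\p{Han}\p{Z}]~u', '     ', $matches);

    return [
        'pcre_version' => PCRE_VERSION,        // library version string
        'pcre_jit'     => ini_get('pcre.jit'), // false if the directive is unavailable
        'matches'      => $matches,
    ];
}

var_dump(runRepro());
```

On an affected build, the `matches` entry should differ between the two runs; on a fixed build it should be empty in both.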

 [2020-09-17 17:44 UTC] danack@php.net
-Package: PCRE related +Package: JIT
 [2020-09-17 18:42 UTC] requinix@php.net
-Status: Open +Status: Duplicate -Package: JIT +Package: PCRE related
 [2020-09-17 18:42 UTC] requinix@php.net
PCRE's JIT. Not PHP's JIT.

Likely a duplicate of bug #79363 that is fixed in PCRE 10.35, PHP 8.0, and will be fixed in PHP 7.4 when possible.
 [2020-09-17 21:45 UTC] danack@php.net
"PCRE's JIT. Not PHP's JIT." 

ugh, my bad. original package was right then.
 [2020-09-17 22:11 UTC] cmb@php.net
-Status: Duplicate +Status: Open
 [2020-09-17 22:11 UTC] cmb@php.net
> Likely a duplicate of bug #79363 […]

Unfortunately not: <https://3v4l.org/eqms1>.

I presume this is a libpcre2 issue (possibly already fixed in the
development branch).

> […] that is fixed in PCRE 10.35, and will be fixed in PHP 7.4 when possible.

That has just happened:
<https://github.com/php/php-src/commit/9f2d03952daf5348d7f18c18e82607979e28df12>.
 [2020-09-18 08:52 UTC] cmb@php.net
-Status: Open +Status: Suspended -Assigned To: +Assigned To: cmb
 [2020-09-18 08:52 UTC] cmb@php.net
This is apparently an unresolved regression in libpcre2.  I have
filed <https://bugs.exim.org/show_bug.cgi?id=2644>, and suspend
this ticket for the time being.
 [2020-09-19 10:45 UTC] cmb@php.net
-Status: Suspended +Status: Analyzed
 [2020-09-19 10:45 UTC] cmb@php.net
The upstream issue has already been resolved.
 [2020-09-19 10:46 UTC] cmb@php.net
The following pull request has been associated:

Patch Name: Fix #80118: Erroneous whitespace match with JIT only
On GitHub:  https://github.com/php/php-src/pull/6165
Patch:      https://github.com/php/php-src/pull/6165.patch
 [2020-09-21 08:30 UTC] cmb@php.net
Automatic comment on behalf of cmbecker69@gmx.de
Revision: http://git.php.net/?p=php-src.git;a=commit;h=d27dc5c028637edaf4d1511613cc33e7e164bd6e
Log: Fix #80118: Erroneous whitespace match with JIT only
 [2020-09-21 08:30 UTC] cmb@php.net
-Status: Analyzed +Status: Closed
 