Bug #76909 preg_match difference between 7.3 and < 7.3
Submitted: 2018-09-20 20:41 UTC Modified: 2018-09-21 11:12 UTC
From: jay at diablomedia dot com Assigned:
Status: Closed Package: PCRE related
PHP Version: 7.3.0RC1 OS: Linux
Private report: No CVE-ID: None


 [2018-09-20 20:41 UTC] jay at diablomedia dot com
I was testing a fork of Zend Framework 1 that we maintain for our legacy applications against PHP 7.3.0RC1 and noticed that the Zend_Validate component's tests were failing on PHP 7.3.

In this case, PHP 7.3 lets some hostnames through validation that PHP < 7.3 rejects.

I dug a bit and was able to reproduce this with a single regular expression that is evaluated differently in PHP 7.3 than in PHP 5.6 through 7.2.

Here's a link demonstrating the difference (the script is in the "Test script" section of this report):

Test script:

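<?php
$domainPart = '';

// Latin Extended-A range (U+0100 to U+017F), 1 to 63 characters,
// case-insensitive (i), UTF-8 mode (u)
$regexChar = '/^[\x{0100}-\x{017f}]{1,63}$/iu';

preg_match($regexChar, $domainPart, $matches);
var_dump($matches);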


Expected result:
I would expect the $matches array in the test script to be empty here, similar to all PHP versions before 7.3.

Actual result:
The regex matches on "", which it doesn't in earlier versions of PHP.
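
To make results easier to compare across versions, a small wrapper can report the runtime details alongside the match result. This is only a sketch: the `describeMatch()` helper and the sample subject are made up for illustration and are not part of the original test script.

```php
<?php
// Hypothetical helper (not from the original report): runs preg_match() and
// returns the result together with the runtime details that matter here.
function describeMatch(string $pattern, string $subject): array
{
    $matches = [];
    $result = preg_match($pattern, $subject, $matches);

    return [
        'php_version'  => PHP_VERSION,
        'pcre_version' => PCRE_VERSION,        // version string of the PCRE library in use
        'pcre_jit'     => ini_get('pcre.jit'), // current setting as a string, or false if unavailable
        'result'       => $result,             // 1 = match, 0 = no match, false = error
        'matches'      => $matches,
    ];
}

$regexChar = '/^[\x{0100}-\x{017f}]{1,63}$/iu';

// Illustrative subject: plain ASCII, so it lies outside U+0100..U+017F.
$report = describeMatch($regexChar, 'abc');
var_dump($report);
```

Running the same file under each PHP version, once with `pcre.jit` enabled and once with it disabled, gives reports that can be compared side by side.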


 [2018-09-20 21:08 UTC]
Looks like a PCRE bug because turning off JIT makes it work correctly again. So that's the workaround.
ini_set("pcre.jit", 0);

PHP 7.3 is currently bundling PCRE 10.31. Version 10.32-RC1 includes a fix for something that looks related.
> 35. In a pattern such as /[^\x{100}-\x{ffff}]*[\x80-\xff]/ which has a repeated
> negative class with no characters less than 0x100 followed by a positive class
> with only characters less than 0x100, the first class was incorrectly being
> auto-possessified, causing incorrect match failures.

While your example doesn't have the negated set, the fix for the bug was very simple so it could affect less complicated situations like yours as well.
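
The workaround above might be applied as follows. This is a sketch rather than a verified fix for this exact report: the sample subjects are invented for illustration, and the setting is changed before the pattern is first used, on the assumption that PHP may reuse an already JIT-compiled pattern from its per-process regex cache.

```php
<?php
// Turn off the PCRE JIT for this request before any pattern is compiled.
// ini_set() returns false if the setting could not be changed.
ini_set('pcre.jit', '0');

$regexChar = '/^[\x{0100}-\x{017f}]{1,63}$/iu';

// Illustrative subjects, not taken from the original report.
$outside = ' ';         // a single space, outside U+0100..U+017F
$inside  = "\u{017C}";  // "ż", inside U+0100..U+017F

var_dump(preg_match($regexChar, $outside)); // expected without JIT: int(0)
var_dump(preg_match($regexChar, $inside));  // expected without JIT: int(1)
```

Setting `pcre.jit=0` in php.ini has the same effect for every request and avoids any question about when the pattern was first compiled.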
 [2018-09-20 21:24 UTC]
This bug still reproduces on master, which uses PCRE 10.32, so it was not fixed as part of that release.
 [2018-09-20 21:53 UTC] jay at diablomedia dot com
I filed a bug with PCRE:

I was able to confirm that disabling PCRE JIT does fix the issue.
 [2018-09-20 21:56 UTC]
-Package: *Regular Expressions +Package: PCRE related
 [2018-09-20 23:29 UTC]
-Status: Open +Status: Not a bug
 [2018-09-20 23:29 UTC]
Marking NAB and I'll watch what happens with that report.
 [2018-09-21 06:21 UTC]
-Status: Not a bug +Status: Re-Opened
 [2018-09-21 06:21 UTC]
Reopening to track the update of the bundled libpcre2.
 [2018-09-21 07:57 UTC]
For the record, I can reproduce with pcre2test. (The backslash is required to escape the leading space.)

  re> /^[\x{0100}-\x{017f}]{1,63}$/i,utf,jit=0
data> \
No match
  re> /^[\x{0100}-\x{017f}]{1,63}$/i,utf,jit=1
data> \
 [2018-09-21 13:59 UTC]
Automatic comment on behalf of ab
Log: Fixed bug #76909 preg_match difference between 7.3 and < 7.3
 [2018-09-21 13:59 UTC]
-Status: Re-Opened +Status: Closed