Bug #76909 preg_match difference between 7.3 and < 7.3
Submitted: 2018-09-20 20:41 UTC Modified: 2018-09-21 11:12 UTC
From: jay at diablomedia dot com Assigned:
Status: Closed Package: PCRE related
PHP Version: 7.3.0RC1 OS: Linux
Private report: No CVE-ID: None

 [2018-09-20 20:41 UTC] jay at diablomedia dot com
Description:
------------
I was testing a fork of Zend Framework 1 that we maintain for our legacy applications against PHP 7.3.0RC1 and noticed that the Zend_Validate component's tests were failing: https://travis-ci.org/diablomedia/zf1-validate/jobs/431170057

In this case, PHP 7.3 is allowing some hostnames through validation that are rejected in PHP < 7.3.

I dug a bit and was able to reproduce this with a single regular expression that is evaluated differently in PHP 7.3 than in PHP 5.6 through 7.2.

Here's a link to 3v4l.org demonstrating the difference (script is in the "Test script" section of this report): https://3v4l.org/DR3S5

Test script:
---------------
<?php

$domainPart = ' domain.com';

$regexChar = '/^[\x{0100}-\x{017f}]{1,63}$/iu';

preg_match($regexChar, $domainPart, $matches);

print_r($matches);

Expected result:
----------------
I would expect the $matches array in the test script to be empty here, similar to all PHP versions before 7.3.

Actual result:
--------------
The regex matches on " domain.com", which it doesn't in earlier versions of PHP.

 [2018-09-20 21:08 UTC] requinix@php.net
Looks like a PCRE bug because turning off JIT makes it work correctly again. So that's the workaround.
ini_set("pcre.jit", 0);

PHP 7.3 is currently bundling PCRE 10.31. Version 10.32-RC1 includes a fix for something that looks related.
> 35. In a pattern such as /[^\x{100}-\x{ffff}]*[\x80-\xff]/ which has a repeated
> negative class with no characters less than 0x100 followed by a positive class
> with only characters less than 0x100, the first class was incorrectly being
> auto-possessified, causing incorrect match failures.
https://www.pcre.org/changelog.txt
https://bugs.exim.org/show_bug.cgi?id=2300

While your example doesn't have the negated set, the fix for the bug was very simple so it could affect less complicated situations like yours as well.
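
A minimal sketch of the workaround applied to the reporter's test script. It assumes the setting is changed before the pattern is first used in the process; whether toggling it later affects a pattern PHP has already compiled is not something this thread establishes.

```php
<?php

// Workaround from this comment: disable the PCRE JIT before any pattern runs.
// ini_set() returns false if the setting is unavailable (e.g. no JIT support).
ini_set('pcre.jit', '0');

$domainPart = ' domain.com';
$regexChar  = '/^[\x{0100}-\x{017f}]{1,63}$/iu';

$matches = [];
$result  = preg_match($regexChar, $domainPart, $matches);

// With the JIT off, the reporter and requinix both observed no match here.
var_dump($result);
print_r($matches);
```

The same setting can be placed in php.ini (pcre.jit=0) if changing application code is not an option.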
 [2018-09-20 21:24 UTC] nikic@php.net
This bug still reproduces on master, which uses PCRE 10.32, so it was not fixed as part of that release.
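
To check which PCRE library and JIT setting a given PHP build is actually using, something like the following can be run. This is only a diagnostic sketch; the `$report` array and its keys are illustrative names, not part of any API.

```php
<?php

// Collect the values relevant to this report.
$report = [
    'php'      => PHP_VERSION,
    'pcre'     => PCRE_VERSION,        // version string of the linked PCRE library
    'pcre.jit' => ini_get('pcre.jit'), // may be false if the setting is not registered
];

foreach ($report as $name => $value) {
    echo str_pad($name, 10), var_export($value, true), PHP_EOL;
}
```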
 [2018-09-20 21:53 UTC] jay at diablomedia dot com
I filed a bug with PCRE: https://bugs.exim.org/show_bug.cgi?id=2321

I was able to confirm that disabling PCRE JIT does fix the issue.
 [2018-09-20 21:56 UTC] cmb@php.net
-Package: *Regular Expressions +Package: PCRE related
 [2018-09-20 23:29 UTC] requinix@php.net
-Status: Open +Status: Not a bug
 [2018-09-20 23:29 UTC] requinix@php.net
Marking as Not a Bug, since this looks like an upstream PCRE issue, and I'll watch what happens with that report.
 [2018-09-21 06:21 UTC] nikic@php.net
-Status: Not a bug +Status: Re-Opened
 [2018-09-21 06:21 UTC] nikic@php.net
Reopening to track the update of the bundled libpcre2.
 [2018-09-21 07:57 UTC] requinix@php.net
For the record, I can reproduce with pcre2test. (The backslash is required to escape the leading space.)

  re> /^[\x{0100}-\x{017f}]{1,63}$/i,utf,jit=0
data> \ domain.com
No match
data>
  re> /^[\x{0100}-\x{017f}]{1,63}$/i,utf,jit=1
data> \ domain.com
 0:  domain.com
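
A rough PHP-side equivalent of the pcre2test comparison above, for anyone without pcre2test at hand. It is a sketch that assumes the CLI SAPI on a Unix-like shell with shell_exec() enabled; each setting runs in its own child process so that no compiled pattern is shared between the two runs. The `$results` array is an illustrative name.

```php
<?php

// Code executed in each child process: the reporter's pattern and subject.
$code = 'var_dump(preg_match(\'/^[\x{0100}-\x{017f}]{1,63}$/iu\', " domain.com"));';

$results = [];
foreach (['0', '1'] as $jit) {
    // -d overrides an ini setting for that process only.
    $cmd = escapeshellarg(PHP_BINARY) . ' -d pcre.jit=' . $jit
         . ' -r ' . escapeshellarg($code);
    $results[$jit] = shell_exec($cmd);
    echo "pcre.jit=$jit: ", $results[$jit];
}
```

On an affected build the two lines should differ; on a build with the fix they should agree.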
 [2018-09-21 13:59 UTC] ab@php.net
Automatic comment on behalf of ab
Revision: http://git.php.net/?p=php-src.git;a=commit;h=7a02ecb7fef994332d83dae56ac5584d536f3de0
Log: Fixed bug #76909 preg_match difference between 7.3 and < 7.3
 [2018-09-21 13:59 UTC] ab@php.net
-Status: Re-Opened +Status: Closed
 