Bug #36983 preg_match_all incorrect due to backtrack length limit?
Submitted: 2006-04-05 12:41 UTC Modified: 2006-05-26 18:36 UTC
From: crisp at tweakers dot net Assigned: andrei (profile)
Status: Closed Package: PCRE related
PHP Version: 5.1.2, 4.4.2 OS: *
Private report: No CVE-ID: None
 [2006-04-05 12:41 UTC] crisp at tweakers dot net
Description:
------------
It seems that when PCRE needs to backtrack over more than 20 characters to evaluate an alternation (an OR'ed expression), the results of preg_match_all() are incorrect or incomplete.
See the code below for an example.

Reproduce code:
---------------
$foo = 'foo# bar# abcdefghijklmnopqrst bla#';
echo '$foo = ', $foo, "\n";

preg_match_all('/([a-y]|z)+#/', $foo, $matches1);
preg_match_all('/([a-y]+|z)+#/', $foo, $matches2);

print_r($matches1[0]);
print_r($matches2[0]);


$foo = 'foo# bar# abcdefghijklmnopqrstu bla#';
echo '$foo = ', $foo, "\n";

preg_match_all('/([a-y]|z)+#/', $foo, $matches1);
preg_match_all('/([a-y]+|z)+#/', $foo, $matches2);

print_r($matches1[0]);
print_r($matches2[0]);


Expected result:
----------------
Both expressions should give the following output for both strings:

Array
(
    [0] => foo#
    [1] => bar#
    [2] => bla#
)

Actual result:
--------------
The second expression, '/([a-y]+|z)+#/', gives the following output for the second string:

Array
(
    [0] => foo#
    [1] => bar#
)

The final match, bla#, is missing from the preg_match_all() result.

 [2006-04-05 17:25 UTC] crisp at tweakers dot net
Indeed, this seems to be an issue with PCRE; version 6.6 still has this problem, so I'll try my luck with the PCRE team.
Nevertheless, I don't believe 'bogus' is the right status for this bug, since it is an actual issue, just not with PHP (although as far as I know PHP still ships with PCRE version 6.4).
I'll just set this bug to 'closed'.
 [2006-04-05 23:01 UTC] crisp at tweakers dot net
Just a small update on this. PCRE is actually returning an error, specifically PCRE_ERROR_MATCHLIMIT (-8).
Apparently, when PCRE backtracks it tries every combination of the OR'ed subpattern, resulting in over 10,000,000 calls to match(). Evidently there is no mechanism that first checks whether (and where) an alternative can match within the part being backtracked over.

So this seems to be defined behaviour of PCRE (though it could use improvement, given that even a simple case like the one I constructed triggers it). My question now is: why does PHP not raise a warning or error when PCRE exits with one?
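
To illustrate the backtracking described above: '([a-y]+|z)+' accepts exactly the non-empty strings of the letters a-z, so when the contents of the capture group are not needed, a single character class should match the same text without a quantifier nested inside another quantifier. This is only a sketch under that assumption, not a fix for the underlying PCRE behaviour:

```php
<?php
// Sketch only: assumes the contents of capture group 1 are not needed.
$foo = 'foo# bar# abcdefghijklmnopqrstu bla#';

// One character class instead of a quantified alternation. When the
// 21-letter run turns out not to be followed by '#', the engine gives
// back one character at a time instead of retrying every way of
// splitting the run between the inner and the outer '+'.
preg_match_all('/[a-z]+#/', $foo, $matches);

print_r($matches[0]);
```

With this pattern the number of attempts grows roughly with the square of the run length rather than exponentially, so the match limit should not come into play for a string of this size.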
 [2006-05-26 18:36 UTC] nlopess@php.net
PHP 5.2 already has a preg_last_error() function. It was decided not to emit a warning, but rather to make these errors accessible through that function.

If you append the line below to your script, it will print bool(true):
var_dump(preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR);
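
Since no warning is emitted, callers have to check preg_last_error() themselves after each call. A minimal sketch of one way to do that follows; checked_match_all() is a hypothetical helper name, not part of PHP, and whether the second pattern from the report actually hits the limit depends on the PHP and PCRE versions and on the configured limits:

```php
<?php
// Hypothetical helper (not part of PHP): run preg_match_all() and turn a
// silent PCRE failure into an exception.
function checked_match_all($pattern, $subject)
{
    $count = preg_match_all($pattern, $subject, $matches);
    $error = preg_last_error();

    // Fail if the call itself failed or PCRE reported an error such as
    // PREG_BACKTRACK_LIMIT_ERROR.
    if ($count === false || $error !== PREG_NO_ERROR) {
        throw new RuntimeException(
            "preg_match_all() failed, preg_last_error() = $error"
        );
    }

    return $matches[0];
}

try {
    print_r(checked_match_all('/([a-y]+|z)+#/', 'foo# bar# abcdefghijklmnopqrstu bla#'));
} catch (RuntimeException $e) {
    // Reached only if PCRE gave up or the call failed.
    echo $e->getMessage(), "\n";
}
```

Throwing an exception is just one option here; returning false or logging the error code would work equally well, as long as an incomplete result is not silently treated as complete.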
 