Request #40846 pcre.backtrack_limit too restrictive
Submitted: 2007-03-17 19:19 UTC Modified: 2011-07-03 19:25 UTC
Votes:80
Avg. Score:4.7 ± 0.6
Reproduced:77 of 77 (100.0%)
Same Version:39 (50.6%)
Same OS:57 (74.0%)
From: crisp at xs4all dot nl Assigned: felipe (profile)
Status: Closed Package: PCRE related
PHP Version: 5.3 OS: *
Private report: No CVE-ID: None
 [2007-03-17 19:19 UTC] crisp at xs4all dot nl
Description:
------------
The new pcre.backtrack_limit configuration directive is by default too restrictive (100.000) which results in failure of many - often quite simple - regular expressions.

I take it that this directive overrides the default setting for MATCH_LIMIT in PCRE. If so, the directive is also misnamed, since MATCH_LIMIT applies to every match() call in PCRE, not only to those made while backtracking.

It would make more sense to change the default of both pcre.backtrack_limit and pcre.recursion_limit to the ones that PCRE itself also supplies (10.000.000 for both) - that way there won't be a compatibility problem with previous versions of PHP as we have now.

Reproduce code:
---------------
$a = 'baab' . str_repeat('a', 100024);
$b = preg_replace('/b.*b/', '', $a);

Expected result:
----------------
I would expect $b to contain 'a' repeated 100024 times.

Actual result:
--------------
$b is NULL; preg_last_error() returns 2, which is PREG_BACKTRACK_LIMIT_ERROR.
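
A minimal sketch of how the failure above can be detected and worked around from script code. It assumes that pcre.backtrack_limit can be changed at runtime with ini_set(); the helper name pcre_error_name() is hypothetical and not part of PHP. On builds that use the PCRE JIT the limit may be counted differently, so the first attempt may simply succeed there.

```php
<?php
// Hypothetical helper: map a preg_last_error() code to its constant name.
function pcre_error_name($code)
{
    $names = array(
        PREG_NO_ERROR              => 'PREG_NO_ERROR',
        PREG_INTERNAL_ERROR        => 'PREG_INTERNAL_ERROR',
        PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
        PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
        PREG_BAD_UTF8_ERROR        => 'PREG_BAD_UTF8_ERROR',
    );
    return isset($names[$code]) ? $names[$code] : 'unknown (' . $code . ')';
}

$a = 'baab' . str_repeat('a', 100024);
$b = preg_replace('/b.*b/', '', $a);

if ($b === null) {
    // The call failed silently; report why, raise the limit, and retry.
    echo 'first attempt failed: ', pcre_error_name(preg_last_error()), "\n";
    ini_set('pcre.backtrack_limit', '10000000'); // PCRE's own default, per the report
    $b = preg_replace('/b.*b/', '', $a);
}

// True once the replacement has gone through.
var_dump($b === str_repeat('a', 100024));
```

Checking for a NULL return before using the result is the important part; raising the limit is only one possible reaction.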

 [2007-03-17 19:26 UTC] tony2001@php.net
>that way there won't be a compatibility problem with previous
>versions of PHP as we have now.
Changing the limit doesn't mean removing the limit.
 [2007-03-17 20:16 UTC] crisp at xs4all dot nl
>Changing the limit doesn't mean removing the limit.
But if you change the default limits to match the default limits set in PCRE internally, you won't affect its behavior compared to previous versions of PHP, where the internal settings in PCRE were not overridden.

Either that, or don't override PCRE's internal settings unless these directives are explicitly set and enabled in php.ini (at the moment these directives are commented out in the php.ini samples).
 [2007-05-19 06:49 UTC] tigr at mail15 dot com
For me this new behaviour has broken my template system. While some of the regexps could be simplified, others could not. In some situations increasing these numbers is of little help. For instance (the regexp has been simplified greatly; in the real-life application it is much more complex):

<?php

echo preg_replace('/\$[a-z]+([a-z]*(?:\[[a-z]*\])?)*/i'
, 'replaced', '$abc $something[something]');

var_dump(preg_last_error());

?>

Expected result: 'replaced replaced replaced'. Actual result: nothing; NULL is returned and preg_last_error() reports PREG_BACKTRACK_LIMIT_ERROR. Increasing the backtrack limit leads to another error, PREG_RECURSION_LIMIT_ERROR, and increasing the recursion limit makes PHP hang.

Changing the first or second asterisk in the pattern to a plus sign immediately fixes the problem, but I need the pattern as it is. Also, do you think this is correct behaviour? I think there is a bug somewhere.

This works well on PHP 4.x, 5.x and even on 5.2.1 (at one of the hosts), but it does not work on my local PHP 5.2.2 on Windows XP SP2.
 [2007-05-19 08:32 UTC] tigr at mail15 dot com
Sorry, a small mistake: the expected result is not 'replaced replaced replaced' but 'replaced replaced'.
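
One possible way around the problem described above, as a sketch only: the original group can match the empty string and both of its parts are optional, so the same run of letters can be split across iterations in many ways. The rewritten pattern below is an assumption, not something proposed in this thread. It accepts the same strings but forces every iteration to start with a literal '[', and it drops capture group 1, which the replacement does not use.

```php
<?php
// Pattern from the comment above, kept for comparison.
$original = '/\$[a-z]+([a-z]*(?:\[[a-z]*\])?)*/i';

// Assumed rewrite: "$name" followed by zero or more "[index]tail" pieces.
// Each iteration must consume a '[', so it can never match empty and
// there is only one way to split the text between iterations.
$rewritten = '/\$[a-z]+(?:\[[a-z]*\][a-z]*)*/i';

$subject = '$abc $something[something]';
$result  = preg_replace($rewritten, 'replaced', $subject);

if ($result === null) {
    echo 'PCRE error code: ', preg_last_error(), "\n";
} else {
    echo $result, "\n"; // replaced replaced
}
```

Whether this kind of rewrite is possible for the more complex real-life pattern is not something the thread can answer.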
 [2007-05-20 10:49 UTC] nlopess@php.net
We simply can't increase the recursion limit or we risk segfaulting PHP. Increasing the backtrack limit is also risky, but much safer (although regexes that backtrack a lot are usually not well written). I'll think more about this.
 [2007-05-20 11:09 UTC] tigr at mail15 dot com
It is rather strange: previous versions work fine and execute all patterns quickly. And in some situations (as I mentioned before) increasing the recursion and backtrack limits just doesn't help. I suppose this is wrong behaviour.

Also, note that the examples are short and simple. Increasing both limits to 1 000 000 does not help. Why?
 [2007-05-20 11:22 UTC] crisp at xs4all dot nl
PHP 5.2.0 includes an update of the PCRE library (version 6.7), so some problems may not be totally due to the restrictive limits of the PCRE settings in PHP but could be a bug/regression in PCRE itself.

PCRE has always been very poor at internal optimisation of expressions that contain look-aheads or look-behinds, especially when they are combined with some OR'ed subexpression. Its backtracking mechanism is quite simplistic and doesn't rule out execution paths that are sure not to result in a match; in fact, it doesn't have any sort of execution planner.
 [2007-08-16 15:58 UTC] brandon at invisionpower dot com
Installations of 5.2 are causing this issue for us with relatively simple regex.  Users can submit an arbitrary amount of data (I work on a bulletin board software) that is parsed out for bbcode tags.  Even simple expressions are causing problems.

$txt = preg_replace_callback( "#\[code\](.+?)\[/code\]#is", array( &$this, 'regex_code_tag' ), $txt );

var_dump( preg_last_error() );

The callback function is not being hit at all and the output is int(2) (backtrack limit hit).  Increasing backtrack limit to 1,000,000 "resolves" the issue, but with shared hosting plans it's unrealistic to expect hosts to make php.ini changes to allow this.

I agree with crisp - the limit in PHP should be set to the internal PCRE defaults, with the php.ini settings allowing you to reduce them if you wish.  PHP should not arbitrarily reduce them.
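
For code that cannot rely on php.ini changes, one option is to reshape the pattern so that it needs less backtracking. The sketch below is an assumption rather than anything taken from the bulletin board software: it replaces the lazy `.+?` with a possessive run of non-'[' characters plus a lookahead, and render_code_tag() is a hypothetical stand-in for the regex_code_tag callback. It should behave like the original for non-empty code blocks; how much it helps under a given limit depends on the PCRE version.

```php
<?php
// Hypothetical callback standing in for the regex_code_tag method.
function render_code_tag($matches)
{
    return '<pre>' . htmlspecialchars($matches[1]) . '</pre>';
}

// "[^\[]++" swallows a whole run of ordinary characters in one step and
// never gives any of it back; a '[' is accepted only when it does not
// start the closing "[/code]" tag.
$pattern = '#\[code\]((?:[^\[]++|\[(?!/code\]))+)\[/code\]#i';

$txt = 'before [code]' . str_repeat('x', 200000) . '[/code] after';
$out = preg_replace_callback($pattern, 'render_code_tag', $txt);

if ($out === null) {
    // Still check for a silent failure.
    echo 'PCRE error code: ', preg_last_error(), "\n";
} else {
    echo strlen($out), " bytes of output\n";
}
```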
 [2007-08-16 19:00 UTC] drnick at physics dot byu dot edu
I just wanted to throw out that I completely agree with crisp.  We recently updated PHP on our webserver to 5.2.3 and had issues with our template system on input sizes of a certain size (>100K).

The idea of allowing PHP to enforce limits on the PCRE library is fine, but having the default action (when recursion_limit and backtrack_limit are commented-out) be PHP overriding PCRE's internal limits is VERY problematic.  Aside from being incredibly anti backwards-compatible, I believe PHP should not make arbitrary assumptions such as these.

I believe PHP should be changed so that the default action is to make use of PCRE's internal limits, and if people want to enforce their own, they can make that decision via the options. Perhaps modify php.ini-recommended to explain the options and say why having external limits can be good.
 [2007-12-10 14:18 UTC] daan at react dot nl
This issue is still a problem, plus this low setting is also the cause of segfaults.
(see http://bugs.php.net/bug.php?id=43031)

At the moment even this simple regexp segfaults 5.2.5:
preg_match('/(.)*/', str_repeat('x', 6000));

I hope that is not intended behavior, as is suggested in the reply in bug report 43031.
 [2009-08-28 08:54 UTC] tom at r dot je
This is still an issue in 5.3.

The low limit causes scripts which hit it to fail without a warning, notice or error, creating bugs that are hard to track down. For example, something which works fine for one set of data stops working on another set because the data is just longer.

This cannot be the expected behaviour, surely?

At minimum there needs to be a warning. Ideally though, the limit needs to be put to the pcre defaults.
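
Until PHP itself reports the failure, a script can turn the silent NULL into something visible. A minimal sketch, assuming a wrapper of your own; the name preg_replace_or_fail() is hypothetical:

```php
<?php
// Hypothetical wrapper: behaves like preg_replace(), but turns the silent
// NULL result into an exception carrying the preg_last_error() code.
function preg_replace_or_fail($pattern, $replacement, $subject)
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result === null) {
        throw new RuntimeException(
            'preg_replace() failed, preg_last_error() = ' . preg_last_error()
        );
    }
    return $result;
}

try {
    echo preg_replace_or_fail('/b.*b/', '', 'baab' . str_repeat('a', 50)), "\n";
} catch (RuntimeException $e) {
    // Log the failure instead of continuing with a NULL value.
    error_log($e->getMessage());
}
```

The same check applies to the other preg_* functions, although their failure values differ (preg_match() returns FALSE rather than NULL, for example).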
 [2010-04-17 06:43 UTC] wclark at xoom dot org
Please just set the PHP limits to match the default PCRE limits.  People asked for that three years ago.. what's the holdup?  I run into this problem quite regularly when using UNGREEDY matches, which frankly makes no sense (why would UN-greedy matches need more backtracking?) but I'll chalk that up to PCRE's poor implementation.  Regardless, if the PHP defaults were set higher I would never encounter these issues in the first place.
 [2011-02-21 21:10 UTC] jani@php.net
-Package: Feature/Change Request
+Package: PCRE related
-Operating System: all
+Operating System: *
-PHP Version: 5.2.1
+PHP Version: 5.3
 [2011-07-03 19:25 UTC] felipe@php.net
-Status: Open
+Status: Closed
-Assigned To:
+Assigned To: felipe
 [2011-07-03 19:25 UTC] felipe@php.net
This has been changed as of PHP 5.3.7 RC1.
"Increased the backtrack limit from 100000 to 1000000"

Thanks.
 