Bug #70699 Regex limits on repeating patterns
Submitted: 2015-10-12 19:25 UTC
Modified: 2016-07-04 14:13 UTC
Votes:2
Avg. Score:5.0 ± 0.0
Reproduced:2 of 2 (100.0%)
Same Version:2 (100.0%)
Same OS:1 (50.0%)
From: bugs dot php dot net at ss dot st dot tc
Assigned: cmb
Status: Duplicate
Package: *Regular Expressions
PHP Version: 7.0.0RC4
OS: OSX and Linux
Private report: No
CVE-ID: None
 [2015-10-12 19:25 UTC] bugs dot php dot net at ss dot st dot tc
Description:
------------
The following examples work in PHP 5 but fail to produce correct results in PHP 7.

Test script:
---------------
<?php

echo preg_match('/((A)+)/', str_repeat('A', 1363)) ? 1 : 0, PHP_EOL;
echo preg_match('/((A)+)/', str_repeat('A', 1364)) ? 1 : 0, PHP_EOL;

echo preg_match('/((A)*)/', str_repeat('A', 1363)) ? 1 : 0, PHP_EOL;
echo preg_match('/((A)*)/', str_repeat('A', 1364)) ? 1 : 0, PHP_EOL;

echo preg_match_all('/((A)+)/', str_repeat('A', 1363)) ? 1 : 0, PHP_EOL;
echo preg_match_all('/((A)+)/', str_repeat('A', 1364)) ? 1 : 0, PHP_EOL;

echo preg_match_all('/((A)*)/', str_repeat('A', 1363)) ? 1 : 0, PHP_EOL;
echo preg_match_all('/((A)*)/', str_repeat('A', 1364)) ? 1 : 0, PHP_EOL;


Expected result:
----------------
Both PHP 5 and 7:
1
1
1
1
1
1
1
1


Actual result:
--------------
PHP 5:
1
1
1
1
1
1
1
1

PHP 7:
1
0
1
0
1
0
1
0


History

 [2015-10-12 20:32 UTC] requinix@php.net
-Status: Open +Status: Feedback
 [2015-10-12 20:32 UTC] requinix@php.net
You're probably hitting the backtrack limit in PCRE. Compare the values of the pcre.backtrack_limit setting on your two setups.
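
A minimal sketch of that comparison, meant to be run on each setup. The helper name `pcre_settings` is made up for illustration. The script only reads settings and the `PCRE_VERSION` constant, and it avoids type declarations so that it also runs on PHP 5.4 and later:

```php
<?php
// Hypothetical helper: collect the PCRE-related settings of the running PHP build.
// ini_get() returns false for a setting the build does not know
// (pcre.jit, for example, only exists from PHP 7.0 on).
function pcre_settings()
{
    $settings = ['PCRE_VERSION' => PCRE_VERSION];
    foreach (['pcre.backtrack_limit', 'pcre.recursion_limit', 'pcre.jit'] as $key) {
        $value = ini_get($key);
        $settings[$key] = $value === false ? '(not available)' : $value;
    }
    return $settings;
}

// Print one "name = value" line per setting so two setups can be diffed.
foreach (pcre_settings() as $name => $value) {
    echo $name, ' = ', $value, PHP_EOL;
}
```

Running it under both PHP versions and diffing the output shows whether the limits or the JIT setting differ.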
 [2015-10-12 21:32 UTC] bugs dot php dot net at ss dot st dot tc
-Status: Feedback +Status: Open
 [2015-10-12 21:32 UTC] bugs dot php dot net at ss dot st dot tc
pcre.backtrack_limit is set to default value of 1000000 in both configurations.
 [2015-10-12 21:37 UTC] rasmus@php.net
I think the limits work a bit differently with the JIT enabled. I bet that if you set pcre.jit=0 in PHP 7, you will get the same result as in PHP 5.
 [2015-10-12 22:01 UTC] bugs dot php dot net at ss dot st dot tc
You're right Rasmus, jit was the reason. So how do we control the limits with jit then?
 [2015-10-12 22:11 UTC] rasmus@php.net
This is getting more into a PCRE question than a PHP one. See: http://www.pcre.org/original/doc/html/pcrejit.html where it says:

  The error code PCRE_ERROR_MATCHLIMIT is returned by the JIT code if searching a 
  very large pattern tree goes on for too long, as it is in the same 
  circumstance when JIT is not used, but the details of exactly what is counted 
  are not the same. The PCRE_ERROR_RECURSIONLIMIT error code is never returned 
  by JIT execution.
 [2015-10-12 22:35 UTC] bugs dot php dot net at ss dot st dot tc
I see. But I still don't know where to go from here.
Even though I can use preg_last_error() to figure out the error, I can't disable the JIT temporarily to perform a desired match: ini_set() changes the runtime ini value of pcre.jit but doesn't actually enable or disable the JIT (is that expected behaviour?).
I understand that backtracking can be catastrophic sometimes, yet I didn't expect jit to bump me that fast.
Giving up jit totally by setting ini value to 0 would be my last resort.
I'm lost, confused, depressed.
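
For reference, a sketch of the preg_last_error() check mentioned above, assuming PHP 7.0 or later (where the PREG_JIT_STACKLIMIT_ERROR constant exists). The helper name `explain_match` is hypothetical. Note that the test script's `preg_match(...) ? 1 : 0` prints 0 both for "no match" and for an error, so the two cases have to be told apart explicitly:

```php
<?php
// Hypothetical helper: run preg_match() and describe the outcome,
// distinguishing "no match" (0) from an error (false).
function explain_match(string $pattern, string $subject): string
{
    $result = preg_match($pattern, $subject);
    if ($result !== false) {
        return $result === 1 ? 'match' : 'no match';
    }

    // Map the documented preg_last_error() codes back to their constant names.
    $names = [
        PREG_INTERNAL_ERROR        => 'PREG_INTERNAL_ERROR',
        PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
        PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
        PREG_BAD_UTF8_ERROR        => 'PREG_BAD_UTF8_ERROR',
        PREG_BAD_UTF8_OFFSET_ERROR => 'PREG_BAD_UTF8_OFFSET_ERROR',
        PREG_JIT_STACKLIMIT_ERROR  => 'PREG_JIT_STACKLIMIT_ERROR',
    ];
    $code = preg_last_error();

    return 'error: ' . ($names[$code] ?? 'code ' . $code);
}

// The outcome depends on the PHP version, the bundled PCRE and the pcre.jit setting.
echo explain_match('/((A)+)/', str_repeat('A', 1364)), PHP_EOL;
```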
 [2015-10-12 23:05 UTC] rasmus@php.net
Well, if you read that PCRE doc closely, you aren't actually hitting the backtrack limit. What is happening is that the JIT is running for too long, in which case PCRE returns the same PCRE_ERROR_MATCHLIMIT error you get when hitting the match limit without the JIT. I agree this is an annoying API from PCRE, but short of patching PCRE I am not sure what we can do about it, unless there are some other knobs we can tune on the JIT.

Your example doesn't actually backtrack that much. To get the same failing output in PHP 5.5 you have to set pcre.backtrack_limit=2732.

So one way to ensure your stuff will work with the jit might be to lower your backtrack_limit way down to something like 1000 and make sure all your regular expressions are optimized to not backtrack very far. That will make them run a lot faster too. Or turn off the jit.
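
A rough sketch of that suggestion, under two assumptions: that the inner capture group of `/((A)+)/` is not actually needed, so the pattern can be written as `/(A+)/`, and that a limit of 1000 is a reasonable test value. The helper name `matches_under_limit` is hypothetical:

```php
<?php
// Hypothetical helper: run a match with pcre.backtrack_limit temporarily
// lowered, then restore the previous value.
function matches_under_limit(string $pattern, string $subject, int $limit): bool
{
    $previous = ini_set('pcre.backtrack_limit', (string) $limit);
    $result = preg_match($pattern, $subject);
    if ($previous !== false) {
        ini_set('pcre.backtrack_limit', $previous);
    }
    // false (error) and 0 (no match) both count as "did not match".
    return $result === 1;
}

$subject = str_repeat('A', 1364);

// Original pattern: a capture group repeated once per character.
// Whether this matches depends on the PHP version and the JIT setting.
var_dump(matches_under_limit('/((A)+)/', $subject, 1000));

// Rewritten pattern: a single group around a simple repeat.
// It captures the same overall run, but no longer fills group 2.
var_dump(matches_under_limit('/(A+)/', $subject, 1000));
```

The rewrite is only equivalent if nothing reads the second capture group; if the last repeated element is needed, it has to be extracted some other way.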
 [2015-11-07 05:43 UTC] hzmester at freemail dot hu
To me it seems you reached the 32K JIT stack limit. Normally applications use the JIT stack API to support a growable stack. Is it enabled in php?
 [2016-07-04 14:12 UTC] cmb@php.net
-Status: Open +Status: Duplicate -Assigned To: +Assigned To: cmb
 [2016-07-04 14:12 UTC] cmb@php.net
> To me it seems you reached the 32K JIT stack limit.

Thanks for the hint, Zoltán.

> Normally applications use the JIT stack API to support a
> growable stack. Is it enabled in php?

AFAIK that's still not the case. I started a discussion last
year[1], but wasn't able to actually do a PR. I have to catch up on
that.

Anyhow, this ticket appears to be basically a duplicate of bug
#70110.
 [2016-07-04 14:13 UTC] cmb@php.net
Forgot the footnote:
[1] <http://marc.info/?l=php-internals&m=143764967808306&w=2>
 