Bug #34695 preg_replace(): {}-expression overflow
Submitted: 2005-09-30 23:47 UTC
Modified: 2005-10-03 09:41 UTC
From: php at koterov dot ru
Assigned:
Status: Not a bug
Package: Reproducible crash
PHP Version: 4.4.0
OS: Windows XP
Private report: No
CVE-ID: None
 [2005-09-30 23:47 UTC] php at koterov dot ru
Description:
------------
PCRE /X{1,Y}/ with a large Y (near 1000) does not work for some X on Windows (Apache 1.3 + mod_php 4.4.0 or 4.3.10). On Unix (Linux) everything is fine. Maybe a stack overflow?

Reproduce code:
---------------
<?php
define('MAXLEN', 1000);
$text = str_repeat('a', 10000);
$text = preg_replace('/ ( (?: [^<] | < [^>]* >){1,'.MAXLEN.'}) .* /xs', '$1', $text);
die('ok');
?>

Expected result:
----------------
ok

Actual result:
--------------
nothing (PHP exits, but there is no Windows GPF)
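
A silent exit like this gives no hint of what went wrong. The following is a minimal sketch of surfacing regex failures explicitly. It assumes a much newer PHP than the 4.4.0 in this report (preg_last_error() only exists from PHP 5.2, and the type declarations need PHP 7), and checked_preg_replace() is a hypothetical helper name, not a PHP function. It also cannot help if the process itself dies inside PCRE, because the check then never runs.

```php
<?php
// Hypothetical helper, not part of the original report.
// Wraps preg_replace() so that a failed call raises an exception
// instead of passing NULL along silently.
function checked_preg_replace(string $pattern, string $replacement, string $subject): string
{
    // "@" suppresses the warning; the failure is reported via the exception below.
    $result = @preg_replace($pattern, $replacement, $subject);
    if ($result === null) {
        // preg_replace() returns NULL when the pattern fails to compile
        // or when matching aborts with an error.
        throw new RuntimeException(
            'preg_replace failed, preg_last_error() = ' . preg_last_error()
        );
    }
    return $result;
}
```

Calling checked_preg_replace() with the pattern from the reproduce code would then either return the shortened text or throw, rather than leaving the caller with an empty result.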

 [2005-10-02 08:46 UTC] php at koterov dot ru
The snapshot does not work either.
 [2005-10-02 13:00 UTC] sniper@php.net
RTFM: "You should be aware of some limitations of PCRE. Read http://www.pcre.org/pcre.txt for more info."

 [2005-10-03 08:42 UTC] php at koterov dot ru
First. Quoting this article:
<<<
There are some size limitations in PCRE but it is hoped that they will never in practice be relevant.

The maximum length of a compiled pattern is 65539 (sic) bytes if PCRE is compiled with the default internal linkage size of 2. If you want to process regular expressions that are truly enormous, you can compile PCRE with an internal linkage size of 3 or 4 (see the README file in the source distribution and the pcrebuild documentation for details). In these cases the limit is substantially larger. However, the speed of execution will be slower.

All values in repeating quantifiers must be less than 65536. The maximum number of capturing subpatterns is 65535.

There is no limit to the number of non-capturing subpatterns, but the maximum depth of nesting of all kinds of parenthesized subpattern, including capturing subpatterns, assertions, and other types of subpattern, is 200.

The maximum length of a subject string is the largest positive number that an integer variable can hold. However, when using the traditional matching function, PCRE uses recursion to handle subpatterns and indefinite repetition. This means that the available stack space may limit the size of a subject string that can be processed by certain patterns.
>>>

Which limitation did you mean? As you can see, the expression

/((?:[^<]|<[^>]*>){1,1000}).*/xs

does not exceed any of the limits quoted above.

Second. The same RE works in Perl 5.6 and 5.8 even with {1,10000} as the repeating quantifier. But Perl 5.6 certainly uses an older version of libpcre than PHP 4.4.0. So possibly the source of the bug is not in PCRE, but in PHP?

Third. It is NOT a backtracking overflow, because the expression also fails for a string of 1001 "a" characters.
 [2005-10-03 08:49 UTC] rasmus@php.net
Perl doesn't use libpcre at all.  Try this with the command-line pcre test client.  If it works there and not in PHP, then you can blame PHP.
 [2005-10-03 09:41 UTC] php at koterov dot ru
You are right:

> pcretest.exe
  re> / ( (?: [^<] | < [^>]* >){1,1000}) .* /xs
Failed: regular expression too large at offset 0

I don't know why, but the PCRE library itself rejects this RE.

Sorry.
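
The pcretest message points at the compiled-pattern size limit quoted earlier. PCRE's own documentation notes that a counted repeat of a group is compiled by replicating the group, so {1,1000} may plausibly push the compiled pattern past that limit. One way to avoid the large counted quantifier is to match one token at a time and cut the list in PHP. This is a minimal sketch under stated assumptions: first_tokens() is a hypothetical helper, it needs PHP 7 or later, and it is not an exact equivalent of the original pattern, which is unanchored and can behave differently on input that begins with an unclosed "<".

```php
<?php
// Hypothetical workaround, not from the original report.

/**
 * Return the leading run of at most $maxTokens tokens from $text, where a
 * token is either one character other than "<" or a whole "<...>" tag.
 */
function first_tokens(string $text, int $maxTokens): string
{
    // \G anchors each match where the previous one ended, so matching stops
    // at the first position that is not a valid token (e.g. an unclosed "<").
    $count = preg_match_all('/\G(?:[^<]|<[^>]*>)/', $text, $matches);
    if ($count === false) {
        throw new RuntimeException('preg_match_all failed, code ' . preg_last_error());
    }
    // Keep only the first $maxTokens tokens and glue them back together.
    return implode('', array_slice($matches[0], 0, $maxTokens));
}

// Same input as the reproduce code: 10000 "a" characters, cut to 1000 tokens.
$text = str_repeat('a', 10000);
echo strlen(first_tokens($text, 1000)), "\n";
```

The pattern stays tiny regardless of the limit, at the cost of building an array of every token in the subject before slicing it.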
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Tue May 07 04:01:30 2024 UTC