[2009-11-22 18:47 UTC] laszlo dot janszky at gmail dot com
Description:
------------
I have a huge recursive regex (about 500 bytes), which needs a lot of memory for backtracking.
The regex matches on templates like
{command1 arg1=$arg1 arg2=$arg2|modifier2 arg3="text"|modifier3:modarg31:modarg32}
etc....
If I use the regex with preg_match_all, the backtracking memory usage grows superexponentially with the number of commands.
A trendline fitted to my measurements (R^2 = 0.9977) gives:
ln ln M = 0.0787 * N + 1.9304
where
M = backtracking memory used, in bytes
N = number of command calls
I don't think that more than 1 MB of memory usage is normal for a 0.0002 MB string.
The recursion memory usage is normal (under 1 kB). I'm pretty disappointed, because I can't use my template engine due to a badly written PCRE engine.
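To put the fitted trendline in concrete terms, here is a minimal sketch that evaluates it for a given number of commands. The helper name `estimated_backtrack_usage` is hypothetical, the coefficients are the ones quoted in the description, and applying the fit outside the measured range is an assumption.

```php
<?php
// Evaluate the fitted trendline ln ln M = 0.0787 * N + 1.9304,
// i.e. M = exp(exp(0.0787 * N + 1.9304)).
// estimated_backtrack_usage() is a hypothetical helper name, not part of PHP.
function estimated_backtrack_usage(int $n): float
{
    return exp(exp(0.0787 * $n + 1.9304));
}

// Print the estimate for a few command counts (chosen only for illustration).
foreach ([4, 10, 16] as $n) {
    printf("N = %2d  ->  M ~ %.0f\n", $n, estimated_backtrack_usage($n));
}
```

If the fit applies to the regex in the reproduce code, the estimate for N = 10 lands above the 1 Mb pcre.backtrack_limit used there, while N = 4 stays far below it.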
Reproduce code:
---------------
$template1='
{display var=$link}
{display var=$link}
{display var=$link}
{display var=$link}
{display var=$link}
{display var=$link}
{display var=$link}
{display var=$link}
{display var=$link}
{display var=$link}
';
$template2='
{display var=$link}
{display var=$link}
{display var=$link}
{display var=$link}
test test test test test
test test test test test
test test test test test
test test test test test
test test test test test
test test test test test
test test test test test
test test test test test
test test test test test
test test test test test
';
$regex='%\\{(?<function>(?:\\w+))(?:(?<list>\\s(?:[\\w_]+(?:\\s[\\w_]+)*\\s)?(?:\\$\\w+(?:->\\w+|\\.\\w+)*|"(?:.*?)"|\\d+(?:\\.\\d+)?)(?:\\|\\w+(?::(?:\\$\\w+(?:->\\w+|\\.\\w+)*|"(?:.*?)"|\\d+(?:\\.\\d+)?))*)*(?:\\s[\\w_]+(?:\\s[\\w_]+)*\\s(?:\\$\\w+(?:->\\w+|\\.\\w+)*|"(?:.*?)"|\\d+(?:\\.\\d+)?)(?:\\|\\w+(?::(?:\\$\\w+(?:->\\w+|\\.\\w+)*|"(?:.*?)"|\\d+(?:\\.\\d+)?))*)*)*(?:\\s[\\w_]+(?:\\s[\\w_]+)*)?)|(?<hash>(?:\\s\\w+=(?:\\$\\w+(?:->\\w+|\\.\\w+)*|"(?:.*?)"|\\d+(?:\\.\\d+)?)(?:\\|\\w+(?::(?:\\$\\w+(?:->\\w+|\\.\\w+)*|"(?:.*?)"|\\d+(?:\\.\\d+)?))*)*)*))(?:\\}(?<block>.*?(?:(?0).*?)*?)\\{/(?P=function))?\\}%usD';
$one_Mb=1024*1024;
$one_kb=1024;
ini_set('pcre.backtrack_limit', $one_Mb);
ini_set('pcre.recursion_limit', $one_kb);
preg_match_all($regex,$template1,$matches1,PREG_SET_ORDER);
preg_match_all($regex,$template2,$matches2,PREG_SET_ORDER);
echo 'test1:<br />';
echo (!count($matches1)?'failed':'ok').'<br />';
echo 'test2:<br />';
echo (!count($matches2)?'failed':'ok').'<br />';
Expected result:
----------------
test1:
ok
test2:
ok
Actual result:
--------------
test1:
failed
test2:
ok
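The reproduce code only checks count($matches1), so a failure caused by a PCRE limit looks the same as "no match". A minimal sketch of how the two could be told apart with preg_last_error() follows. The helper name `describe_match` is hypothetical, the pattern and subject are the simplified ones from this report, and disabling the JIT via pcre.jit (available from PHP 7) is an assumption made so that the limit is applied by the interpreter.

```php
<?php
// Sketch: tell "no match" apart from "PCRE gave up" after preg_match_all().
// describe_match() is a hypothetical helper name used only for illustration.
function describe_match(string $pattern, string $subject): string
{
    $result = preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);
    if ($result === false) {
        // preg_match_all() returns false on error; ask PCRE which one it was.
        switch (preg_last_error()) {
            case PREG_BACKTRACK_LIMIT_ERROR:
                return 'backtrack limit exhausted';
            case PREG_RECURSION_LIMIT_ERROR:
                return 'recursion limit exhausted';
            default:
                return 'other PCRE error';
        }
    }
    return $result > 0 ? 'ok' : 'no match';
}

// Assumption: turn the JIT off so pcre.backtrack_limit is enforced by the interpreter.
ini_set('pcre.jit', '0');
// Deliberately tiny limit, chosen only to provoke the error.
ini_set('pcre.backtrack_limit', '10');

$pattern = '% {(\w+)(?:} (.*?(?:(?0).*?)*?) {/\1)?} %usDx';
echo describe_match($pattern, ' {display} {display} {/display} {/display} '), "\n";
```

Reporting the error code instead of a bare 'failed' makes it clear whether pcre.backtrack_limit or pcre.recursion_limit is the setting that was hit.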
If I remove the recursive part
(?:\\}(?<block>.*?(?:(?0).*?)*?)\\{/(?P=function))?
from the end of the regex, then it works fine...

In case it is not clear from the test: the 8 tokens withBlock (M1) test string is:
$test=' {display} {display} {display} {display} {display} {display} {display} {display} {/display} {/display} {/display} {/display} {/display} {/display} {/display} {/display} ';
and the 8 tokens withoutBlock (M2) test string is:
$test=' {display} {display} {display} {display} {display} {display} {display} {display} ';

OK. Finally, here is my backtracking memory tester. If it's not a bug, then bye bye.
<?php
header('content-type: text/html; charset=utf-8');
header("Cache-Control: no-cache, must-revalidate");
header("Expires: Sat, 26 Jul 1997 05:00:00 GMT");
ini_set('pcre.recursion_limit', 10000);
$kb=1024;
$Mb=1024*1024;
$max_memory_size=50*$Mb;
$pattern= '% {(\w+)(?:} (.*?(?:(?0).*?)*?) {/\1)?} %usDx';
if (isset($_POST['test']) && $_POST['test']!='')
{
    // First try with the maximum limit; only bisect if the pattern matches at all.
    ini_set('pcre.backtrack_limit', $max_memory_size);
    if (preg_match_all($pattern,$_POST['test'],$m,PREG_SET_ORDER))
    {
        // Bisect for the smallest pcre.backtrack_limit at which the match still succeeds.
        $a=1;
        $b=$max_memory_size;
        while (abs($a-$b)>1)
        {
            $c=(int)(($a+$b)/2);
            ini_set('pcre.backtrack_limit', $c);
            if (preg_match_all($pattern,$_POST['test'],$m,PREG_SET_ORDER))
            {
                $b=$c;
            }
            else
            {
                $a=$c;
            }
        }
    }
?><div style="float: left; margin: 10px; "><?php
    if (isset($c))
    {
?>Memory consumed:<br /><?php echo ($c/$kb); ?>.kb.<br /><?php
    }
    else
    {
?>Memory consumption exceeded the allowed quota, or the pattern does not match.<br /><?php
    }
?><br /><br />Tested text: <br /><pre><?php echo $_POST['test']; ?></pre></div><?php
}
?>
<div style="float: left; margin: 10px;">
<form action="<?php echo $_SERVER['PHP_SELF']; ?>" method="post" enctype="application/x-www-urlencoded; charset=utf-8">
<input type="submit" value="test1" style="vertical-align: top" />
<textarea id="test1" name="test" rows="18">
1.){display}
2.){display}
3.){display}
4.){display}
5.){display}
6.){display}
7.){display}
8.){display}
9.){display}
</textarea>
</form>
</div>
<div style="float: left; margin: 10px;">
<form action="<?php echo $_SERVER['PHP_SELF']; ?>" method="post" enctype="application/x-www-urlencoded; charset=utf-8">
<input type="submit" value="test2" style="vertical-align: top" />
<textarea id="test2" name="test" rows="18">
1.){display}
2.){display}
3.){display}
4.){display}
5.){display}
6.){display}
7.){display}
8.){display}
9.){display}
{/display}
{/display}
{/display}
{/display}
{/display}
{/display}
{/display}
{/display}
{/display}
</textarea>
</form>
</div>
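The bisection used by the tester can also be sketched as a standalone command-line script, without the HTML form. This is a minimal sketch rather than the tester itself: the function names `matches_with_limit` and `min_backtrack_limit` are hypothetical, the two-token subject is chosen only for illustration, and the JIT is switched off on the assumption that the limit should be applied by the interpreter. The number it finds is a pcre.backtrack_limit setting, which this report reads as bytes.

```php
<?php
// Sketch: find the smallest pcre.backtrack_limit at which a match still succeeds.
// Both function names are hypothetical helpers, not part of PHP.
function matches_with_limit(string $pattern, string $subject, int $limit): bool
{
    ini_set('pcre.backtrack_limit', (string) $limit);
    $result = preg_match_all($pattern, $subject, $m, PREG_SET_ORDER);
    // false means a PCRE error (for example the limit was hit); 0 means no match.
    return $result !== false && $result > 0;
}

function min_backtrack_limit(string $pattern, string $subject, int $max): ?int
{
    if (!matches_with_limit($pattern, $subject, $max)) {
        return null;            // no match, or more than $max would be needed
    }
    if (matches_with_limit($pattern, $subject, 1)) {
        return 1;
    }
    $lo = 1;                    // known to fail
    $hi = $max;                 // known to succeed
    while ($hi - $lo > 1) {
        $mid = intdiv($lo + $hi, 2);
        if (matches_with_limit($pattern, $subject, $mid)) {
            $hi = $mid;
        } else {
            $lo = $mid;
        }
    }
    return $hi;                 // smallest limit that was seen to succeed
}

ini_set('pcre.jit', '0');       // assumption: interpreter-only matching
$pattern = '% {(\w+)(?:} (.*?(?:(?0).*?)*?) {/\1)?} %usDx';
$subject = ' {display} {display} {/display} {/display} ';
$limit = min_backtrack_limit($pattern, $subject, 50 * 1024 * 1024);
echo $limit === null ? "no match within the cap\n" : "smallest working limit: $limit\n";
```

Returning the upper bound of the bisection, rather than the last probed value, guarantees that the reported limit is one at which the match was actually observed to succeed.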