[2009-08-23 08:10 UTC] laszlo dot janszky at gmail dot com
Description:
------------
I developed a recursive regex pattern for compiling template patterns. During testing I found this bug, and I managed to reduce it to the following piece of code.
The number of digits matters, and every character (including \n) counts: if the string in the 'y' block is 11 or more characters long, the match fails, but with strings of 10 or fewer characters it works fine.
I hope it's a real bug and not something wrong with my computer. :-)
Reproduce code:
---------------
$pattern='%.*?(?:([a-z])(?:.*?(?:(?R).*?)*?\1)?|$)%sD';
$test='
x
0123456789
x
y
01234567890
y';
preg_match_all($pattern,$test,$matches,PREG_SET_ORDER);
var_dump($matches);
Expected result:
----------------
array(3) {
  [0]=>
  array(2) {
    [0]=>
    string(18) " x 0123456789 x"
    [1]=>
    string(1) "x"
  }
  [1]=>
  array(2) {
    [0]=>
    string(19) " y 01234567890 y"
    [1]=>
    string(1) "y"
  }
  [2]=>
  array(1) {
    [0]=>
    string(0) ""
  }
}
Actual result:
--------------
array(0) { }
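The report does not say what preg_match_all() returned or whether PCRE reported an error. An empty $matches array can also appear when the engine gives up, for example on a backtrack or recursion limit, so one way to narrow this down is to check preg_last_error() right after the call. The sketch below is only that: pcre_error_name() is a hypothetical helper, not something from the report, and the "\r\n" line endings are an assumption inferred from the reported string lengths (18 and 19).

```php
<?php
// Hypothetical helper (not part of the original report):
// translate a preg_last_error() code into its constant name.
function pcre_error_name(int $code): string
{
    $names = [
        PREG_NO_ERROR              => 'PREG_NO_ERROR',
        PREG_INTERNAL_ERROR        => 'PREG_INTERNAL_ERROR',
        PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
        PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
        PREG_BAD_UTF8_ERROR        => 'PREG_BAD_UTF8_ERROR',
        PREG_BAD_UTF8_OFFSET_ERROR => 'PREG_BAD_UTF8_OFFSET_ERROR',
    ];
    // Any code not listed above is reported by number.
    return $names[$code] ?? "unknown error ($code)";
}

// Pattern copied from the reproduce code.
$pattern = '%.*?(?:([a-z])(?:.*?(?:(?R).*?)*?\1)?|$)%sD';

// Assumption: the original test string used CRLF line endings.
$test = "\r\nx\r\n0123456789\r\nx\r\ny\r\n01234567890\r\ny";

$result = preg_match_all($pattern, $test, $matches, PREG_SET_ORDER);
$error  = pcre_error_name(preg_last_error());

// A return value of false together with an error name other than
// PREG_NO_ERROR would mean the engine aborted instead of finding no match.
var_dump($result);
echo $error, "\n";
```

If this prints a limit error, the pcre.backtrack_limit and pcre.recursion_limit ini settings are the first things to look at; if it prints PREG_NO_ERROR, the engine really did conclude that nothing matches.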
The original pattern was this:
'%(?<string>.*?)(?:{\\s*(?<function>[a-z0-9_]+)(?:\\s*(?:(?<hash>(?:(?:\\s+[a-z0-9_]+\\s*=\\s*)?(?:\\$[a-z0-9_]+(?:->[a-z0-9_]+|\\.[a-z0-9_]+)*|\\d+(?:\\.\\d+)?|".*?(?:\\\\".*?)*"))+)|(?<chain>(?:(?:\\s+[a-z0-9_]+(?: [a-z0-9_]+)*\\s+)?(?:\\$[a-z0-9_]+(?:->[a-z0-9_]+|\\.[a-z0-9_]+)*|\\d+(?:\\.\\d+)?|".*?(?:\\\\".*?)*"))+)|(?<list>(?:\\$[a-z0-9_]+(?:->[a-z0-9_]+|\\.[a-z0-9_]+)*|\\d+(?:\\.\\d+)?|".*?(?:\\\\".*?)*")(?:\\s*,\\s*(?:\\$[a-z0-9_]+(?:->[a-z0-9_]+|\\.[a-z0-9_]+)*|\\d+(?:\\.\\d+)?|".*?(?:\\\\".*?)*"))*)))?(?:\\s*}(?<block>.*?(?:(?R).*?)*?){\\s*/(?P=function))?\\s*}|{\\s*\\$(?<variable>[a-z0-9_]+(?:->[a-z0-9_]+|\\.[a-z0-9_]+)*)\\s*}|{\\s*\\*(?<comment>.*?)\\*\\s*}|$)%sDu'
This pattern matches tokens similar to the ones Smarty uses. I need the %string_before(?:function_with_recursive_block|variable|comment|$)% structure because I also have to capture the string before each token, and this is the fastest way to do it. I could do the same with offset capture and a regex structured as %function_with_recursive_block|variable|comment%, but that is the slower way, because I would have to call strlen and substr in a loop. So I need that .*? :-)
But recursive patterns behave strangely. I thought that '%.*?(?:([a-z])(?:(?R)*?\1)?|$)%sD' would work too, but it didn't. Logically, (?R)*? should mean here: "string+token...+string+end_of_the_recursive_part", but "$" is the end of the whole string, not the end of the recursive part. :S
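For comparison, the offset-capture alternative mentioned above could look roughly like the sketch below. It is an illustration only: split_tokens() is a hypothetical name, and the token pattern is cut down to the simple {$variable} case rather than the full pattern. The text before each token is recovered with substr and strlen in a loop, which is exactly the extra work the comment describes.

```php
<?php
// Hypothetical helper: split a template into (string, token) pairs using
// PREG_OFFSET_CAPTURE instead of a leading .*? inside the pattern.
function split_tokens(string $template, string $tokenPattern): array
{
    $found = preg_match_all(
        $tokenPattern,
        $template,
        $matches,
        PREG_SET_ORDER | PREG_OFFSET_CAPTURE
    );
    if ($found === false) {
        throw new RuntimeException('PCRE error code ' . preg_last_error());
    }

    $pieces = [];
    $pos = 0; // byte offset just past the previous token
    foreach ($matches as $m) {
        // With PREG_OFFSET_CAPTURE each group is [matched text, byte offset].
        [$token, $offset] = $m[0];
        $pieces[] = [
            'string' => substr($template, $pos, $offset - $pos),
            'token'  => $token,
        ];
        $pos = $offset + strlen($token);
    }

    // Trailing text after the last token, paired with no token.
    $pieces[] = ['string' => substr($template, $pos), 'token' => null];
    return $pieces;
}

// Simplified token pattern (assumption): variables only, e.g. {$name}.
$pieces = split_tokens('a{$x}b{$y}c', '%{\s*\$[a-z0-9_]+\s*}%');
print_r($pieces);
```

Offsets from PREG_OFFSET_CAPTURE are byte offsets even with the u modifier, so the byte-based substr and strlen are the matching functions to use here.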