Bug #67235 Possessive quantifier problem with preg_match()
Submitted: 2014-05-08 20:41 UTC Modified: 2014-05-12 12:56 UTC
From: david dot a dot schmitt at verizon dot com Assigned:
Status: Closed Package: PCRE related
PHP Version: 5.5.12 OS: Linux (Fedora 20)
Private report: No CVE-ID: None
 [2014-05-08 20:41 UTC] david dot a dot schmitt at verizon dot com
Description:
------------
1. preg_match() behaviour does not match pcregrep behaviour for some patterns involving possessive quantifiers.
2. preg_match() behaviour also appears internally inconsistent when switching from greedy to possessive quantifiers in those cases (+ to ++).

Below is my environment (Fedora 20):

$ php -v
PHP 5.5.12 (cli) (built: May  3 2014 07:10:11) 
Copyright (c) 1997-2014 The PHP Group
Zend Engine v2.5.0, Copyright (c) 1998-2014 Zend Technologies
$ pcregrep -V
pcregrep version 8.33 2013-05-28
$ php -r "phpinfo();" | grep PCRE
PCRE (Perl Compatible Regular Expressions) Support => enabled
PCRE Library Version => 8.33 2013-05-28

Test script:
---------------
#!/bin/sh

export GREEDY='\A(?:[^"]++|"(?:[^"]*+|"")*+")+'
export POSSESSIVE="$GREEDY+"
export TEXT='NON QUOTED "QUOT""ED"'
export PATTERN

SCRIPT='
$t = getenv("TEXT");
$p = getenv("PATTERN");
preg_match("/$p/", $t, $m);
print "$m[0]\n";
'

for PATTERN in "$GREEDY" "$POSSESSIVE"
do
    echo "$TEXT" | pcregrep -o -e "$PATTERN"
    php -r "$SCRIPT"
done
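
The two patterns in the script differ only in the final quantifier on the outer group: `+` (greedy) versus `++` (possessive). The following is a minimal sketch that prints both patterns side by side so the difference is easy to see. It reuses the variable names from the script above and runs no regex engine at all.

```shell
#!/bin/sh
# Build the two patterns exactly as the test script does.
GREEDY='\A(?:[^"]++|"(?:[^"]*+|"")*+")+'
POSSESSIVE="$GREEDY+"   # appending "+" turns the trailing ")+" into ")++"

# printf with %s prints the backslashes and quotes literally.
printf 'greedy:     %s\n' "$GREEDY"
printf 'possessive: %s\n' "$POSSESSIVE"
```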


Expected result:
----------------
NON QUOTED "QUOT""ED"
NON QUOTED "QUOT""ED"
NON QUOTED "QUOT""ED"
NON QUOTED "QUOT""ED"


Actual result:
--------------
NON QUOTED "QUOT""ED"
NON QUOTED "QUOT""ED"
NON QUOTED "QUOT""ED"
NON QUOTED 


 [2014-05-09 13:39 UTC] david dot a dot schmitt at verizon dot com
FYI, to clarify the script results you could change the TEXT setting in the script as follows:

export TEXT='NON QUOTED "QUOT""ED""NOT MATCHED'

Even with that change the expected and actual results remain the same.
 [2014-05-11 12:52 UTC] felipe@php.net
Well, could you clarify what is wrong here? Just give us the pattern which works differently from pcretest.
 [2014-05-11 12:53 UTC] felipe@php.net
-Status: Open +Status: Feedback
 [2014-05-12 12:48 UTC] david dot a dot schmitt at verizon dot com
-Status: Feedback +Status: Open
 [2014-05-12 12:48 UTC] david dot a dot schmitt at verizon dot com
One pattern that behaves differently is the POSSESSIVE pattern from the test script, i.e.:  \A(?:[^"]++|"(?:[^"]*+|"")*+")++

(Obviously, you must surround that pattern with // for PHP, but not for pcregrep).

Use the following text string to observe the difference:

NON QUOTED "QUOT""ED" AFTER "NOT MATCHED

echo "$TEXT" | pcregrep -o -e "$PATTERN" will display:

NON QUOTED "QUOT""ED" AFTER 

but after preg_match($pattern, $text, $matches), $matches[0] contains only:

NON QUOTED 

I originally observed the problem with a longer, more complex pattern, so this must be a more general problem than the single example pattern I am providing here.
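
The preg_match() side of this comparison can be reproduced on its own with a sketch like the one below. It makes a few assumptions that are not part of the original report: a `php` binary on the PATH, square brackets around the match so trailing spaces are visible, and extra branches for "no match" and for a preg_match() error. The output depends on the PHP and PCRE versions in use. The report above observed `NON QUOTED ` on PHP 5.5.12 with PCRE 8.33, and other builds may behave differently.

```shell
#!/bin/sh
# Pattern and text taken from the comment above.
PATTERN='\A(?:[^"]++|"(?:[^"]*+|"")*+")++'
TEXT='NON QUOTED "QUOT""ED" AFTER "NOT MATCHED'
export PATTERN TEXT

# PHP code is single-quoted so the shell leaves $p, $t and $m alone.
PHP_CODE='
$p = getenv("PATTERN");
$t = getenv("TEXT");
$rc = preg_match("/$p/", $t, $m);
if ($rc === false) {
    print "preg_match error code: " . preg_last_error() . "\n";
} elseif ($rc === 0) {
    print "no match\n";
} else {
    // Brackets make a trailing space in the match visible.
    print "[" . $m[0] . "]\n";
}
'

# Assumption: php is installed; skip quietly if it is not.
if command -v php >/dev/null 2>&1; then
    php -r "$PHP_CODE"
else
    echo "php not found, skipping"
fi
```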
 [2014-05-12 12:56 UTC] david dot a dot schmitt at verizon dot com
-Status: Open +Status: Closed
 [2014-05-12 12:56 UTC] david dot a dot schmitt at verizon dot com
How odd.  I just tested with pcretest instead of pcregrep -o and got different results.  This now looks like a PCRE problem, not PHP, so I'm closing this bug.

Here's my test script for future reference:

#!/bin/sh

export PATTERN='\A(?:[^"]++|"(?:[^"]*+|"")*+")++'
export TEXT='NON QUOTED "QUOT""ED" AFTER "NOT MATCHED'

cat <<EOF |
/$PATTERN/
$TEXT
EOF
pcretest

echo "$TEXT" | pcregrep -o -e "$PATTERN"
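
When comparing the two tools, it can help to confirm exactly what the here-document feeds to pcretest. The sketch below prints that input instead of piping it anywhere. The function name `pcretest_input` is hypothetical and exists only for this illustration. The here-document expands `$PATTERN` and `$TEXT` once and does not rescan the result, so the backslash in `\A` reaches pcretest unchanged.

```shell
#!/bin/sh
PATTERN='\A(?:[^"]++|"(?:[^"]*+|"")*+")++'
TEXT='NON QUOTED "QUOT""ED" AFTER "NOT MATCHED'

# Hypothetical helper: emits the same two lines the script above pipes
# into pcretest (a /pattern/ line followed by one subject line).
pcretest_input() {
    cat <<EOF
/$PATTERN/
$TEXT
EOF
}

# Show the input so it can be inspected before running pcretest.
pcretest_input
```

To run the actual comparison, pipe the function into the tool, for example `pcretest_input | pcretest`, assuming pcretest is installed.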
 
Last updated: Thu Jul 03 12:01:33 2025 UTC