Bug #50887 preg_match: last optional sub-patterns ignored when empty
Submitted: 2010-01-30 16:58 UTC Modified: 2019-03-21 15:03 UTC
Votes:15
Avg. Score:4.2 ± 0.8
Reproduced:14 of 14 (100.0%)
Same Version:3 (21.4%)
Same OS:2 (14.3%)
From: harrrrpo at gmail dot com Assigned:
Status: Wont fix Package: PCRE related
PHP Version: 5.3.1 OS: Windows
Private report: No CVE-ID: None

 

 [2010-01-30 16:58 UTC] harrrrpo at gmail dot com
Description:
------------
In preg_match, when optional sub-patterns (using ? or {0,n}) are the last sub-patterns and are empty (i.e. not matched), they are omitted from the $matches array.
This behavior is inconsistent with preg_match_all, and with the case where the empty optional sub-pattern isn't the last one.

Reproduce code:
---------------
$str="1";
preg_match("#\d(\d)?#",$str,$mt);
var_dump($mt);

Expected result:
----------------
array(2) {
  [0]=>
  string(1) "1"
  [1]=>
  string(0) ""
}

(the string(0) "" does appear in all cases with preg_match_all, and with preg_match when there are any additional sub-patterns after it)

Actual result:
--------------
array(1) {
  [0]=>
  string(1) "1"
}

(the value of the sub-pattern vanished)
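
The inconsistency described above can be seen side by side in a short sketch. The third pattern and the variable names ($single, $all, $middle) are illustrative additions, not part of the original report; the comments reflect the behaviour the report describes.

```php
<?php
$str = "1";

// preg_match: the trailing unmatched group is dropped from the result.
preg_match('#\d(\d)?#', $str, $single);
var_dump($single);    // only index 0 is present

// preg_match_all (default PREG_PATTERN_ORDER): the group is kept as "".
preg_match_all('#\d(\d)?#', $str, $all);
var_dump($all[1]);    // one entry, an empty string

// preg_match with a matched group after the empty one: "" is kept.
preg_match('#\d(\d)?(a)#', '1a', $middle);
var_dump($middle);    // "1a", "", "a"
```

Only the first call loses the entry for the optional group, which is the case this report is about.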

 [2010-01-31 11:57 UTC] nlopess@php.net
I don't think we can change that behaviour at this point, for the sake of not breaking BC.
 [2011-06-12 01:52 UTC] cappuccino dot e dot cornetto at gmail dot com
I cannot imagine how fixing it would break anything older.

If I expect 3 submatches from my pattern, but I get 2, then I know (because of the bug)
that the missing submatch is the last one and it’s an empty string. So I add it
myself to the submatches array. Would a programmer do anything different to work
around this bug?

If the bug is fixed, my old code will always get 3 submatches from that pattern.
So my own fix won’t get triggered, and since the last submatch will have the same
value (empty string) as the one my fix would have added, I won’t have any
issue, except a bit of stale, unused code.
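
The caller-side fix described in this comment might look like the following minimal sketch. The helper name preg_match_padded and its $groupCount parameter are hypothetical, and the sketch assumes numbered (not named) groups, since array_pad is applied to a plain list.

```php
<?php
/**
 * Hypothetical helper: run preg_match and pad the result with empty
 * strings so it always has $groupCount + 1 entries (index 0 is the
 * full match). Returns an empty array when there is no match.
 */
function preg_match_padded(string $pattern, string $subject, int $groupCount): array
{
    $matches = [];
    if (preg_match($pattern, $subject, $matches) !== 1) {
        return [];
    }
    // Missing trailing groups are filled in as "".
    return array_pad($matches, $groupCount + 1, '');
}

var_dump(preg_match_padded('#\d(\d)?#', '1', 1));
```

If preg_match already returns every group, array_pad leaves the array unchanged, so the padding simply becomes a no-op, as the comment argues.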
 [2011-07-01 07:11 UTC] arveen dot ponniah at loginbn dot ch
hello i'm tamilboy
 [2011-07-13 08:35 UTC] c dot clix at tiscali dot it
This is not a behaviour.
This is a bug, causing occasional and unexpected errors with programs written based on the reference manual.
Fixing the bug will avoid further occasional and unexpected errors.
 [2011-08-03 12:37 UTC] cappuccino dot e dot cornetto at gmail dot com
Delving into a fix for this bug, I found that it's not limited to the last optional
groups; it affects any trailing groups.

In fact, the following:

<code>
$regex = '(?|(Sat)ur(day)|Sun(day)?)';
preg_match("@$regex@", 'Saturday', $matches); print_r($matches);
preg_match("@$regex@", 'Sunday',   $matches); print_r($matches);
preg_match("@$regex@", 'Sun',      $matches); print_r($matches);
</code>

prints:

Array
(
    [0] => Saturday
    [1] => Sat
    [2] => day
)
Array
(
    [0] => Sunday
    [1] => day
)
Array
(
    [0] => Sun
)

While it should print:

Array
(
    [0] => Saturday
    [1] => Sat
    [2] => day
)
Array
(
    [0] => Sunday
    [1] => day
    [2] => 
)
Array
(
    [0] => Sun
    [1] => 
    [2] => 
)
 [2016-04-19 01:04 UTC] c639615 at trbvn dot com
If you can't/won't fix this bug, PLEASE AT LEAST DOCUMENT IT IN THE MANUAL!!!
 [2019-03-21 14:56 UTC] jflambert at newtrax dot com
I was about to file a duplicate defect when I found this one. Yes I agree, very aggravating that it's not at least in the documentation.

For reference, these were going to be my EXPECTED RESULTS:

php > preg_match('/(a)(b)(c)*/', 'ab', $matches);
php > var_dump($matches);
php shell code:1:
array(4) {
  [0] =>
  string(2) "ab"
  [1] =>
  string(1) "a"
  [2] =>
  string(1) "b"
  [3] =>
  string(0) ""
}
php > preg_match('/(a)(b)(c)*/', 'ab', $matches, PREG_UNMATCHED_AS_NULL);
php > var_dump($matches);
php shell code:1:
array(4) {
  [0] =>
  string(2) "ab"
  [1] =>
  string(1) "a"
  [2] =>
  string(1) "b"
  [3] =>
  NULL
}
php >
 [2019-03-21 15:03 UTC] nikic@php.net
@jflambert: The UNMATCHED_AS_NULL case will be fixed in PHP 7.4, see bug #73948.
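
As a rough sketch of what that means for the example above, assuming PHP 7.4 or later (on older versions the trailing entry is still dropped):

```php
<?php
// With PREG_UNMATCHED_AS_NULL on PHP 7.4+, the trailing unmatched
// group (c)* is reported as NULL instead of being omitted.
preg_match('/(a)(b)(c)*/', 'ab', $matches, PREG_UNMATCHED_AS_NULL);
var_dump($matches);
```

Without the flag, the trailing group is still left out of $matches, so code that needs a fixed number of entries has to pad the array itself.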
 