Bug #50887 preg_match , last optional sub-patterns ignored when empty
Submitted: 2010-01-30 16:58 UTC Modified: 2019-03-21 15:03 UTC
Votes:15
Avg. Score:4.2 ± 0.8
Reproduced:14 of 14 (100.0%)
Same Version:3 (21.4%)
Same OS:2 (14.3%)
From: harrrrpo at gmail dot com Assigned:
Status: Wont fix Package: PCRE related
PHP Version: 5.3.1 OS: Windows
Private report: No CVE-ID: None
 [2010-01-30 16:58 UTC] harrrrpo at gmail dot com
Description:
------------
In preg_match, when optional sub-patterns (using ? or {0,n}) are the last sub-patterns and are empty (i.e. not matched), they are omitted from the $matches array.
This behavior is inconsistent with preg_match_all, and with the case where the empty optional sub-pattern isn't the last one.

Reproduce code:
---------------
$str="1";
preg_match("#\d(\d)?#",$str,$mt);
var_dump($mt);

Expected result:
----------------
array(2) {
  [0]=>
  string(1) "1"
  [1]=>
  string(0) ""
}

(the string(0) "" does appear in all cases with preg_match_all, and with preg_match when there is any additional sub-pattern after it)

Actual result:
--------------
array(1) {
  [0]=>
  string(1) "1"
}

(the value of the sub-pattern vanished)
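
The inconsistency described in this report can be seen side by side in a short script. This is only a sketch; the variable names ($single, $all, $middle) and the third pattern are made up for illustration and are not from the original report.

```php
<?php
$str = "1";

// Trailing optional group is unmatched: preg_match() leaves index 1 out.
preg_match('#\d(\d)?#', $str, $single);

// Same pattern with preg_match_all(): the unmatched group is kept as "".
preg_match_all('#\d(\d)?#', $str, $all);

// Unmatched optional group that is NOT the last one: index 1 is kept as "".
preg_match('#(\d)?(\D)#', 'a', $middle);

var_dump($single, $all, $middle);
```

Only the first call returns fewer elements than the pattern has groups.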

 [2010-01-31 11:57 UTC] nlopess@php.net
I don't think we can change that behaviour at this point for the sake of not breaking BC.
 [2011-06-12 01:52 UTC] cappuccino dot e dot cornetto at gmail dot com
I cannot imagine how fixing it would break anything older.

If I expect 3 submatches from my pattern, but I get 2, then I know (because of the bug) 
that the missing submatch is the last one and it’s an empty string. So I add it 
myself to the submatches array. Would a programmer do anything different to work around this 
bug?

If the bug is fixed, it means that my old code will always get 3 submatches from 
that pattern. So my own fix won’t get triggered, and having the last submatch the 
same value (empty string) as the one my fix would have added, I won’t have any 
issue, except a bit of (stale) unused code.
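
The kind of user-side padding described in this comment might look roughly like the following. This is a minimal sketch, not an official fix: pad_capture_groups() is a hypothetical helper, the group count has to be supplied by the caller, and it assumes the pattern uses only numbered (not named) groups.

```php
<?php
// Hypothetical helper, not part of PHP: fills in any capture-group index
// that preg_match() left out of $matches with an empty string.
// $groupCount is the number of numbered groups in the pattern.
function pad_capture_groups(array $matches, $groupCount)
{
    if (!$matches) {
        return $matches; // no match at all: leave the array empty
    }
    // Array union keeps existing keys and only adds the missing ones.
    return $matches + array_fill(0, $groupCount + 1, '');
}

preg_match('#\d(\d)?#', '1', $mt);
$mt = pad_capture_groups($mt, 1);
var_dump($mt);
```

Because the union only adds missing keys, the helper would leave the array unchanged if preg_match ever started returning the trailing empty groups itself.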
 [2011-07-01 07:11 UTC] arveen dot ponniah at loginbn dot ch
hello i'm tamilboy
 [2011-07-13 08:35 UTC] c dot clix at tiscali dot it
This is not a behaviour.
This is a bug, causing occasional and unexpected errors with programs written based on the reference manual.
Fixing the bug will avoid further occasional and unexpected errors.
 [2011-08-03 12:37 UTC] cappuccino dot e dot cornetto at gmail dot com
Delving into a fix for this bug, I found that it's not limited to trailing optional 
groups; it affects any trailing groups that did not participate in the match. 

In fact, the following:

<code>
$regex = '(?|(Sat)ur(day)|Sun(day)?)';
preg_match("@$regex@", 'Saturday', $matches); print_r($matches);
preg_match("@$regex@", 'Sunday',   $matches); print_r($matches);
preg_match("@$regex@", 'Sun',      $matches); print_r($matches);
</code>

prints:

Array
(
    [0] => Saturday
    [1] => Sat
    [2] => day
)
Array
(
    [0] => Sunday
    [1] => day
)
Array
(
    [0] => Sun
)

While it should print:

Array
(
    [0] => Saturday
    [1] => Sat
    [2] => day
)
Array
(
    [0] => Sunday
    [1] => day
    [2] => 
)
Array
(
    [0] => Sun
    [1] => 
    [2] => 
)
 [2016-04-19 01:04 UTC] c639615 at trbvn dot com
If you can't/won't fix this bug, PLEASE AT LEAST DOCUMENT IT IN THE MANUAL!!!
 [2019-03-21 14:56 UTC] jflambert at newtrax dot com
I was about to file a duplicate defect when I found this one. Yes I agree, very aggravating that it's not at least in the documentation.

For reference these were gonna be my EXPECTED RESULTS

php > preg_match('/(a)(b)(c)*/', 'ab', $matches);
php > var_dump($matches);
php shell code:1:
array(4) {
  [0] =>
  string(2) "ab"
  [1] =>
  string(1) "a"
  [2] =>
  string(1) "b"
  [3] =>
  string(0) ""
}
php > preg_match('/(a)(b)(c)*/', 'ab', $matches, PREG_UNMATCHED_AS_NULL);
php > var_dump($matches);
php shell code:1:
array(4) {
  [0] =>
  string(2) "ab"
  [1] =>
  string(1) "a"
  [2] =>
  string(1) "b"
  [3] =>
  NULL
}
php >
 [2019-03-21 15:03 UTC] nikic@php.net
@jflambert: The UNMATCHED_AS_NULL case will be fixed in PHP 7.4, see bug #73948.
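
For readers on PHP 7.4 or later, the PREG_UNMATCHED_AS_NULL case mentioned above can be checked with a sketch like this one. It assumes PHP 7.4+; on earlier versions the trailing group is still left out, and without the flag preg_match keeps trimming trailing unmatched groups.

```php
<?php
// Assumes PHP 7.4 or later, where trailing unmatched groups are reported
// as NULL when PREG_UNMATCHED_AS_NULL is passed.
preg_match('/(a)(b)(c)*/', 'ab', $matches, PREG_UNMATCHED_AS_NULL);
var_dump($matches);
```

With the flag set, $matches should have one entry per group plus the full match, so index 3 is present and NULL.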
 