Bug #75181 PHP's PCRE API doesn't differentiate empty matches from non-matches
Submitted: 2017-09-11 05:23 UTC
Modified: 2017-09-11 06:56 UTC
Votes: 1
Avg. Score: 5.0 ± 0.0
Reproduced: 1 of 1 (100.0%)
Same Version: 0 (0.0%)
Same OS: 0 (0.0%)
From: jocrutrisi at ibsats dot com
Assigned:
Status: Not a bug
Package: PCRE related
PHP Version: 7.2.0RC1
OS: All
Private report: No
CVE-ID: None
 [2017-09-11 05:23 UTC] jocrutrisi at ibsats dot com
Description:
------------
PCRE differentiates "a sub-pattern that matched and is empty" from "a sub-pattern that didn't match". PHP's exposed APIs, however, report both cases the same way: the sub-pattern key is present and holds an empty string.

For backward-compatibility reasons the fix probably has to be an opt-in flag, but the current behavior is certainly not correct.

Test script:
---------------
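<?php
// Multiline (m), extended (x) pattern. Groups 1 and 2 are optional
// zero-width lookaheads, so each one is either matched-and-empty
// or did not participate in the match at all.
$re = '%^
    ((?=x))?
    ((?=y))?
    (\w)
$%mx';

$str = 'x
y
z';

preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);

// Print the entire match result
var_dump($matches);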

Expected result:
----------------
Run the identical PCRE regex and input on Regex101. The "isParticipating" flag shows which sub-patterns matched (even if empty) and which did not match:

[
  [
    {
      "content": "x",
      "isParticipating": true,
      "groupNum": 0,
    },
    {
      "content": "",
      "isParticipating": true,
      "groupNum": 1,
    },
    {
      "content": "",
      "isParticipating": false,
      "groupNum": 2,
    },
    {
      "content": "x",
      "isParticipating": true,
      "groupNum": 3,
    }
  ],
  [
    {
      "content": "y",
      "isParticipating": true,
      "groupNum": 0,
    },
    {
      "content": "",
      "isParticipating": false,
      "groupNum": 1,
    },
    {
      "content": "",
      "isParticipating": true,
      "groupNum": 2,
    },
    {
      "content": "y",
      "isParticipating": true,
      "groupNum": 3,
    }
  ],
  [
    {
      "content": "z",
      "isParticipating": true,
      "groupNum": 0,
    },
    {
      "content": "",
      "isParticipating": false,
      "groupNum": 1,
    },
    {
      "content": "",
      "isParticipating": false,
      "groupNum": 2,
    },
    {
      "content": "z",
      "isParticipating": true,
      "groupNum": 3,
    }
  ]
]

Source: https://regex101.com/r/N90TJE/1

Actual result:
--------------
Both matched-but-empty and unmatched sub-patterns come back as empty strings, so we can't tell them apart:

array(3) {
  [0]=>
  array(4) {
    [0]=>
    string(1) "x"
    [1]=>
    string(0) ""
    [2]=>
    string(0) ""
    [3]=>
    string(1) "x"
  }
  [1]=>
  array(4) {
    [0]=>
    string(1) "y"
    [1]=>
    string(0) ""
    [2]=>
    string(0) ""
    [3]=>
    string(1) "y"
  }
  [2]=>
  array(4) {
    [0]=>
    string(1) "z"
    [1]=>
    string(0) ""
    [2]=>
    string(0) ""
    [3]=>
    string(1) "z"
  }
}
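
One possible way to tell the two cases apart without a dedicated flag is to request offsets. This is a sketch, not something proposed in the report itself: it assumes that with PREG_OFFSET_CAPTURE an unmatched sub-pattern is reported with offset -1, while a matched-but-empty one carries its real offset. The helper name `participated` is hypothetical.

```php
<?php
// Same pattern and input as the test script above.
$re = '%^
    ((?=x))?
    ((?=y))?
    (\w)
$%mx';
$str = "x\ny\nz";

// With PREG_OFFSET_CAPTURE every group becomes [content, offset].
preg_match_all($re, $str, $matches, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

// Hypothetical helper: a group participated if its offset is not -1.
function participated(array $group): bool
{
    return $group[1] !== -1;
}

foreach ($matches as $set) {
    foreach ($set as $num => $group) {
        printf(
            "match %s, group %d: %s\n",
            $set[0][0],
            $num,
            participated($group) ? 'participated' : 'did not participate'
        );
    }
}
```

The content of groups 1 and 2 is still an empty string in both cases; only the offset differs.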

 [2017-09-11 06:28 UTC] spam2 at rhsoft dot net
Read the manual again and just check the return value.
 [2017-09-11 06:56 UTC] kelunik@php.net
-Status: Open
+Status: Not a bug
 [2017-09-11 06:56 UTC] kelunik@php.net
The PREG_UNMATCHED_AS_NULL flag was introduced in PHP 7.2.

See https://3v4l.org/bu1Sf#v720alpha1.
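
As a minimal sketch of how the flag applies to the test script from the report (assuming PHP 7.2 or later; the `$state` variable and the output wording are illustrative only), unmatched sub-patterns come back as NULL while matched-but-empty ones stay empty strings:

```php
<?php
// Same pattern and input as the test script in the report.
$re = '%^
    ((?=x))?
    ((?=y))?
    (\w)
$%mx';
$str = "x\ny\nz";

// PREG_UNMATCHED_AS_NULL reports groups that did not participate as NULL.
preg_match_all($re, $str, $matches, PREG_SET_ORDER | PREG_UNMATCHED_AS_NULL);

foreach ($matches as $set) {
    foreach ($set as $num => $content) {
        // A strict comparison separates NULL (unmatched) from '' (matched, empty).
        $state = $content === null ? 'did not participate' : 'participated';
        printf("match %s, group %d: %s\n", $set[0], $num, $state);
    }
}
```

A strict `=== null` check is needed here, because a loose comparison or `empty()` would treat NULL and the empty string alike.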
 [2017-09-11 09:10 UTC] jocrutrisi at ibsats dot com
kelunik@php.net

This flag is EXACTLY what was needed, excellent! But it's not documented on php.net yet, hence the report. Sorry for the noise :)
 [2017-09-12 12:51 UTC] kelunik@php.net
Please open a second bug with type "Doc Bug" and a description stating that only the documentation is missing. :-)
 