Bug #75181 PHP's PCRE API doesn't differentiate empty matches from non-matches
Submitted: 2017-09-11 05:23 UTC Modified: 2017-09-11 06:56 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: jocrutrisi at ibsats dot com Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 7.2.0RC1 OS: All
Private report: No CVE-ID: None
 [2017-09-11 05:23 UTC] jocrutrisi at ibsats dot com
Description:
------------
PCRE differentiates "a sub-pattern that matched and is empty" from "a sub-pattern that didn't match". But in PHP's exposed APIs both of these cases produce a sub-pattern key with an empty match.

This could probably be exposed as a flag, since backward compatibility unfortunately rules out changing the default, but the current behavior is certainly not correct.

Test script:
---------------
$re = '%^
    ((?=x))?
    ((?=y))?
    (\w)
$%mx';

$str = 'x
y
z';

preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);

// Print the entire match result
var_dump($matches);

Expected result:
----------------
Run the identical PCRE regex and input on Regex101 and note that the "isParticipating" flag clearly shows which sub-patterns matched (even if empty) and which didn't:

[
  [
    {
      "content": "x",
      "isParticipating": true,
      "groupNum": 0
    },
    {
      "content": "",
      "isParticipating": true,
      "groupNum": 1
    },
    {
      "content": "",
      "isParticipating": false,
      "groupNum": 2
    },
    {
      "content": "x",
      "isParticipating": true,
      "groupNum": 3
    }
  ],
  [
    {
      "content": "y",
      "isParticipating": true,
      "groupNum": 0
    },
    {
      "content": "",
      "isParticipating": false,
      "groupNum": 1
    },
    {
      "content": "",
      "isParticipating": true,
      "groupNum": 2
    },
    {
      "content": "y",
      "isParticipating": true,
      "groupNum": 3
    }
  ],
  [
    {
      "content": "z",
      "isParticipating": true,
      "groupNum": 0
    },
    {
      "content": "",
      "isParticipating": false,
      "groupNum": 1
    },
    {
      "content": "",
      "isParticipating": false,
      "groupNum": 2
    },
    {
      "content": "z",
      "isParticipating": true,
      "groupNum": 3
    }
  ]
]




Source: https://regex101.com/r/N90TJE/1

Actual result:
--------------
Matched-but-empty and unmatched sub-patterns both come back as empty strings, so we can't tell them apart:

array(3) {
  [0]=>
  array(4) {
    [0]=>
    string(1) "x"
    [1]=>
    string(0) ""
    [2]=>
    string(0) ""
    [3]=>
    string(1) "x"
  }
  [1]=>
  array(4) {
    [0]=>
    string(1) "y"
    [1]=>
    string(0) ""
    [2]=>
    string(0) ""
    [3]=>
    string(1) "y"
  }
  [2]=>
  array(4) {
    [0]=>
    string(1) "z"
    [1]=>
    string(0) ""
    [2]=>
    string(0) ""
    [3]=>
    string(1) "z"
  }
}
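
For comparison, here is a hedged sketch of one possible workaround that does not rely on any new flag. It assumes that PREG_OFFSET_CAPTURE reports an offset of -1 for a sub-pattern that did not participate, and that carrying offsets around is acceptable. The variable names and the "\n" line endings are choices made for this illustration, not part of the original report.

```php
<?php
// Same pattern as the test script; explicit "\n" line endings are assumed.
$re = '%^
    ((?=x))?
    ((?=y))?
    (\w)
$%mx';
$str = "x\ny\nz";

// With PREG_OFFSET_CAPTURE every group becomes a [text, offset] pair.
preg_match_all($re, $str, $matches, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

foreach ($matches as $i => $set) {
    foreach ($set as $group => [$text, $offset]) {
        // Assumption: offset -1 marks a group that did not participate.
        $state = $offset === -1 ? 'unmatched' : "matched at offset $offset";
        echo "match $i, group $group: $state\n";
    }
}
```

In the first match, group 1 should then report a real offset while group 2 reports -1, which is the distinction the plain empty strings hide.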

 [2017-09-11 06:28 UTC] spam2 at rhsoft dot net
Read the manual again and just check the return value.
 [2017-09-11 06:56 UTC] kelunik@php.net
-Status: Open +Status: Not a bug
 [2017-09-11 06:56 UTC] kelunik@php.net
PREG_UNMATCHED_AS_NULL has been introduced in PHP 7.2 as a flag.

See https://3v4l.org/bu1Sf#v720alpha1.
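
As a rough illustration of what the flag changes, the sketch below reruns the test script from the report with PREG_UNMATCHED_AS_NULL added. It assumes PHP 7.2 or later and explicit "\n" line endings in the input. The loop and its labels are only for this example and are not taken from the linked 3v4l snippet.

```php
<?php
// Same pattern as the test script above; x = extended, m = multiline.
$re = '%^
    ((?=x))?
    ((?=y))?
    (\w)
$%mx';
$str = "x\ny\nz";

// Unmatched groups are reported as null instead of an empty string.
preg_match_all($re, $str, $matches, PREG_SET_ORDER | PREG_UNMATCHED_AS_NULL);

foreach ($matches as $i => $set) {
    foreach ($set as $group => $value) {
        if ($value === null) {
            $state = 'unmatched';
        } elseif ($value === '') {
            $state = 'matched, empty';
        } else {
            $state = "matched '$value'";
        }
        echo "match $i, group $group: $state\n";
    }
}
```

A strict comparison (`=== null`) is what separates the two cases here. A loose check such as `empty($value)` would treat null and "" the same and lose the distinction again.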
 [2017-09-11 09:10 UTC] jocrutrisi at ibsats dot com
kelunik@php.net

This flag is EXACTLY what was needed, excellent! But it's not documented on php.net yet, hence the report. Sorry for the noise :)
 [2017-09-12 12:51 UTC] kelunik@php.net
Please open a second bug with type "Doc Bug" and a description stating that only the documentation is missing. :-)