Bug #76221 last optional capturing group is not included
Submitted: 2018-04-14 11:59 UTC Modified: 2018-04-14 14:44 UTC
From: wes dot nospam at nospam dot example dot org Assigned: cmb (profile)
Status: Duplicate Package: PCRE related
PHP Version: 7.2.4 OS:
Private report: No CVE-ID: None
 [2018-04-14 11:59 UTC] wes dot nospam at nospam dot example dot org
Not sure this is a bug, adding in case it is.

There are two capturing groups in the regexp,
so I expect $matches to be of length 3.

hope this helps...

Test script:
preg_match("/^(aaa)(?:(x)?)/", "aaay", $matches, PREG_UNMATCHED_AS_NULL);

Expected result:
array(3) {
  [0]=>
  string(3) "aaa"
  [1]=>
  string(3) "aaa"
  [2]=>
  NULL
}

Actual result:
array(2) {
  [0]=>
  string(3) "aaa"
  [1]=>
  string(3) "aaa"
}


 [2018-04-14 14:25 UTC]
-Summary: last optional capturing group is not included when using PREG_UNMATCHED_AS_NULL
+Summary: last optional capturing group is not included
 [2018-04-14 14:25 UTC]
PCRE2 reports to PHP only the highest-numbered capturing group that actually matched (its return value is that number plus one). In this pattern there are 2 groups, but the highest one to capture was #1. PHP then fills in the array only up to that group and does not fill in the rest after it. This happens with or without PREG_UNMATCHED_AS_NULL enabled.
> Offset values that correspond to unused subpatterns at the end of the expression are also set to PCRE2_UNSET. For
> example, if the string "abc" is matched against the pattern (abc)(x(yz)?)? subpatterns 2 and 3 are not matched. The
> return from the function is 2, because the highest used capturing subpattern number is 1.
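
For illustration, the same pattern and subject from the quoted PCRE2 text can be run through preg_match(). This is only a sketch of what is visible from the PHP side, not of the internals: groups 2 and 3 never participate, so without PREG_UNMATCHED_AS_NULL they are simply absent from $matches.

```php
<?php
// Pattern and subject taken from the quoted PCRE2 documentation.
// No PREG_UNMATCHED_AS_NULL here: trailing groups that did not
// participate in the match are dropped from the result array.
preg_match('/(abc)(x(yz)?)?/', 'abc', $matches);

// Only the whole match [0] and group [1] are present.
var_dump(count($matches)); // int(2)
var_dump(array_key_exists(2, $matches)); // bool(false)
```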

You can see that the option itself works correctly by having a later capturing group match, such as with /^(aaa)(?:(x)?)(y)/. Then [1] and [3] will be set while [2] will be null/an empty string.
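
A quick way to check this, using the pattern from the paragraph above (the variable names are just for illustration):

```php
<?php
// A later group (y) matches, so the unmatched group 2 is no longer trailing.
$pattern = '/^(aaa)(?:(x)?)(y)/';

preg_match($pattern, 'aaay', $withFlag, PREG_UNMATCHED_AS_NULL);
preg_match($pattern, 'aaay', $withoutFlag);

var_dump($withFlag[2]);    // NULL
var_dump($withoutFlag[2]); // string(0) ""
var_dump($withFlag[3]);    // string(1) "y"
```

In both calls the array has four entries, which shows the flag itself behaves as intended when the unmatched group is not the last one.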

The most obvious solution is for PHP to determine the number of capturing groups with pcre2_pattern_info() and the PCRE2_INFO_CAPTURECOUNT option instead of using the return value from pcre2_match/pcre2_jit_match.
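
Until something like that is done inside PHP, a userland workaround is possible when the caller already knows how many capturing groups the pattern has. The sketch below is only that: preg_match_padded() is a hypothetical helper, not an existing PHP function, and it assumes the pattern has no named groups (those add string keys and would throw off the count).

```php
<?php
// Hypothetical helper (not part of PHP): pads $matches with null so that
// it always holds the whole match plus $groupCount group entries.
// Assumes the caller passes the correct group count and uses no named groups.
function preg_match_padded($pattern, $subject, $groupCount, &$matches = null)
{
    $result = preg_match($pattern, $subject, $matches, PREG_UNMATCHED_AS_NULL);
    if ($result === 1) {
        // Entry 0 is the whole match, so the target size is $groupCount + 1.
        $matches = array_pad($matches, $groupCount + 1, null);
    }
    return $result;
}

// The test script from this report: two groups, the second one unmatched.
preg_match_padded('/^(aaa)(?:(x)?)/', 'aaay', 2, $matches);
var_dump(count($matches)); // int(3)
var_dump($matches[2]);     // NULL
```

array_pad() leaves the array untouched when it is already long enough, so the helper does no harm if PHP itself fills in the trailing groups.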
 [2018-04-14 14:39 UTC]
-Status: Open
+Status: Duplicate
-Assigned To:
+Assigned To: cmb
 [2018-04-14 14:39 UTC]
This is basically a duplicate of bug #73948.  While the solution
to that bug was not clear due to the BC break, since we have
PREG_UNMATCHED_AS_NULL now, we should cater to that.

@requinix: the number of subpatterns is already available in

[1] <>
 [2018-04-14 14:44 UTC]
I thought this sounded familiar...
Last updated: Sat Jul 13 08:01:29 2024 UTC