php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #80276 (PATTERN_ORDER) Dupl named groups (?J) prefer last alternative even not matched
Submitted: 2020-10-23 14:25 UTC Modified: 2020-10-27 11:11 UTC
From: vicreal at yandex dot ru Assigned:
Status: Verified Package: PCRE related
PHP Version: 7.4.11 OS: Debian 10 x64 (4.19.132-1)
Private report: No CVE-ID: None
View Add Comment Developer Edit
Welcome! If you don't have a Git account, you can't do anything here.
You can add a comment by following this link or if you reported this bug, you can edit this bug over here.
(description)
Block user comment
Status: Assign to:
Package:
Bug Type:
Summary:
From: vicreal at yandex dot ru
New email:
PHP Version: OS:

 

 [2020-10-23 14:25 UTC] vicreal at yandex dot ru
Description:
------------
Affect named capture groups when duplicate names are enabled (?J) In 7.4.10 named capture group always returns the content captured in last alternative, no matter which of the alternatives matched.

In 7.4 this issue
https://bugs.php.net/bug.php?id=79257
was fixed only for PREG_SET_ORDER mode, but in PREG_PATTERN_ORDER mode this bug still has a place. –°hecked in version 7.4.10

Test script:
---------------
preg_match_all('/(?J)(?:(?<g>foo)|(?<g>bar))(?<h>baz)/', 'foobaz', $matches, PREG_PATTERN_ORDER);
var_dump($matches);

Expected result:
----------------
Array
(
    [0] => Array
        (
            [0] => "foobaz"
        )
    [g] => Array
        (
            [0] => "foo"
        )
    [1] => Array
        (
            [0] => "foo"
        )
    [2] => Array
        (
            [0] => ""
        )
    [h] => Array
        (
            [0] => "baz"
        )
    [3] => Array
        (
            [0] => "baz"
        )
)

Actual result:
--------------
Array
(
    [0] => Array
        (
            [0] => "foobaz"
        )
    [g] => Array
        (
            [0] => ""
        )
    [1] => Array
        (
            [0] => "foo"
        )
    [2] => Array
        (
            [0] => ""
        )
    [h] => Array
        (
            [0] => "baz"
        )
    [3] => Array
        (
            [0] => "baz"
        )
)

Patches

Add a Patch

Pull Requests

Add a Pull Request

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2020-10-23 14:32 UTC] vicreal at yandex dot ru
Another example:
preg_match_all('/(?J)<(?<CHAR>\d+)>|<(?<CHAR>\w+)>/u', '1 <01> 2 <PP> 3', $matches, PREG_PATTERN_ORDER);

Actual result:
--------------
Array
(
    [0] => Array
        (
            [0] => "<01>"
            [1] => "<PP>"
        )
    [CHAR] => Array
        (
            [0] => ""    // must be "01"
            [1] => "PP"
        )
    [1] => Array
        (
            [0] => "01"
            [1] => ""
        )
    [2] => Array
        (
            [0] => ""
            [1] => "PP"
        )
)
–°hecked in version 7.4.10
 [2020-10-27 11:11 UTC] nikic@php.net
-Status: Open +Status: Verified
 [2020-10-27 11:11 UTC] nikic@php.net
I can confirm the bug, but it looks hard to fix. Would require merging result sets after the fact, where we don't even have definite information on whether a group matched or was just incidentally empty.
 
PHP Copyright © 2001-2020 The PHP Group
All rights reserved.
Last updated: Fri Dec 04 14:01:23 2020 UTC