php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #80276 (PATTERN_ORDER) Dupl named groups (?J) prefer last alternative even not matched
Submitted: 2020-10-23 14:25 UTC Modified: 2020-10-27 11:11 UTC
From: vicreal at yandex dot ru Assigned:
Status: Verified Package: PCRE related
PHP Version: 7.4.11 OS: Debian 10 x64 (4.19.132-1)
Private report: No CVE-ID: None
Have you experienced this issue?
Rate the importance of this bug to you:

 [2020-10-23 14:25 UTC] vicreal at yandex dot ru
Description:
------------
Affect named capture groups when duplicate names are enabled (?J) In 7.4.10 named capture group always returns the content captured in last alternative, no matter which of the alternatives matched.

In 7.4 this issue
https://bugs.php.net/bug.php?id=79257
was fixed only for PREG_SET_ORDER mode, but in PREG_PATTERN_ORDER mode this bug still has a place. –°hecked in version 7.4.10

Test script:
---------------
preg_match_all('/(?J)(?:(?<g>foo)|(?<g>bar))(?<h>baz)/', 'foobaz', $matches, PREG_PATTERN_ORDER);
var_dump($matches);

Expected result:
----------------
Array
(
    [0] => Array
        (
            [0] => "foobaz"
        )
    [g] => Array
        (
            [0] => "foo"
        )
    [1] => Array
        (
            [0] => "foo"
        )
    [2] => Array
        (
            [0] => ""
        )
    [h] => Array
        (
            [0] => "baz"
        )
    [3] => Array
        (
            [0] => "baz"
        )
)

Actual result:
--------------
Array
(
    [0] => Array
        (
            [0] => "foobaz"
        )
    [g] => Array
        (
            [0] => ""
        )
    [1] => Array
        (
            [0] => "foo"
        )
    [2] => Array
        (
            [0] => ""
        )
    [h] => Array
        (
            [0] => "baz"
        )
    [3] => Array
        (
            [0] => "baz"
        )
)

Patches

Add a Patch

Pull Requests

Add a Pull Request

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2020-10-23 14:32 UTC] vicreal at yandex dot ru
Another example:
preg_match_all('/(?J)<(?<CHAR>\d+)>|<(?<CHAR>\w+)>/u', '1 <01> 2 <PP> 3', $matches, PREG_PATTERN_ORDER);

Actual result:
--------------
Array
(
    [0] => Array
        (
            [0] => "<01>"
            [1] => "<PP>"
        )
    [CHAR] => Array
        (
            [0] => ""    // must be "01"
            [1] => "PP"
        )
    [1] => Array
        (
            [0] => "01"
            [1] => ""
        )
    [2] => Array
        (
            [0] => ""
            [1] => "PP"
        )
)
–°hecked in version 7.4.10
 [2020-10-27 11:11 UTC] nikic@php.net
-Status: Open +Status: Verified
 [2020-10-27 11:11 UTC] nikic@php.net
I can confirm the bug, but it looks hard to fix. Would require merging result sets after the fact, where we don't even have definite information on whether a group matched or was just incidentally empty.
 
PHP Copyright © 2001-2020 The PHP Group
All rights reserved.
Last updated: Fri Dec 04 14:01:23 2020 UTC