Bug #80276 (PATTERN_ORDER) Duplicate named groups (?J) prefer last alternative even if not matched
Submitted: 2020-10-23 14:25 UTC Modified: 2020-10-27 11:11 UTC
From: vicreal at yandex dot ru Assigned:
Status: Verified Package: PCRE related
PHP Version: 7.4.11 OS: Debian 10 x64 (4.19.132-1)
Private report: No CVE-ID: None
 [2020-10-23 14:25 UTC] vicreal at yandex dot ru
Description:
------------
Affects named capture groups when duplicate names are enabled with (?J). In 7.4.10, a named capture group always returns the content captured by the last alternative, no matter which of the alternatives matched.

In 7.4 this issue
https://bugs.php.net/bug.php?id=79257
was fixed only for PREG_SET_ORDER mode; in PREG_PATTERN_ORDER mode the bug is still present. Checked in version 7.4.10.

Test script:
---------------
preg_match_all('/(?J)(?:(?<g>foo)|(?<g>bar))(?<h>baz)/', 'foobaz', $matches, PREG_PATTERN_ORDER);
var_dump($matches);

Expected result:
----------------
Array
(
    [0] => Array
        (
            [0] => "foobaz"
        )
    [g] => Array
        (
            [0] => "foo"
        )
    [1] => Array
        (
            [0] => "foo"
        )
    [2] => Array
        (
            [0] => ""
        )
    [h] => Array
        (
            [0] => "baz"
        )
    [3] => Array
        (
            [0] => "baz"
        )
)

Actual result:
--------------
Array
(
    [0] => Array
        (
            [0] => "foobaz"
        )
    [g] => Array
        (
            [0] => ""
        )
    [1] => Array
        (
            [0] => "foo"
        )
    [2] => Array
        (
            [0] => ""
        )
    [h] => Array
        (
            [0] => "baz"
        )
    [3] => Array
        (
            [0] => "baz"
        )
)

 [2020-10-23 14:32 UTC] vicreal at yandex dot ru
Another example:
preg_match_all('/(?J)<(?<CHAR>\d+)>|<(?<CHAR>\w+)>/u', '1 <01> 2 <PP> 3', $matches, PREG_PATTERN_ORDER);

Actual result:
--------------
Array
(
    [0] => Array
        (
            [0] => "<01>"
            [1] => "<PP>"
        )
    [CHAR] => Array
        (
            [0] => ""    // must be "01"
            [1] => "PP"
        )
    [1] => Array
        (
            [0] => "01"
            [1] => ""
        )
    [2] => Array
        (
            [0] => ""
            [1] => "PP"
        )
)
Checked in version 7.4.10
 [2020-10-27 11:11 UTC] nikic@php.net
-Status: Open +Status: Verified
 [2020-10-27 11:11 UTC] nikic@php.net
I can confirm the bug, but it looks hard to fix. It would require merging result sets after the fact, where we don't even have definite information on whether a group matched or was just incidentally empty.
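
A possible userland workaround, offered as a sketch and not an official recommendation: the report states that bug #79257 was fixed for PREG_SET_ORDER, so one can request set order and transpose the result into pattern order. The helper name match_all_pattern_order() is hypothetical (it is not a PHP function), and filling groups that are absent from a match with "" is an assumption made to mimic what PREG_PATTERN_ORDER normally returns.

```php
<?php
// Hypothetical helper, not part of PHP: runs preg_match_all() in
// PREG_SET_ORDER mode and transposes the result into pattern order.
function match_all_pattern_order(string $pattern, string $subject): array
{
    $count = preg_match_all($pattern, $subject, $sets, PREG_SET_ORDER);
    if ($count === false) {
        return []; // pattern error; callers may prefer to inspect preg_last_error()
    }

    // Collect every group key (numeric and named) seen in any match.
    // Trailing unmatched groups can be missing from individual sets.
    $keys = [];
    foreach ($sets as $set) {
        foreach ($set as $key => $_) {
            $keys[$key] = true;
        }
    }

    // Build one array per group, one entry per match.
    // Assumption: a group absent from a set is reported as "".
    $out = [];
    foreach (array_keys($keys) as $key) {
        $out[$key] = [];
        foreach ($sets as $set) {
            $out[$key][] = $set[$key] ?? '';
        }
    }
    return $out;
}

$result = match_all_pattern_order('/(?J)(?:(?<g>foo)|(?<g>bar))(?<h>baz)/', 'foobaz');
print_r($result['g']);
```

On a build that includes the #79257 fix, the "g" entry should then hold the alternative that actually matched. Unlike real PREG_PATTERN_ORDER output, this sketch returns an empty array when there are no matches, and it omits any group that never appears in any set.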
 
Last updated: Sun Apr 11 08:01:23 2021 UTC