Bug #76188 Change in treating unescaped - in character classes
Submitted: 2018-04-05 15:18 UTC Modified: 2018-04-05 21:44 UTC
From: andi at splitbrain dot org Assigned:
Status: Not a bug Package: PCRE related
PHP Version: master-Git-2018-04-05 (Git) OS: Linux
Private report: No CVE-ID: None
 [2018-04-05 15:18 UTC] andi at splitbrain dot org
In regular expressions, an unescaped minus character (-) in a character class is usually used to declare a range. However, when the range makes no sense, it is treated as a literal minus character. This still seems to be true in most cases in 7.3-dev, except for one special combination where the minus sits between a shortcut character class and a dollar sign, e.g. /[\w-$]/. Previous PHP versions would treat this as word characters, minus, and dollar. PHP 7.3-dev throws an error.

Test script:
$tests = [                                                                      
echo PHP_VERSION;                                                               
echo "\n";                                                                      
foreach($tests as $test) {                                                      
    echo "$test\n";                                                             

Expected result:

Actual result:

Warning: preg_match(): Compilation failed: invalid range in character class at offset 3 in /var/www/test.php on line 16


 [2018-04-05 16:48 UTC]
That's likely a behavioral difference between PCRE and PCRE2.
 [2018-04-05 21:44 UTC]
-Status: Open +Status: Not a bug
 [2018-04-05 21:44 UTC]
It's PCRE2.

> An error is generated if a POSIX character class (see below) or an escape sequence other than one that defines a
> single character appears at a point where a range ending character is expected. For example, [z-\xff] is valid, but
> [A-\d] and [A-[:digit:]] are not.

Note that this previously applied only to the range end, meaning [\d-A] was allowed.
If you update $tests to include [$-\w] then you'll see the error.
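A minimal sketch for checking which of the spellings mentioned above compile on a given build. The helper name `compiles()` is hypothetical (it is not part of PHP), and the results depend on the PHP and PCRE versions in use, so no output is shown here.

```php
<?php
// Hypothetical helper: true if the pattern compiles on this PHP/PCRE build.
function compiles(string $pattern): bool
{
    // preg_match() returns false on failure, including compilation errors;
    // the @ operator suppresses the accompanying warning.
    return @preg_match($pattern, '') !== false;
}

// Patterns taken from the report and the quoted PCRE2 documentation.
$patterns = ['/[\w-$]/', '/[$-\w]/', '/[\d-A]/', '/[z-\xff]/'];

foreach ($patterns as $pattern) {
    echo $pattern, ' => ', compiles($pattern) ? 'compiles' : 'fails to compile', "\n";
}
```

Running this under both an older PHP release and 7.3-dev should make the difference between the two libraries visible without relying on warnings in the log.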

> Perl treats a hyphen as a literal if it appears before or after a POSIX class (see below) or before or after a
> character type escape such as \d or \H. However, unless the hyphen is the last character in the class, Perl
> outputs a warning in its warning mode, as this is most likely a user error. As PCRE2 has no facility for warning,
> an error is given in these cases.

It's now an error to have such a class or escape either before or after the hyphen.
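For code that has to run on both libraries, one option is to avoid the ambiguous range altogether. The sketch below assumes the intended set is "word characters, hyphen, and dollar sign", as described in the report; the variable names and the sample subject string are illustrative only.

```php
<?php
// Two spellings of "word characters, hyphen, dollar sign" that do not
// depend on how an ambiguous range is resolved.
$escaped  = '/^[\w\-$]+$/'; // hyphen escaped with a backslash
$trailing = '/^[\w$-]+$/';  // hyphen placed last in the class

foreach ([$escaped, $trailing] as $pattern) {
    // Check that a string of word characters, a hyphen, and a dollar matches.
    var_dump(preg_match($pattern, 'foo-bar$') === 1);
}
```

Either form states the intent explicitly, so the pattern should not need to change when the underlying PCRE library does.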
Last updated: Wed Jul 17 23:01:28 2024 UTC