Bug #81537 Error in regular expression using negation.
Submitted: 2021-10-18 08:32 UTC Modified: 2021-12-01 15:11 UTC
Avg. Score:4.5 ± 0.5
Reproduced:2 of 2 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: jansverre at uffe dot no Assigned:
Status: Verified Package: PCRE related
PHP Version: 8.0.11 OS: Ubuntu Linux 20.04
Private report: No CVE-ID: None

 [2021-10-18 08:32 UTC] jansverre at uffe dot no
A regular expression that *should* match the text does not. In the code sample, the second preg_match_all call matches the link in the string, but the first does not; the only difference is that the first input string contains an extra letter 'a'.

Test script:
$regex = '/(^|[\n\s\>\\";]+)([^ \,"\t\n\r<>;]+\.[^ \,\"\t\n\r<]+)/iu';

$text = 'This is a normal text., VG';
preg_match_all($regex, $text, $matches);

$text = 'This is normal text., VG';
preg_match_all($regex, $text, $matches);

Expected result:
Both $matches should contain the link

Actual result:
Only the last $matches contains the link.


 [2021-10-18 08:43 UTC]
-Status: Open +Status: Verified
 [2021-10-18 08:43 UTC]
This is a PCRE JIT issue (<> vs.
 [2021-12-01 15:11 UTC]
This is apparently fixed in PCRE2 10.39 (or maybe earlier),
bundled as of PHP 8.1.1.  I'm not sure whether we should update
PHP-8.0 to this PCRE2 version.
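
To see which PCRE2 version and JIT setting a given PHP build actually uses, something like the following sketch may help. It relies only on the PCRE_VERSION constant and the pcre.jit ini setting; the variable names are made up for illustration, and the 10.39 threshold is simply the version mentioned above, not a confirmed fix boundary.

```php
<?php
// PCRE_VERSION looks like "10.39 2021-10-29"; keep only the numeric part.
$pcreVersion = explode(' ', PCRE_VERSION)[0];

// ini_get() returns false when the build has no pcre.jit setting at all.
$jitSetting = ini_get('pcre.jit');

printf(
    "PHP %s, PCRE %s, pcre.jit=%s\n",
    PHP_VERSION,
    $pcreVersion,
    var_export($jitSetting, true)
);

// Illustrative flag: true when the linked library is at least 10.39.
$atLeast1039 = version_compare($pcreVersion, '10.39', '>=');
var_dump($atLeast1039);
```

Running this in each environment being compared (for example the different Docker images) makes it easier to tell whether a difference in behaviour lines up with a difference in the bundled PCRE2 library.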
 [2022-01-04 22:08 UTC] sebastiaodomingos05 at gmail dot com
As far as I can see, it is rather related to the extra space that comes with adding the letter 'a'.
 [2022-09-21 08:24 UTC] antonio at softcodex dot ch
Not sure if it's the same issue.
I have this regex (see below) that doesn't work on 

PHP 7.3.33: 
but WILL work on the Windows binary and the "php:7.3-apache" Docker image

PHP 8.1.10: only tested the Windows binary

preg_match( '/<(\w+)[\s\w\-]+ id="S44_i89ew">/', '<br><div id="S44_i89ew">' , $matches );
Expected result:
array(2) {
[0]=> string(20) "<div id="S44_i89ew">" [1]=> string(2) "di"
}

Actual result:
array(0) { }

Strangely, if you change the + quantifier after [\s\w\-] to * or ?, it works.
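
For reference, here is a small sketch of what the two quantifiers should produce according to the regex semantics. It assumes that switching pcre.jit off before the patterns are first compiled makes PHP use the PCRE interpreter; whether JIT is really behind the failure described above is not confirmed in this report.

```php
<?php
// Assumption: JIT is switched off before the patterns are first compiled,
// so the results come from the interpreter rather than JIT-compiled code.
ini_set('pcre.jit', '0');

$subject = '<br><div id="S44_i89ew">';

// With "+", the class must consume at least one character,
// so (\w+) has to give back the "v" and captures only "di".
preg_match('/<(\w+)[\s\w\-]+ id="S44_i89ew">/', $subject, $plus);

// With "*", the class may match nothing, so (\w+) keeps all of "div".
preg_match('/<(\w+)[\s\w\-]* id="S44_i89ew">/', $subject, $star);

var_dump($plus, $star);
```

Both variants should match the same 20-character tag; only the content of the first capture group differs, because of backtracking.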
 [2023-01-11 06:29 UTC] MadeleineAshton at jourrapide dot com
Similarly, the negated variant of a character class is written "[^ ]" (with ^ inside the square brackets); it matches a single character that is not in the specified set of characters. For example, the regular expression [^abc] matches any single character except a, b, or c.

 [2023-01-23 08:28 UTC] ygwgs79hvh at gmail dot com
The problem is that in regular expressions the negation must go inside square brackets. That is not right. Square brackets are used to match a class of characters (even when negated with a ^), and a class matches a single character.

 [2023-08-09 14:34 UTC] jouni dot makelainen at gmail dot com
I have a similar issue with this (simplified) regexp. I tested it with the PHP Docker images: it works in 7.4-cli and 8.0-cli, but not in 8.1-cli (PCRE 10.39) or 8.2-cli (PCRE 10.40).

It also works if the negation is replaced with whitespace.

Test script:
echo preg_match('/(<\w+[^>]+href>)/', '<li><a href>');

Expected result:

Actual result:
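
Since echo prints "0" for no match and an empty string for false, it can be hard to tell a failed match from a PCRE error in a script like the one above. A minimal sketch that separates the three cases, using a hypothetical helper named describeMatch (not part of PHP):

```php
<?php
// Hypothetical helper: classify a preg_match() call as
// "match", "no match" or "error <code>" (code from preg_last_error()).
function describeMatch(string $pattern, string $subject): string
{
    $result = preg_match($pattern, $subject);
    if ($result === false) {
        return 'error ' . preg_last_error();
    }
    return $result === 1 ? 'match' : 'no match';
}

// The pattern and subject from the test script above.
echo describeMatch('/(<\w+[^>]+href>)/', '<li><a href>'), "\n";
```

Read by hand, the pattern should find "<a href>" in that subject, so "no match" on an affected build would point at the matcher rather than at the pattern.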
 [2024-05-17 08:40 UTC] robert2003blodgett at outlook dot com
Last updated: Thu Jul 25 15:01:29 2024 UTC