Bug #81537 Error in regular expression using negation.
Submitted: 2021-10-18 08:32 UTC
Modified: 2021-12-01 15:11 UTC
From: jansverre at uffe dot no
Assigned:
Status: Verified
Package: PCRE related
PHP Version: 8.0.11
OS: Ubuntu Linux 20.04
Private report: No
CVE-ID: None

 
 [2021-10-18 08:32 UTC] jansverre at uffe dot no
Description:
------------
A regular expression that *should* match part of a text does not match it. In the test script, the second preg_match_all call matches the link in the string, but the first does not. The only difference between the two input strings is that the first one contains an extra letter 'a'.

Test script:
---------------
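<?php
$regex = '/(^|[\n\s\>\\";]+)([^ \,"\t\n\r<>;]+\.[^ \,\"\t\n\r<]+)/iu';

// First input: contains the extra 'a'; the link is not matched.
$text = 'This is a normal text. http://vg.no, VG';
preg_match_all($regex, $text, $matches);
print_r($matches);

// Second input: identical apart from the missing 'a'; the link is matched.
$text = 'This is normal text. http://vg.no, VG';
preg_match_all($regex, $text, $matches);
print_r($matches);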

Expected result:
----------------
Both $matches should contain the link http://vg.no

Actual result:
--------------
Only the last $matches contains the link.

History

 [2021-10-18 08:43 UTC] cmb@php.net
-Status: Open +Status: Verified
 [2021-10-18 08:43 UTC] cmb@php.net
This is a PCRE JIT issue (<https://3v4l.org/0Em35> vs.
<https://3v4l.org/I8BQ4>).
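
One way to check locally whether the JIT is involved is to turn it off before the pattern is first compiled and rerun both inputs. The following is a minimal sketch, not the content of the 3v4l scripts linked above. It assumes that setting pcre.jit to 0 at the very top of the script is enough to keep the pattern from being JIT-compiled; the $texts and $found variables are illustrative names.

```php
<?php
// Assumption: disabling pcre.jit before the first preg_* call means the
// pattern below is compiled and matched without the PCRE JIT.
ini_set('pcre.jit', '0');

$regex = '/(^|[\n\s\>\\";]+)([^ \,"\t\n\r<>;]+\.[^ \,\"\t\n\r<]+)/iu';

$texts = [
    'This is a normal text. http://vg.no, VG',
    'This is normal text. http://vg.no, VG',
];

// Collect the second capture group (the link candidate) for each input.
$found = [];
foreach ($texts as $text) {
    preg_match_all($regex, $text, $matches);
    $found[] = $matches[2];
}

print_r($found);
```

If both entries of $found contain http://vg.no with the JIT disabled, but the first one is empty when the ini_set line is removed, that points to the JIT rather than to the pattern itself.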
 [2021-12-01 15:11 UTC] cmb@php.net
This is apparently fixed in PCRE2 10.39 (or maybe earlier),
bundled as of PHP 8.1.1.  I'm not sure whether we should update
PHP-8.0 to this PCRE2 version.
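
To see whether a given PHP build already uses a PCRE2 release at or above 10.39, the PCRE_VERSION constant can be inspected. This is only a sketch: pcreVersionAtLeast is a hypothetical helper, and it assumes PCRE_VERSION has the usual "version date" form. Since the fix may have landed earlier than 10.39, an older version does not prove the bug is present.

```php
<?php
// Hypothetical helper: compares the numeric part of a PCRE version string
// (assumed to look like "10.39 2021-10-29") against a minimum version.
function pcreVersionAtLeast(string $minimum, string $versionString = PCRE_VERSION): bool
{
    // Keep only the part before the first space, e.g. "10.39".
    $number = explode(' ', trim($versionString))[0];

    return version_compare($number, $minimum, '>=');
}

echo 'PCRE version: ', PCRE_VERSION, PHP_EOL;
echo pcreVersionAtLeast('10.39')
    ? 'PCRE2 is 10.39 or newer.'
    : 'PCRE2 is older than 10.39.', PHP_EOL;
```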
 [2022-01-04 22:08 UTC] sebastiaodomingos05 at gmail dot com
As far as I can see, this is related to the extra space that is added along with the letter 'a', rather than to the letter itself.
 
Last updated: Wed Jan 19 11:03:16 2022 UTC