Bug #81180 Using preg_match,preg_match_all even for non-matching conditions will cause php
Submitted: 2021-06-20 07:53 UTC
Modified: 2021-06-20 08:22 UTC
From: noreply at example dot com
Assigned:
Status: Not a bug
Package: *Regular Expressions
PHP Version: 7.4.20
OS: Windows, FreeBSD
Private report: No
CVE-ID: None
 [2021-06-20 07:53 UTC] noreply at example dot com
Description:
------------
Functions: preg_match_all, preg_match

Pattern: uses \S* or \w* together with \>, for example:
|([\S]*)a\>|
|([\w]*)a\>|

Subject: once it exceeds about 25 KB, the delay becomes intolerable
  ">"         : appears two or more times
  other bytes : any byte


$matches = array();
$pattern = '|'. '([\w]*)'. preg_quote('a>') .'|';
$subject = '>'.str_repeat('1', 500000).'>';
preg_match_all($pattern, $subject,  $matches);


In most other programming languages, a regular expression function called with the same pattern and subject returns almost instantly.

Test script:
---------------
printf("PHP(%s,%s) ", phpversion(), php_sapi_name()); 
$pattern = '|'. '([\S]*)'. preg_quote('a>') .'|';  // [\w] in place of [\S] also reproduces the delay
printf("pattern : $pattern\n");
foreach(array(1,10,25,50,100,250,500) as $i)
{
    $datasize = $i*1000; $matches = array();
    $subject = '>'.str_repeat('1', $datasize-2).'>';
    $starttime = microtime(true);
    preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER); // PHP hangs here for a very long time
    $elapsedtime = (microtime(true) - $starttime);
    printf("%3.0fKB %8.2f sec\n", $datasize/1000, $elapsedtime);
}

Expected result:
----------------
The runtime is zero or a few seconds.

Actual result:
--------------
PHP(8.0.7,cli) pattern : |([\S]*)a\>|
  1KB     0.01 sec
 10KB     0.20 sec
 25KB     1.41 sec
 50KB     5.78 sec
100KB    23.47 sec
250KB   150.56 sec
500KB   634.99 sec

PHP(7.4.20,cli) pattern : |([\S]*)a\>|
  1KB     0.02 sec
 10KB     0.24 sec
 25KB     1.37 sec
 50KB     5.82 sec
100KB    23.47 sec
250KB   146.89 sec
500KB   647.91 sec

PHP(7.3.28,cli) pattern : |([\S]*)a\>|
  1KB     0.00 sec
 10KB     0.00 sec
 25KB     0.00 sec
 50KB     0.00 sec
100KB     0.00 sec
250KB     0.00 sec
500KB     0.00 sec
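
The test script discards the return value of preg_match_all, so the timings alone do not show whether each call ran to completion or stopped early with a PCRE error. A minimal sketch of how that could be checked, assuming a hypothetical helper named run_checked (not part of the report); preg_last_error() and PREG_NO_ERROR are standard PHP:

```php
<?php
// Hypothetical helper (not from the report): run preg_match_all and
// return both the match count and the PCRE error code.
function run_checked(string $pattern, string $subject): array
{
    $matches = array();
    $count = preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);
    return array(
        'count' => $count,            // int on success, false on error
        'error' => preg_last_error(), // PREG_NO_ERROR when the run completed
    );
}

// Small subject with the same shape as the report's: '>' + '1'... + '>'
$result = run_checked('|([\S]*)a\>|', '>' . str_repeat('1', 98) . '>');
printf("count=%s error=%d\n", var_export($result['count'], true), $result['error']);
```

A count of false together with a non-zero error code would indicate that the engine gave up rather than finished, which is worth ruling out before comparing timings across versions.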

 [2021-06-20 08:22 UTC] requinix@php.net
-Status: Open +Status: Not a bug
 [2021-06-20 08:22 UTC] requinix@php.net
Inefficient regular expressions that fall prey to backtracking behaviors are not a bug with PHP. Try using a once-only subpattern to write a more performant regex.
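
A once-only subpattern is written (?>...). With this particular pattern it needs care: "a" and ">" are themselves non-whitespace, so wrapping [\S]* directly in (?>...) lets the group swallow the "a>" and the pattern can no longer match. The sketch below shows that, and then a separate mitigation that is not from this thread: a hypothetical wrapper, match_all_with_literal_guard, that skips the regex when the literal "a>" the pattern requires is absent. It only helps in that case and is an assumption about what the caller needs, not a general fix.

```php
<?php
// Once-only group around [\S]*: it consumes "bca>" whole and never gives
// back "a>", so this prints int(0) even though "a>" is present.
$atomic = preg_match('|((?>[\S]*))a\>|', 'bca>');
var_dump($atomic);

// Hypothetical wrapper (not from the report): '|([\S]*)a\>|' can only
// match where the two bytes "a>" occur, so check for them first.
function match_all_with_literal_guard(string $subject): array
{
    $matches = array();
    if (strpos($subject, 'a>') === false) {
        return $matches; // no "a>" anywhere, so no match is possible
    }
    preg_match_all('|([\S]*)a\>|', $subject, $matches, PREG_SET_ORDER);
    return $matches;
}

// The report's 500KB subject contains no "a>", so the regex never runs.
$none = match_all_with_literal_guard('>' . str_repeat('1', 499998) . '>');
var_dump(count($none));

// A subject that does contain "a>" goes through the original pattern.
$hit = match_all_with_literal_guard('x bca> y');
var_dump($hit[0][1]);
```

The guard leaves the original pattern and its results unchanged. A subject that contains "a>" somewhere but also has long runs of non-whitespace without it can still be slow.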
Last updated: Tue Dec 07 08:03:35 2021 UTC