Request #51777 RegEx matching fails
Submitted: 2010-05-09 18:36 UTC Modified: 2010-05-18 08:15 UTC
From: trevor at ridgebizdev dot com Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 5.3.2 OS: Windows XP
Private report: No CVE-ID: None
 [2010-05-09 18:36 UTC] trevor at ridgebizdev dot com
Description:
------------
When a RegEx has to "look" (backtrack) more than ~32768 times on the way to a successful match, the preg_ functions fail and return what prints as the empty string.



Test script:
---------------
<?php

	// http_get() is provided by the pecl_http extension
	$response = http_get("http://www.travelocity.com");
	// no problem with these first 2 RegExs:
	// collapse all whitespace runs, then match the title lazily from the front
	$response2 = preg_replace('/\s+/', " ", $response);
	$mytitle = preg_replace('/.*?<\s*title\s*>([^<]*)<.*/i', '${1}', $response2);
	echo "\nTitle Match Forward: ".$mytitle."\n\n";
	// now, here's a problem: the greedy .* has to back up from the end of the page
	$mytitle2 = preg_replace('/.*<\s*title\s*>([^<]*)<.*/i', '${1}', $response2);
	echo "\nTitle Match Backward: ".$mytitle2."\n\n";

?>

Expected result:
----------------
$mytitle is extracted and echoed properly, because the lazy RegEx never looks more than 32768 times when it starts at the beginning of the travelocity.com page source. $mytitle2 should be extracted as well, but it never is: the greedy RegEx looks more than 32768 times before it would succeed, and preg_replace() returns the empty string. Matching forward for the title works; matching backward for the title fails on large buffers.

Actual result:
--------------
Title Match Forward: Travelocity Travel: Airline Tickets, Hotels, Flights, Vacations, Cruises &amp; Car Rentals


Title Match Backward: 
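
The same behaviour can be approximated without fetching a live page. The following is a minimal sketch, not the submitter's script: the page content is a made-up stand-in, the backtrack limit is lowered on purpose so that a small input is enough, and the PCRE JIT is switched off (where it exists) on the assumption that the interpreter's limit is the one of interest. On a typical build the greedy pattern is expected to give NULL, which echoes as an empty string, while preg_last_error() reports the backtrack limit.

```php
<?php
// Hypothetical stand-in for the downloaded page: a title near the start,
// followed by a long tail with no newlines (as after the \s+ collapse).
$page = '<html><head><title>Example Title</title></head>' . str_repeat('x', 200000);

// Assumption: a deliberately low limit so the failure shows on a small input.
ini_set('pcre.backtrack_limit', '10000');
// Assumption: turn the PCRE JIT off where the setting exists (PHP 7+).
ini_set('pcre.jit', '0');

// Lazy match from the front: very little backtracking is needed.
$forward = preg_replace('/.*?<\s*title\s*>([^<]*)<.*/i', '${1}', $page);
$forwardError = preg_last_error();

// Greedy match: .* runs to the end of the buffer and backs up one character
// at a time, so the number of steps grows with the length of the tail.
$backward = preg_replace('/.*<\s*title\s*>([^<]*)<.*/i', '${1}', $page);
$backwardError = preg_last_error();

var_dump($forward);
var_dump($backward);
var_dump($backwardError === PREG_BACKTRACK_LIMIT_ERROR);
```

Comparing the return value against null, rather than echoing it, is what distinguishes an engine error from a legitimately empty result.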

Patches

add-bigger-RegEx-engine (last revision 2010-05-09 16:43 UTC by trevor at ridgebizdev dot com)

History

 [2010-05-09 23:02 UTC] felipe@php.net
-Package: Regexps related +Package: PCRE related
 [2010-05-12 09:14 UTC] mike@php.net
-Status: Open +Status: Feedback
 [2010-05-12 09:14 UTC] mike@php.net
Please try using this snapshot:

  http://snaps.php.net/php5.3-latest.tar.gz
 
For Windows:

  http://windows.php.net/snapshots/

Works here. Do you have a pcre.backtrack_limit set?
 [2010-05-12 20:10 UTC] trevor at ridgebizdev dot com
-Status: Feedback +Status: Open
-Type: Bug +Type: Feature/Change Request
-Operating System: Windows XP and Linux Server +Operating System: Windows XP
 [2010-05-12 20:10 UTC] trevor at ridgebizdev dot com
[Pcre]
;PCRE library backtracking limit.
; http://php.net/pcre.backtrack-limit
;pcre.backtrack_limit=100000

;PCRE library recursion limit.
;Please note that if you set this value to a high number you may consume all
;the available process stack and eventually crash PHP (due to reaching the
;stack size limit imposed by the Operating System).
; http://php.net/pcre.recursion-limit
;pcre.recursion_limit=100000

I have nothing set (I must be using the default).  I use Windows XP Professional with 4GB RAM and Core2Duo.
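
When both directives are commented out in php.ini, the values actually in effect can still be read at runtime. A minimal sketch, assuming only the two directive names quoted above; the raised value of 5000000 is an arbitrary illustration, not a recommendation.

```php
<?php
// Report the limits actually in effect, whether or not php.ini sets them.
$backtrackLimit = ini_get('pcre.backtrack_limit');
$recursionLimit = ini_get('pcre.recursion_limit');
echo "pcre.backtrack_limit = $backtrackLimit\n";
echo "pcre.recursion_limit = $recursionLimit\n";

// Raise the backtrack limit for this script only.
// Assumption: 5000000 is an arbitrary illustrative value.
ini_set('pcre.backtrack_limit', '5000000');
$raised = ini_get('pcre.backtrack_limit');
echo "pcre.backtrack_limit is now $raised\n";
```

Changing the value with ini_set() affects only the running script, so it is a way to test whether the limit is the cause before editing php.ini.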
 [2010-05-13 04:19 UTC] trevor at ridgebizdev dot com
Setting pcre.backtrack_limit works around the problem, which makes 51777 a feature request about the installation defaults; however, because no warning is fired when the limit is hit, the report remains a bug.
 [2010-05-17 08:22 UTC] mike@php.net
-Status: Open +Status: Bogus
 [2010-05-17 21:24 UTC] trevor at ridgebizdev dot com
preg_last_error() is stellar.  Why is it not used properly by the preg_ functions?
 [2010-05-18 08:15 UTC] trevor at ridgebizdev dot com
preg_last_error() has only been available since PHP 5.2.0. If you notice my style of ${1}, you know I've been using these functions since before 5.2.0. Who demanded preg_last_error(), and when? Why don't the preg_ functions use preg_last_error() themselves? preg_last_error() works, but it is only half of what is needed to finish the preg_ functions. Whether the expression fails to match or the expression engine crashes (which looks the same as failing to match), the return value should be the argument string, not the empty string. Do you dig it?
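
The behaviour asked for here can be approximated in user code. A minimal sketch, assuming a hypothetical helper named preg_replace_or_subject() (it is not part of PHP): it calls preg_replace(), consults preg_last_error(), and hands back the subject unchanged when the engine reports an error.

```php
<?php
/**
 * Hypothetical helper (not part of PHP): behaves like preg_replace() for a
 * string subject, but on an engine error it returns the subject unchanged
 * and reports the error code through $error instead of yielding NULL.
 */
function preg_replace_or_subject($pattern, $replacement, $subject, &$error = null)
{
    $result = preg_replace($pattern, $replacement, $subject);
    $error = preg_last_error();
    if ($result === null || $error !== PREG_NO_ERROR) {
        return $subject;
    }
    return $result;
}

// Normal case: the replacement is applied and no error is reported.
$clean = preg_replace_or_subject('/\s+/', ' ', "a  b", $cleanError);

// Error case: an invalid UTF-8 subject with the /u modifier makes the
// engine fail, so the subject comes back untouched.
$broken = preg_replace_or_subject('/x/u', 'y', "\xff", $brokenError);

var_dump($clean, $cleanError === PREG_NO_ERROR);
var_dump($broken === "\xff", $brokenError === PREG_BAD_UTF8_ERROR);
```

Returning the subject keeps later string handling from operating on NULL, while the by-reference $error still lets the caller tell a failed engine run apart from a pattern that simply did not match.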
 