Bug #46066 Regular Expression differences between 4.4 and 5.2
Submitted: 2008-09-12 14:05 UTC Modified: 2008-09-15 08:27 UTC
From: ewen dot cumming at gmail dot com Assigned:
Status: Not a bug Package: PCRE related
PHP Version: 5.2CVS-2008-09-12 (snap) OS: Debian Linux
Private report: No CVE-ID: None

 [2008-09-12 14:05 UTC] ewen dot cumming at gmail dot com
Description:
------------
While upgrading our code base from PHP 4.4 to 5.2, I found that the same regular expression gives different results.

Apologies for the large included string and results; if I reduce the input string any further, the problem no longer occurs.

I have tested with:
PHP 4.4.4-8+etch6 (cli) (built: May 16 2008 15:59:34)
Zend Engine v1.3.0

PHP 5.2.0-8+etch11 (cli) (built: May 10 2008 10:46:24)
Zend Engine v2.2.0

And before submitting the bug:

PHP 5.2.7-dev (cli) (built: Sep 12 2008 15:05:09) 
Zend Engine v2.2.0

Reproduce code:
---------------
<?php
include( 'http://www.workingweb.nl/example/input.inc' );

$pattern = "/<!T_([^> ]+)([^>]*)>(.*?)<!T_end\\1>|<!T_([^> ]+)([^>]*)>/si";
preg_match_all( $pattern, $string, $matches );

var_dump($matches);
?>

Expected result:
----------------
This is what I get in PHP 4.4 (and what I would expect after a PHP 5.2 upgrade).

array(6) {
  [0]=>
  array(22) {
    [0]=>
    string(31) "<!T_lang_searchforpublications>"
    [1]=>
    string(15) "<!T_lang_title>"
    [2]=>
    string(11) "<!T_fTitle>"
    [3]=>
    string(16) "<!T_lang_author>"
    [4]=>
    string(12) "<!T_fAuthor>"
    [5]=>
    string(17) "<!T_lang_session>"
    [6]=>
    string(13) "<!T_fSession>"
    [7]=>
    string(17) "<!T_lang_summary>"
    [8]=>
    string(13) "<!T_butCheck>"
    [9]=>
    string(21) "<!T_lang_showsummary>"
    [10]=>
    string(18) "<!T_lang_language>"
    [11]=>
    string(25) "<!T_languageselectwidget>"
    [12]=>
    string(20) "<!T_lang_daterange1>"
    [13]=>
    string(15) "<!T_lang_start>"
    [14]=>
    string(20) "<!T_startdateWidget>"
    [15]=>
    string(20) "<!T_lang_daterange2>"
    [16]=>
    string(13) "<!T_lang_end>"
    [17]=>
    string(18) "<!T_enddateWidget>"
    [18]=>
    string(25) "<!T_lang_publicationtype>"
    [19]=>
    string(21) "<!T_pubtypeselectbox>"
    [20]=>
    string(16) "<!T_lang_search>"
    [21]=>
    string(15) "<!T_lang_clear>"
  }
  [1]=>
  array(22) {
    [0]=>
    string(0) ""
    [1]=>
    string(0) ""
    [2]=>
    string(0) ""
    [3]=>
    string(0) ""
    [4]=>
    string(0) ""
    [5]=>
    string(0) ""
    [6]=>
    string(0) ""
    [7]=>
    string(0) ""
    [8]=>
    string(0) ""
    [9]=>
    string(0) ""
    [10]=>
    string(0) ""
    [11]=>
    string(0) ""
    [12]=>
    string(0) ""
    [13]=>
    string(0) ""
    [14]=>
    string(0) ""
    [15]=>
    string(0) ""
    [16]=>
    string(0) ""
    [17]=>
    string(0) ""
    [18]=>
    string(0) ""
    [19]=>
    string(0) ""
    [20]=>
    string(0) ""
    [21]=>
    string(0) ""
  }
  [2]=>
  array(22) {
    [0]=>
    string(0) ""
    [1]=>
    string(0) ""
    [2]=>
    string(0) ""
    [3]=>
    string(0) ""
    [4]=>
    string(0) ""
    [5]=>
    string(0) ""
    [6]=>
    string(0) ""
    [7]=>
    string(0) ""
    [8]=>
    string(0) ""
    [9]=>
    string(0) ""
    [10]=>
    string(0) ""
    [11]=>
    string(0) ""
    [12]=>
    string(0) ""
    [13]=>
    string(0) ""
    [14]=>
    string(0) ""
    [15]=>
    string(0) ""
    [16]=>
    string(0) ""
    [17]=>
    string(0) ""
    [18]=>
    string(0) ""
    [19]=>
    string(0) ""
    [20]=>
    string(0) ""
    [21]=>
    string(0) ""
  }
  [3]=>
  array(22) {
    [0]=>
    string(0) ""
    [1]=>
    string(0) ""
    [2]=>
    string(0) ""
    [3]=>
    string(0) ""
    [4]=>
    string(0) ""
    [5]=>
    string(0) ""
    [6]=>
    string(0) ""
    [7]=>
    string(0) ""
    [8]=>
    string(0) ""
    [9]=>
    string(0) ""
    [10]=>
    string(0) ""
    [11]=>
    string(0) ""
    [12]=>
    string(0) ""
    [13]=>
    string(0) ""
    [14]=>
    string(0) ""
    [15]=>
    string(0) ""
    [16]=>
    string(0) ""
    [17]=>
    string(0) ""
    [18]=>
    string(0) ""
    [19]=>
    string(0) ""
    [20]=>
    string(0) ""
    [21]=>
    string(0) ""
  }
  [4]=>
  array(22) {
    [0]=>
    string(26) "lang_searchforpublications"
    [1]=>
    string(10) "lang_title"
    [2]=>
    string(6) "fTitle"
    [3]=>
    string(11) "lang_author"
    [4]=>
    string(7) "fAuthor"
    [5]=>
    string(12) "lang_session"
    [6]=>
    string(8) "fSession"
    [7]=>
    string(12) "lang_summary"
    [8]=>
    string(8) "butCheck"
    [9]=>
    string(16) "lang_showsummary"
    [10]=>
    string(13) "lang_language"
    [11]=>
    string(20) "languageselectwidget"
    [12]=>
    string(15) "lang_daterange1"
    [13]=>
    string(10) "lang_start"
    [14]=>
    string(15) "startdateWidget"
    [15]=>
    string(15) "lang_daterange2"
    [16]=>
    string(8) "lang_end"
    [17]=>
    string(13) "enddateWidget"
    [18]=>
    string(20) "lang_publicationtype"
    [19]=>
    string(16) "pubtypeselectbox"
    [20]=>
    string(11) "lang_search"
    [21]=>
    string(10) "lang_clear"
  }
  [5]=>
  array(22) {
    [0]=>
    string(0) ""
    [1]=>
    string(0) ""
    [2]=>
    string(0) ""
    [3]=>
    string(0) ""
    [4]=>
    string(0) ""
    [5]=>
    string(0) ""
    [6]=>
    string(0) ""
    [7]=>
    string(0) ""
    [8]=>
    string(0) ""
    [9]=>
    string(0) ""
    [10]=>
    string(0) ""
    [11]=>
    string(0) ""
    [12]=>
    string(0) ""
    [13]=>
    string(0) ""
    [14]=>
    string(0) ""
    [15]=>
    string(0) ""
    [16]=>
    string(0) ""
    [17]=>
    string(0) ""
    [18]=>
    string(0) ""
    [19]=>
    string(0) ""
    [20]=>
    string(0) ""
    [21]=>
    string(0) ""
  }
}


Actual result:
--------------
Result in PHP 5.2:

array(6) {
  [0]=>
  array(0) {
  }
  [1]=>
  array(0) {
  }
  [2]=>
  array(0) {
  }
  [3]=>
  array(0) {
  }
  [4]=>
  array(0) {
  }
  [5]=>
  array(0) {
  }
}


 [2008-09-12 14:11 UTC] ewen dot cumming at gmail dot com
Note: there should be a space in the regex in my reproduce code (at the point where the line break happened to occur).

The following should set it right:

$pattern = "/<!T_([^> ]+)([^>]*)>(.*?)<!T_end\\1>".
"|<!T_([^> ]+)([^>]*)>/si";
 [2008-09-12 14:29 UTC] felipe@php.net
Use var_dump(preg_last_error()) after preg_match_all() to check whether any of the problems mentioned in the documentation (http://docs.php.net/preg-last-error) has occurred.
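
A minimal sketch of that check, for illustration only. The helper name preg_error_name() and the short $string are assumptions standing in for the reporter's code and input.inc; the constants are the ones PHP defines for preg_last_error(). On an input this small the match is expected to succeed, so the interesting case is what gets printed for the full input.

```php
<?php
// Hypothetical helper (not part of PHP): map the integer returned by
// preg_last_error() to the name of the matching PREG_* constant.
function preg_error_name($code)
{
    $names = array(
        PREG_NO_ERROR              => 'PREG_NO_ERROR',
        PREG_INTERNAL_ERROR        => 'PREG_INTERNAL_ERROR',
        PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
        PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
        PREG_BAD_UTF8_ERROR        => 'PREG_BAD_UTF8_ERROR',
    );
    return isset($names[$code]) ? $names[$code] : 'unknown (' . $code . ')';
}

// Pattern from the report; $string is a small stand-in for input.inc.
$pattern = "/<!T_([^> ]+)([^>]*)>(.*?)<!T_end\\1>|<!T_([^> ]+)([^>]*)>/si";
$string  = '<!T_lang_title> <!T_fTitle>';

$count = preg_match_all($pattern, $string, $matches);

// Check the error state explicitly instead of trusting an empty $matches.
echo 'matches: ', count($matches[0]), "\n";
echo 'preg_last_error(): ', preg_error_name(preg_last_error()), "\n";
```

An empty $matches array combined with an error code other than PREG_NO_ERROR means the engine gave up, which is different from the pattern simply not matching.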

 [2008-09-15 06:46 UTC] ewen dot cumming at gmail dot com
Using preg_last_error() shows that the backtrack limit is exhausted - increasing it (pcre.backtrack_limit) from 100000 to 150000 in php.ini fixes the problem.
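
For reference, the same limit can also be changed per script with ini_set() rather than in php.ini (where the line would be pcre.backtrack_limit = 150000). The sketch below is an illustration under assumptions: the input is a synthetic stand-in for input.inc, the very low limit of 10 is chosen only to provoke the failure on a small string, and the pcre.jit line is an assumption for newer PHP versions, where the JIT may otherwise change how the limit is applied.

```php
<?php
// Assumption: on PHP 7+ turn off the PCRE JIT so the backtrack limit is
// applied by the interpreter; on older versions this call simply returns false.
ini_set('pcre.jit', '0');

// Pattern from the report; $string is a synthetic stand-in for input.inc.
$pattern = "/<!T_([^> ]+)([^>]*)>(.*?)<!T_end\\1>|<!T_([^> ]+)([^>]*)>/si";
$string  = str_repeat('<!T_lang_title> ', 50);

// Deliberately tiny limit, to reproduce the failure on a small input.
ini_set('pcre.backtrack_limit', '10');
preg_match_all($pattern, $string, $matches);
$low      = preg_last_error();
$lowCount = count($matches[0]);

// The value the reporter settled on.
ini_set('pcre.backtrack_limit', '150000');
preg_match_all($pattern, $string, $matches);
$raised      = preg_last_error();
$raisedCount = count($matches[0]);

echo 'limit 10:     error code ', $low, ', matches ', $lowCount, "\n";
echo 'limit 150000: error code ', $raised, ', matches ', $raisedCount, "\n";
```

Raising the limit only hides the cost: the lazy (.*?) in the first alternative scans towards the end of the input for every tag that has no matching <!T_end...>, so the work grows with the size of the input.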

Thanks for your response, apologies for the support question.
 [2008-09-15 08:27 UTC] jani@php.net
Problem solved, no bug.