Bug #33084 the modifier U works incorrectly
Submitted: 2005-05-20 16:48 UTC Modified: 2005-06-07 01:00 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:0 of 0 (0.0%)
From: cs at scanner dot de Assigned:
Status: No Feedback Package: PCRE related
PHP Version: 4.3.11 OS: linux suse
Private report: No CVE-ID: None
 [2005-05-20 16:48 UTC] cs at scanner dot de
Description:
------------
preg_match_all('#<envelope>(.+)</envelope>#smiU',$ricxml,$envelopes);

I have an XML file of roughly 40.5 MB. When I run its contents ($ricxml) through preg_match_all, the result consists of empty arrays instead of arrays filled with the matched text.

Without the U modifier, preg_match_all finds the first match. With the U modifier, it finds no matches at all.
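
For reference, a minimal sketch of what the U modifier changes, run on a tiny made-up string rather than the 40 MB file (the sample data and variable names are assumptions for illustration only). U inverts greediness, so (.+) stops at the first closing tag instead of the last one:

```php
<?php
// Hypothetical sample data; far smaller than the file from the report.
$xml = "<envelope>a</envelope>\n<envelope>b</envelope>";

// Greedy (no U): .+ runs up to the LAST </envelope>, giving one big match.
preg_match_all('#<envelope>(.+)</envelope>#smi', $xml, $greedy);

// Ungreedy (U): .+ stops at the FIRST </envelope>, giving one match per element.
preg_match_all('#<envelope>(.+)</envelope>#smiU', $xml, $ungreedy);

// An inline lazy quantifier (.+?) without U behaves like the U version here.
preg_match_all('#<envelope>(.+?)</envelope>#smi', $xml, $lazy);

// Prints int(1), int(2), bool(true)
var_dump(count($greedy[1]), count($ungreedy[1]), $ungreedy[1] === $lazy[1]);
```

This only shows the intended semantics on small input; it does not reproduce the failure described above, which appears only with very large subject strings.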

for command line:
Configure Command =>  './configure' '--with-mysql=/usr/mysql' '--with-zlib' '--enable-sysvshm=yes'
'--enable-sysvsem=yes' '--enable-shmop=shared' '--with-config-file-path=/etc'
'--enable-track-vars=yes' '--enable-url-includes' '--with-gd=/usr/local'
'--with-jpeg-dir=/usr/local/lib' '--with-pdflib=/usr/local' '--with-zlib-dir=/usr/lib'
'--enable-ftp' '--with-pear' '--with-openssl=/usr/local/openssl' '--enable-calendar'
'--with-crack=/usr/local' '--with-curl=/usr/local' '--enable-dbase' '--enable-dio'
'--enable-exif' '--with-mcrypt=/usr/local' '--with-mhash=/usr/local' '--enable-sockets'
'--enable-wddx' '--enable-xml' '--disable-magic-quotes' '--enable-pcntl'
'--with-mssql=/usr/local/freetds'

Expected result:
----------------
array(4) {
  [0]=>
  array(1) {
    [0]=>
    string(41523804) "<Envelope>
....


Actual result:
--------------
without the U modifier
preg_match_all('#<envelope>(.+)</envelope>#smi',$ricxml,$envelopes);

-- size of xml file: 41524017 Bytes
array(2) {
  [0]=>
  array(0) {
  }
  [1]=>
  array(0) {
  }
}

 [2005-05-20 16:52 UTC] tony2001@php.net
Thank you for this bug report. To properly diagnose the problem, we
need a short but complete example script to be able to reproduce
this bug ourselves. 

A proper reproducing script starts with <?php and ends with ?>,
is max. 10-20 lines long and does not require any external 
resources such as databases, etc.

If possible, make the script source available online and provide
an URL to it here. Try to avoid embedding huge scripts into the report.


 [2005-05-20 17:53 UTC] cs at scanner dot de
<?php
$iterations = 46296; // 46297 --> error
$ricxml = array();
$ricxml[] = '<PRODAT>';
for($i=0;$i<$iterations;$i++){
	/* filler whose only purpose is to produce a really big string */
	$ricxml[] = '<MessageVersion>2</MessageVersion><GeneratingDateTime>2005-05-20T09:47:23</GeneratingDateTime><MessageFunction>4</MessageFunction><Product xsi:type="ProductTypeChange"><Activity>3</Activity><CPC>4021312017957</CPC>';
}
$ricxml[] = '</PRODAT>';
$ricxml[] = '<PRODAT>';
for($i=0;$i<10000;$i++){
	/* filler whose only purpose is to produce a really big string */
	$ricxml[] = '<Product xsi:type="ProductTypeChange"><Activity>3</Activity><CPC>4021312154638</CPC><ExpirationDate>2005-05-18</ExpirationDate></Product>';
}
$ricxml[] = '</PRODAT>';
$ricxml = implode("\r\n",$ricxml);
echo 'Size: '.strlen($ricxml)."\n";
preg_match_all('#<PRODAT>(.+)</PRODAT>#smiU',$ricxml,$envelopes);
echo 'Strlen of serialize: '.strlen(serialize($envelopes))."\n";
/* If the $ricxml string is larger than 22.7 MB, the serialized length is 26, i.e. $envelopes contains only empty arrays; otherwise the real length is returned. */
/* If you remove the U modifier from the pattern and $iterations is greater than 46300, preg_match_all finds ONE occurrence. */
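
The report never established why the result comes back empty. One possible way to narrow it down on newer PHP versions is to check whether the PCRE engine itself gave up (for example on an internal backtracking or recursion limit) rather than genuinely finding nothing. The sketch below assumes PHP 7 or later; preg_last_error() exists since PHP 5.2, so it is not available on the 4.3.11 build from this report. The helper name match_all_checked is hypothetical and not part of PHP:

```php
<?php
/**
 * Hypothetical helper: runs preg_match_all() and throws when the PCRE
 * engine reports a failure, instead of silently handing back nothing.
 */
function match_all_checked(string $pattern, string $subject): array
{
    $matches = [];
    $count = preg_match_all($pattern, $subject, $matches);

    if ($count === false) {
        // preg_last_error() returns a PREG_*_ERROR constant, for example
        // PREG_BACKTRACK_LIMIT_ERROR or PREG_RECURSION_LIMIT_ERROR.
        throw new RuntimeException('PCRE error code: ' . preg_last_error());
    }

    return $matches;
}

// Small made-up input in the shape of the script above.
$result = match_all_checked(
    '#<PRODAT>(.+)</PRODAT>#smiU',
    "<PRODAT>x</PRODAT>\r\n<PRODAT>y</PRODAT>"
);
echo count($result[1]), "\n"; // prints 2
```

If the helper throws on the large input, the pcre.backtrack_limit and pcre.recursion_limit ini settings would be the first things to look at; whether that applies to this particular case is an assumption, not something the report confirms.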
 [2005-06-07 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
 