Bug #24175 String overflow? Segmentation faults
Submitted: 2003-06-13 08:55 UTC Modified: 2003-06-26 18:21 UTC
Votes:1
Avg. Score:3.0 ± 0.0
Reproduced:0 of 0 (0.0%)
From: justinlong at strategicnetwork dot org Assigned:
Status: No Feedback Package: Reproducible crash
PHP Version: 4.3.2 OS: KRUD/RedHat
Private report: No CVE-ID: None
 [2003-06-13 08:55 UTC] justinlong at strategicnetwork dot org
Description:
------------
Have a 50,000 record Postgres database of articles that this code is attempting to process. CGI PHP program takes the HTML file and massages it into a non-HTML subset. Occasional segmentation faults after long runs, and sometimes the following error in the middle of a run:

ll [Fri Jun 13 09:34:23 2003]  Script:  './article-preprocess.php'
---------------------------------------
/usr/local/src/php-4.3.2/ext/standard/string.c(3521) : Block 0x084C9780 status:
Beginning:      OK (allocated on /usr/local/src/php-4.3.2/ext/standard/string.c:3330, 1024 bytes)
      End:      Overflown (magic=0x2A8FCC33 instead of 0x2A8FCC84)
                1 byte(s) overflown
---------------------------------------

51613 Friday, June 6: Back in Court
/usr/local/src/php-4.3.2/ext/standard/string.c(3330) :  Freeing 0x084C97A4 (1024 bytes), script=./article-preprocess.php

Configure line:
./configure --with-pgsql=/usr2/local/pgsql --with-curl=/usr/bin,/usr/shared --with-config-file=/etc --enable-stem --enable-debug



Reproduce code:
---------------
		$article = trim(stripslashes($rec->article));
		if (strlen($article)>512) {
			$article = str_replace("<TD"," <td",$article);
			$article = str_replace("</TD"," </td",$article);
			$article = eregi_replace("[[:cntrl:]]"," ",$article);		// get rid of control characters
			$article = eregi_replace("<P[^>]+>","\n\n\n",$article);
			$article = eregi_replace("<BR[^>]+>","\n\n",$article);
			$article = html_entity_decode($article);					// get rid of HTML entities
			$article = eregi_replace("&[^;]+;"," ",$article);		// get rid of any entities left undecoded
			if (!empty($article)) {
				$article = strtr($article, "?????????????????????????????????????????????????????????????????????", "SOZsozYYuAAAAAAACEEEEIIIIDNOOOOOOUUUUYsaaaaaaaceeeeiiiionoooooouuuuyy"); 
			}
			if (!empty($article)) {
				$article = strip_tags($article,'<td>');
				$article = " <td>".$article;
				$textlines = split("<td",$article);
				foreach ($textlines as $nextstory) {
					if (strpos($nextstory,">")>0) { $nextstory = substr($nextstory,strpos($nextstory,">")+1); }
					$checklines = split("\n",$nextstory);
					if (count($checklines)>0) {
						$totallength=1;
						$totallines=1;
						$totalsingletones=1;
						for ($y=0;$y<count($checklines);$y++) {
							if (strlen($checklines[$y])>0) { 
								$totallines++; 
								$totallength = $totallength + strlen($checklines[$y]); 
								if ($checklines[$y] == "") { $totalsingletones++; }
							}
						}
						if ($totallength/$totallines>15 && $totalsingletons/$totallines<.5 && strlen($nextstory)>512) { $nextstory = $story .= trim(strip_tags($nextstory))." \n\n"; }
					}
				}
			}
		}


Expected result:
----------------
Should come out on the other end with a large chunk of text from an HTML page representing the article in question. It usually runs through 90+ entries before the error cited above occurs, and sometimes through 200+ entries before a segmentation fault occurs.

Actual result:
--------------
Backtrace:
GNU gdb Red Hat Linux (5.1-1)
Copyright 2001 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...

warning: core file may not match specified executable file.
Core was generated by `/usr/local/bin/php -q ./article-preprocess.php'.
Program terminated with signal 11, Segmentation fault.
#0  0x40259490 in ?? ()
(gdb) bt
#0  0x40259490 in ?? ()
#1  0x402593f4 in ?? ()
#2  0x08106d00 in php_XML_SetStartNamespaceDeclHandler (parser=0x9ae572c, start=0x81be214 <alloc_globals+820>) at /usr/local/src/php-4.3.2/ext/xml/expat/xmlparse.c:1012
#3  0x08116e1d in little2_scanLt (enc=0x9ad13cc, ptr=0x81c5534 "m", end=0x9ad2f1c "?\003", nextTokPtr=0x81ba27c) at /usr/local/src/php-4.3.2/ext/xml/expat/xmltok_impl.c:693
#4  0x0811257e in normal_scanLt (enc=0x9ad401c, ptr=0xbfffa610 "x+\e\b", end=0x1 <Address 0x1 out of bounds>, nextTokPtr=0x81ba27c) at /usr/local/src/php-4.3.2/ext/xml/expat/xmltok_impl.c:743
#5  0x08120daa in p_bracket (p=0x81b2494) at /usr/local/src/php-4.3.2/regex/regcomp.c:620
#6  0x081136c6 in normal_prologTok (enc=0x8, ptr=0x0, end=0x3 <Address 0x3 out of bounds>, nextTokPtr=0x0) at /usr/local/src/php-4.3.2/ext/xml/expat/xmltok_impl.c:1107
#7  0x080f2152 in zif_rawurldecode (ht=-1073745616, return_value=0x812a940, this_ptr=0xbffff168, return_value_used=135442423) at /usr/local/src/php-4.3.2/ext/standard/url.c:528
#8  0x0812b180 in ap_php_cvt (arg=-1.9965403080193083, ndigits=-1073745436, decpt=0x8062346, sign=0x812b6d0, eflag=0, buf=0xbffff1a8 "") at /usr/local/src/php-4.3.2/main/snprintf.c:301
#9  0x401f4657 in ?? ()
(gdb) frame 9
#9  0x401f4657 in ?? ()
(gdb) frame 8
#8  0x0812b180 in ap_php_cvt (arg=-1.9965403080193083, ndigits=-1073745436, decpt=0x8062346, sign=0x812b6d0, eflag=0, buf=0xbffff1a8 "") at /usr/local/src/php-4.3.2/main/snprintf.c:301
301                     while ((fj = arg * 10) < 1) {
(gdb)

 [2003-06-13 09:07 UTC] sniper@php.net
Please provide a short but _complete_ stand-alone script.

 [2003-06-13 09:09 UTC] justinlong at strategicnetwork dot org
#!/usr/local/bin/php -q
<?

/*
	This program will take the text in "article" and identify the story to be indexed, placing it in the "story" field
*/

$todaySystem	= mktime();
$todayJulian	= getdate($todaySystem);
$thismorning	= mktime(0,0,0,$todayJulian[mon],$todayJulian[mday],$todayJulian[year]);
$thisweek		= mktime()-(24*60*60*intval($todayJulian['wday']));
$lastyear		= mktime(0,0,0,1,1,$todayJulian[year]-1);
$thisyear		= mktime(0,0,0,1,1,$todayJulian[year]);
$thismonth	= mktime(0,0,0,$todayJulian[mon],1,$todayJulian[year]);
$lastmonth	= mktime(0,0,0,$todayJulian[mon]-1,1,$todayJulian[year]);

$db = pg_connect ("dbname=nsm user=nobody") or die("The Network for Strategic Missions is presently down for software upgrades. Please try again a little later.");

$rs = pg_exec($db,"UPDATE story_progress SET preprocessing=NULL");
$rs = pg_exec($db,"SELECT preprocessing FROM story_progress");
$rec = pg_fetch_object($rs,0);
if ($rec->preprocessing > 0) {
	die();
} else {
	$rs = pg_exec($db,"UPDATE story_progress SET preprocessing=".$todaySystem);
}
set_time_limit(0);

$rs = pg_exec($db,"SELECT count(storyid) from story_headline WHERE article IS NOT NULL and story_preprocessed IS NULL");
$rec = pg_fetch_object($rs,0);
echo $rec->count,' records unprocessed',"\n";

//$rs = pg_exec($db,"SELECT storyid,headline,article from story_headline WHERE storyid=80493 limit 1000");
$rs = pg_exec($db,"SELECT storyid,headline,article from story_headline WHERE article IS NOT NULL and story_preprocessed IS NULL order by storyid desc limit 100");
if ($rs && pg_numrows($rs)>0) {
	for ($x=0;$x<pg_numrows($rs);$x++) {
		$article = "";
		$textlines = "";
		$story = "";
		$rec = pg_fetch_object($rs,$x);
//		echo $rec->storyid,' ',$rec->headline," ";
		$article = trim(stripslashes($rec->article));
		if (strlen($article)>512) {
			$article = str_replace("<TD"," <td",$article);
			$article = str_replace("</TD"," </td",$article);
			$article = eregi_replace("[[:cntrl:]]"," ",$article);		// get rid of control characters
			$article = eregi_replace("<P[^>]+>","\n\n\n",$article);
			$article = eregi_replace("<BR[^>]+>","\n\n",$article);
			$article = html_entity_decode($article);					// get rid of HTML entities
			$article = eregi_replace("&[^;]+;"," ",$article);		// get rid of any entities left undecoded
			if (!empty($article)) {
				$article = strtr($article, "?????????????????????????????????????????????????????????????????????", "SOZsozYYuAAAAAAACEEEEIIIIDNOOOOOOUUUUYsaaaaaaaceeeeiiiionoooooouuuuyy"); 
			}
			if (!empty($article)) {
				$article = strip_tags($article,'<td>');
				$article = " <td>".$article;
				$textlines = split("<td",$article);
				foreach ($textlines as $nextstory) {
					if (strpos($nextstory,">")>0) { $nextstory = substr($nextstory,strpos($nextstory,">")+1); }
					$checklines = split("\n",$nextstory);
					if (count($checklines)>0) {
						$totallength=1;
						$totallines=1;
						$totalsingletones=1;
						for ($y=0;$y<count($checklines);$y++) {
							if (strlen($checklines[$y])>0) { 
								$totallines++; 
								$totallength = $totallength + strlen($checklines[$y]); 
								if ($checklines[$y] == "") { $totalsingletones++; }
							}
						}
						if ($totallength/$totallines>15 && $totalsingletons/$totallines<.5 && strlen($nextstory)>512) { $nextstory = $story .= trim(strip_tags($nextstory))." \n\n"; }
					}
				}
			}
		}
		if (strlen($story)>512) { echo $rec->headline,"\n"; }
		pg_exec($db,"UPDATE story_headline SET story_preprocessed=$todaySystem, story='".addslashes($story)."' WHERE storyid=".$rec->storyid);
	}
}

$rs = pg_exec($db,"UPDATE story_progress SET preprocessing=NULL");

pg_exec($db,"ANALYZE story_headline");
pg_exec($db,"ANALYZE story_site");

?>
 [2003-06-13 09:16 UTC] sniper@php.net
What part of 'short but _complete_ stand-alone script' did you
not understand? You must give us the exact piece of code that
causes the crash, NOT the whole script.

 [2003-06-13 09:28 UTC] justinlong at strategicnetwork dot org
On the basis of your snippy reply I presume I have not clearly communicated the error that I am encountering. This is the only script which causes this problem. The point where it occurs is toward the end, at the lines

if (strlen($story)>512) { echo $rec->headline,"\n"; }
pg_exec($db,"UPDATE story_headline SET story_preprocessed=$todaySystem, story='".addslashes($story)."' WHERE storyid=".$rec->storyid);

But obviously it's not these two lines which cause the problem, since the crash does not occur on one specific entry in the database but rather at different points in the run each time. This is the only script which crashes; I don't encounter this problem anywhere else.
 [2003-06-18 13:20 UTC] sniper@php.net
Please provide a short, self-contained script that we can
copy'n'paste and run ourselves. Anything that uses external resources, such as databases, is useless.

 [2003-06-26 18:21 UTC] sniper@php.net
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Open". Thank you.
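
For reference, a minimal sketch of the kind of short, self-contained script requested above. It does not come from the reporter. It assumes the fault lies in the string and regex calls rather than in the pgsql extension, and it replaces the database with synthetic HTML. The helper names ci_replace(), make_article() and clean_article() are hypothetical. ci_replace() calls eregi_replace() where it exists (as in PHP 4.3.2) and falls back to preg_replace() only so the sketch also runs on current PHP, which no longer ships the POSIX regex functions. Whether this input triggers the reported overflow is not known.

```php
<?php
// Stand-alone sketch: no database, synthetic input only.
// ci_replace(), make_article() and clean_article() are hypothetical names.

// Case-insensitive regex replace. Uses the POSIX function from the report
// when it is available, otherwise an equivalent PCRE pattern.
function ci_replace($posix, $pcre, $replacement, $subject) {
    if (function_exists('eregi_replace')) {
        return eregi_replace($posix, $replacement, $subject);
    }
    return preg_replace($pcre, $replacement, $subject);
}

// Build a synthetic HTML "article" whose size varies with $i.
function make_article($i) {
    $para = "<P class=\"x\">Paragraph $i &amp; an entity &nbsp; and a tab\there.<BR clear=\"all\">";
    return "<TABLE><TR><TD width=\"100\">"
         . str_repeat($para, 20 + ($i % 50))
         . "</TD></TR></TABLE>";
}

// The same sequence of string calls as the first half of the reported loop.
function clean_article($article) {
    $article = trim(stripslashes($article));
    $article = str_replace("<TD", " <td", $article);
    $article = str_replace("</TD", " </td", $article);
    $article = ci_replace("[[:cntrl:]]", '/[[:cntrl:]]/', " ", $article);
    $article = ci_replace("<P[^>]+>", '/<P[^>]+>/i', "\n\n\n", $article);
    $article = ci_replace("<BR[^>]+>", '/<BR[^>]+>/i', "\n\n", $article);
    $article = html_entity_decode($article);
    $article = ci_replace("&[^;]+;", '/&[^;]+;/', " ", $article);
    return strip_tags($article, '<td>');
}

// Run the cleaning step repeatedly, as the original script does per record.
$processed = 0;
for ($i = 0; $i < 300; $i++) {
    $out = clean_article(make_article($i));
    $processed++;
}
echo $processed, " articles processed\n";
```

If a run like this stays clean on an --enable-debug build, the next step would be to dump one of the real articles that precedes a crash to a file and feed that to clean_article() instead of the synthetic text.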


 