Bug #18376 After an arbitrary number of nodes DOMXML dies on large XML documents
Submitted: 2002-07-16 16:02 UTC Modified: 2002-07-17 17:11 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:1 (100.0%)
From: watford at uiuc dot edu Assigned: jtate (profile)
Status: Closed Package: DOM XML related
PHP Version: 4.2.1 latest OS: Windows 2000 Pro
Private report: No CVE-ID: None
 [2002-07-16 16:02 UTC] watford at uiuc dot edu
When I run my script, which parses a DOMXML object into smaller, more manageable objects, it dies on long XML documents (40+ elements) with the error: Unhandled exception in Apache.exe (PHP4TS.DLL): 0xC0000005: Access Violation. The Apache version is 2.0.36; I have not yet tried this with the 1.3.x branch of Apache. Here is the disassembly at the point in PHP4TS.DLL where it died:

   006A6391   mov         esi,dword ptr [esp+8]
   006A6395   mov         eax,dword ptr [esi]
-> 006A6397   dec         word ptr [eax+0Ah]
   006A639B   mov         eax,dword ptr [esi]
   006A639D   mov         cx,word ptr [eax+0Ah]
   006A63A1   test        cx,cx

Here is my PHP Script:
<?php

class xmlNodeProperties
{
	function xmlNodeProperties(/*void*/)
	{ }
	
	/*String*/ function getAttribute(/*String*/ $key)
	{ return $this->$key; }
	
	/*void*/ function setAttribute(/*String*/ $key, /*String*/ $val)
	{ $this->$key = $val; }
}

class xmlNode extends xmlNodeProperties
{
	var /*String*/		$nodeName;
	var /*String*/		$cdata;
	var /*xmlNode[]*/	$childNodes;
	
	function xmlNode(/*void*/)
	{
		$this->xmlNodeProperties();
	}
}

class CXMLParser
{
	var /*xmlNode*/		$root;

	function CXMLParser(/*void*/)
	{
		if(!$this->use_file) {
			/*domxmlTree*/ $t = xmltree($this->xml_str);
			$this->xml_str = "";
		} else {
			/*domxmlTree*/ $t = xmltree(domxml_dump_mem(xmldocfile($this->file)));
		}
		$this->root = new xmlNode();
		$this->root->cdata = $this->get_cdata($t->children[0]);
		$this->root->nodeName = $t->children[0]->tagname;
		
		foreach($t->children[0]->attributes as /*domxmlNode*/ $a)
		{
			$this->root->setAttribute($a->name, $a->value);
		}
		
		foreach($t->children[0]->children as /*domxmlNode*/ $c)
		{
			/*xmlNode*/ $n = $this->parse($c);
			if($n === false) { }
			else {
				$this->root->childNodes[] = $n;
			}
		}
		
	}
	
	/*xmlNode*/ function parse(/*domxmlNode*/ $domnode)
	{
		if($domnode->type != 1) return false;
		/*xmlNode*/ $node = new xmlNode();
		$node->cdata = $this->get_cdata($domnode);
		$node->nodeName = $domnode->tagname;
		
		foreach($domnode->attributes as /*domxmlNode*/ $a)
		{
			$node->setAttribute($a->name, $a->value);
		}
		
		foreach($domnode->children as /*domxmlNode*/ $c)
		{
			/*xmlNode*/ $n = $this->parse($c);
			if($n === false) { }
			else {
				$node->childNodes[] = $n;
			}
		}
		
		return $node;
	}
	
	/*String*/ function get_cdata(/*domxmlNode*/ $node)
	{
		foreach($node->children as /*domxmlNode*/ $c)
		{
			if($c->type == 3)
			{
				return $c->content;
			}
		}
		
		return "";
	}
}

class CXMLDocument extends CXMLParser
{
	var /*String*/		$file;
	var /*String*/		$xml_str;
	var /*int*/		$use_file;
	
	function CXMLDocument(/*...*/)
	{
		/*Array*/ $arglist = func_get_args();
		if( !count($arglist) ) {
			$this->xml_str = "";
			$this->file = "";
			$this->use_file = 0;
		} else if( count($arglist) < 2 ) {
			$this->xml_str = $arglist[0];
			$this->file = "";
			$this->use_file = 0;
		} else {
			$this->file = $arglist[0];
			$this->use_file = $arglist[1];
			if(file_exists($this->file)) {
				/*domxmlDocument*/ $d = domxml_open_file($this->file);
				if($d === false) { die("bad xml file"); }
			} else {
				die("bad file name or file does not exist");
			}
		}
		
		$this->CXMLParser();
	}
}

$xmlDoc = new CXMLDocument('
<columns>
	<date name="Adv_Plan">Advanced Planning Meeting</date>
	<date name="Rec_Pre_EUP">Receive Preliminary EUP</date>
	<date name="Start_Prel_FC">Start Preliminary Fuel Cycle</date>
	<date name="Send_Prel_FC">Send Preliminary Fuel Cycle</date>
	<date name="Rec_Rel_EUP">Receive Release EUP</date>
	<date name="Start_Rel_FC">Start Release Fuel Cycle</date>
	<date name="Send_Rel_FC" important="1">Send Release Fuel Cycle</date>
	<date name="Send_Start_Rep" important="1">Send Startup Report</date>
	<date name="Final_eWA">Final eWA</date>
	<date name="LicKickoff">Licensing Kickoff Meeting</date>
	<date name="Mini_Review">Mini Review</date>
	<date name="Send_Dft_FRED">Send Draft FRED</date>
	<date name="Send_Dft_OPL">Send Draft OPL</date>
	<date name="Rec_FRED">Receive Customer FRED</date>
	<date name="SEND_OPL" allcaps="1">Send Customer OPL3</date>
	<date name="Rec_OPL">Receive Customer OPL3</date>
	<date name="SEND_FRED" allcaps="1">Send Customer FRED</date>
	<date name="Download">Download to Manufacturing</date>
	<date name="RLP">RLP</date>
	<date name="SR1" important="1">SLMCPR</date>
	<date name="Send_Drf_SRLR">Send Draft SRLR</date>
	<date name="Rec_Comments_SRLR">Recieve Comments on SRLR</date>
	<date name="Send_Final_SRLR" important="1">Send Final SRLR</date>
	<date name="SEND_PREL_PCDB" allcaps="1">Send Preliminary PCDB</date>
	<date name="PCDB" important="1">Send Final PCDB</date>
	<date name="CMR" important="1">Send CMR</date>
	<date name="SEND_CMR1" allcaps="1">Send CMR Revision 1</date>
	<date name="SEND_CMR2" allcaps="1">Send CMR Revision 2</date>
	<date name="Rec_Cus_EOC">Receive Customer EOC Data</date>
	<date name="SR2">Special Requirements 1</date>
	<date name="SR3">Special Requirements 2</date>
	<date name="SR4">Special Requirements 3</date>
	<date name="SR5">Special Requirements 4</date>
	<date name="Eigen_Rev">Eigenvalue Review</date>
	<date name="FSDD">FSDD</date>
	<date name="Loca_TSD">LOCA TSD</date>
	<date name="Loca_Data">LOCA Interface Data</date>
	<date name="Loca_Start">Start LOCA Analysis</date>
	<date name="Loca_Fin">Complete LOCA Analysis</date>
	<date name="Stab_TSD">Stability TSD</date>
	<date name="Stab_Data">Stability Interface Data</date>
	<date name="Stab_Start">Start Stability Analysis</date>
	<date name="Stab_Fin">Complete Stability Analysis</date>
	<date name="Trans_TSD">Transient TSD</date>
	<date name="TRANS_DATA" allcaps="1">Transient Interface Data</date>
	<date name="Trans_Sel_Rev">Transient Selection Review</date>
	<date name="Trans_Start">Start Transient Analysis</date>
	<date name="Trans_Fin">Complete Transient Analysis</date>
	<date name="TRANS_WRAPUP" allcaps="1">Transient Wrapups</date>
	<date name="BNDL_REP" allcaps="1">Bundle Announcement Report</date>
	<date name="CMIT" allcaps="1">CMIT</date>
	<date name="SEND_OPL7" allcaps="1">Send OPL7</date>
	<date name="REC_OPL7" allcaps="1">Recieve Customer OPL7</date>
	<date name="RES_OPL7" allcaps="1">Resolved OPL7</date>
	<date name="SEND_OPL4" allcaps="1">Send Customer OPL4</date>
	<date name="REC_OPL4" allcaps="1">Receive Customer OPL4</date>
</columns>');
?>

 [2002-07-16 16:03 UTC] watford at uiuc dot edu
As another note, it dies AFTER it sends the data to the browser, and after it writes the data to STDOUT (if run via php.exe).
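
One way to narrow down a crash that happens after the output is complete is to log markers at the end of the script body and from a shutdown function. The sketch below is not from the report: `mark()` and the marker labels are hypothetical names, and it assumes the web server's error log (or STDERR under php.exe) is readable. If both markers appear and the process still faults, the failure is most likely in the engine's teardown after the script rather than in the script body.

```php
<?php
// Hypothetical helper (not from the report): record a marker and write it to
// the error log so the log shows how far the request got before any crash.
$GLOBALS['markers'] = array();

function mark($label)
{
    $GLOBALS['markers'][] = $label;
    error_log('marker: ' . $label);
}

// Registered now, called by PHP after the script body, during request shutdown.
register_shutdown_function('mark', 'shutdown function ran');

// ... the script under test goes here ...
echo "output sent\n";
flush();

mark('script body finished');
```
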
 [2002-07-16 16:05 UTC] watford at uiuc dot edu
I got ahold of the register values at that point, but I can't get the contents of memory:

 EAX = 150082D0 EBX = 0060E6C0 ECX = 00000000 EDX = 14C0FAE4
 ESI = 14C0FAB0 EDI = 0060E6C0 EIP = 006A6397 ESP = 14C0FAA0
 EBP = 14C0FBB4 EFL = 00000202 CS = 001B DS = 0023 ES = 0023 SS = 0023
 FS = 0038 GS = 0000 OV=0 UP=0 EI=1 PL=0 ZR=0 AC=0 PE=0 CY=0

 150082DA = ????

 ST0 = +0.00000000000000000e+0000 ST1 = +9.19248417784877260e+4083
 ST2 = +0.00000000000000000e+0000 ST3 = +0.00000000000000000e+0000
 ST4 = +0.00000000000000000e+0000 ST5 = +0.00000000000000000e+0000
 ST6 = -6.14161125457030721e+0005 ST7 = 1#QNAN                    
 CTRL = 027F STAT = 4020 TAGS = FFFF EIP = 00C8D934 CS = 001B DS = 0023
 EDO = 00CD1AC8
 [2002-07-16 16:15 UTC] jtate@php.net
Can you try this with the current CVS snapshot available from http://snaps.php.net?
 [2002-07-16 17:28 UTC] chregu@php.net
Instead of the latest devel snapshot you can also try the latest _stable_ snapshot (which will become 4.2.2 in the hopefully not so distant future). It is fixed there as well.

Please reopen it if the problem still persists.

chregu


 [2002-07-17 13:59 UTC] watford at uiuc dot edu
After installing the latest stable release from snaps I get this error when starting Apache2:

C:\Program Files\Apache Group\Apache2\bin>apache.exe -f ..\conf\httpd.conf -d ..\.
apache.exe: module "c:\php4build\snap\sapi\apache2filter\sapi_apache2.c" is not compatible with this version of Apache.
Please contact the vendor for the correct version.

I'm trying to test the newest version to see if it corrected the problem, but everything I've tried has failed to work :/
 [2002-07-17 14:02 UTC] sniper@php.net
Get Apache 1.3.26; Apache 2 is not stable for any kind of usage yet.

 [2002-07-17 14:40 UTC] watford at uiuc dot edu
Using Apache/1.3.26 and the latest release from the snaps, it still dies with the following messages:

The instruction at "0x10096397" referenced memory at "0x03e1813a". The memory could not be "written".
The instruction at "0x10096397" referenced memory at "0x03e181a2". The memory could not be "written".
The instruction at "0x10096397" referenced memory at "0x03e18072". The memory could not be "written".
Unhandled Exception in Apache.exe (PHP4TS.DLL): 0xC0000005: Access Violation.

Disassembly:
   1009E6C0   push        esi
   1009E6C1   mov         esi,dword ptr [esp+8]
   1009E6C5   mov         eax,dword ptr [esi]
-> 1009E6C7   dec         word ptr [eax+0Ah]
   1009E6CB   mov         eax,dword ptr [esi]
   1009E6CD   mov         cx,word ptr [eax+0Ah]
   1009E6D1   test        cx,cx

Registers:
 EAX = 03E18068 EBX = 00A97940 ECX = 00000000 EDX = 00CDFA10
 ESI = 00CDF9DC EDI = 00A97940 EIP = 1009E6C7 ESP = 00CDF9CC
 EBP = 00CDFAE0 EFL = 00000202 CS = 001B DS = 0023 ES = 0023 SS = 0023
 FS = 0038 GS = 0000 OV=0 UP=0 EI=1 PL=0 ZR=0 AC=0 PE=0 CY=0

 03E18072 = ????

 ST0 = +0.00000000000000000e+0000 ST1 = +0.70443418271368093e+4085
 ST2 = -0.00391334816692398e+4421 ST3 = +0.00000000000000000e+0000
 ST4 = +0.00000000000000000e+0000 ST5 = +0.00635391772729999e+4930
 ST6 = -2.09327663353178650e+0005 ST7 = 1#QNAN                    
 CTRL = 027F STAT = 4020 TAGS = FFFF EIP = 00B8D934 CS = 001B DS = 0023
 EDO = 00BD1AC8

Call Stack:
-> PHP4TS! 1009e6c7()
   PHP4TS! 1009ac4e()
   PHP4TS! 10002e35()
   PHP4APACHE! 60002be8()
   PHP4APACHE! 6000181b()
   PHP4APACHE! 600014ee()
 [2002-07-17 15:04 UTC] watford at uiuc dot edu
This script is from a previous bug report; it also causes PHP to crash. Every crash happens after the document has finished outputting, with the same errors, at the same place in the disassembly. Maybe it's that double-free error, but I have read that it was fixed.

<?php
	$doc = new_xmldoc( "1.0" );
	$root = $doc->add_root("document");
	for($i = 1; $i < 1000; $i++){
		$element = $doc->create_element("element");
		$element->set_content("content ".$i);
		$root->append_child($element);
	}
	$xml = $doc->dumpmem();
	echo htmlspecialchars($xml);
?>
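
For readers who want to run the same stress test on a current PHP, where the domxml extension no longer exists, a rough equivalent can be sketched with the bundled DOM extension. This is an assumption-laden translation, not the script from the report: `build_stress_document()` is a hypothetical helper name, and it prints only the length of the serialized document to keep the output short.

```php
<?php
// Assumption: PHP 5 or later with the DOM extension enabled.
// build_stress_document() is a hypothetical name used for illustration.
function build_stress_document($count)
{
    $doc = new DOMDocument('1.0');
    $root = $doc->appendChild($doc->createElement('document'));

    // Same loop bounds as the script above: elements 1 .. $count - 1.
    for ($i = 1; $i < $count; $i++) {
        // The second argument of createElement() becomes the text content.
        $root->appendChild($doc->createElement('element', 'content ' . $i));
    }

    return $doc;
}

$doc = build_stress_document(1000);
$xml = $doc->saveXML();
echo strlen($xml), " bytes of XML generated\n";
```
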
 [2002-07-17 16:05 UTC] chregu@php.net
Can someone verify this on Windows with stable-latest? And if there really is a problem, fix it? :)

I couldn't reproduce it on Linux; there was not even a memory leak like last time (tested with 100,000 appended children).

chregu
 [2002-07-17 16:35 UTC] jtate@php.net
I fixed this.  I'll verify against PHP4_2_0 though, just to make sure.  It could be something different.
 [2002-07-17 17:11 UTC] jtate@php.net
This bug has been fixed in CVS. You can grab a snapshot of the
CVS version at http://snaps.php.net/. In case this was a documentation 
problem, the fix will show up soon at http://www.php.net/manual/.
In case this was a PHP.net website problem, the change will show
up on the PHP.net site and on the mirror sites.
Thank you for the report, and for helping us make PHP better.

Yes, this is fixed.

<?php
	$doc = new_xmldoc( "1.0" );
	$root = $doc->add_root("document");
	for($i = 1; $i < 10000; $i++){
		$element = $doc->create_element("element");
		$element->set_content("content ".$i);
		$root->append_child($element);
	}
	$xml = $doc->dumpmem();
	echo htmlspecialchars($xml);
?>

Runs just fine.  I'm using Apache 1.3.24 with the CVS version of 4_2_0.  Also looks fixed in HEAD.

Remember that when you use a SNAPS version, you've got to copy the php_domxml.dll to the proper directory.  You were probably still testing against the one in 4.2.1.  Make sure that you've copied the file from the snapshot, restarted Apache, and if it still happens, e-mail me personally.
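
A quick way to check which build is actually answering requests after swapping files is to print the version information from a page served by Apache. The sketch below is only an illustration: `describe_xml_runtime()` is a hypothetical helper name, and the output cannot prove which php_domxml.dll file was loaded. It only reports the PHP version and, where `domxml_version()` is available, the XML library version the loaded extension reports.

```php
<?php
// Hypothetical helper: summarize the PHP build and XML extension in use.
function describe_xml_runtime()
{
    $info = array('php' => phpversion());

    if (function_exists('domxml_version')) {
        // PHP 4 with the domxml extension loaded.
        $info['domxml'] = domxml_version();
    } elseif (extension_loaded('dom')) {
        // PHP 5 and later ship the DOM extension instead of domxml.
        $info['dom'] = phpversion('dom');
    } else {
        $info['xml'] = 'no DOM extension loaded';
    }

    return $info;
}

foreach (describe_xml_runtime() as $name => $value) {
    echo $name, ': ', $value, "\n";
}
```

If the reported PHP version still reads 4.2.1 after installing a snapshot, the server is most likely still loading the old files.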
 