Bug #41296 Out of memory when parsing large files
Submitted: 2007-05-05 13:40 UTC Modified: 2007-05-05 14:57 UTC
From: kiri at swol dot de Assigned:
Status: Not a bug Package: Performance problem
PHP Version: 5.2.2 OS: FreeBSD 6.1 i386
Private report: No CVE-ID: None
 [2007-05-05 13:40 UTC] kiri at swol dot de
Description:
------------
I have a 73 MB XML file that I parse from the command line with the following script.

Reproduce code:
---------------
set_time_limit(600);
error_reporting(E_ALL);
ini_set('memory_limit', -1);

$ricxml2 = simplexml_load_file('path/to/file.xml');
$ricxml = xml2php($ricxml2);

/* Recursively convert a SimpleXMLElement into a nested PHP array */
function xml2php($xml) {
	$fils = 0;     /* number of children processed so far */
	$tab = false;  /* true once a repeated key has been turned into an indexed array */
	$array = array();
	foreach ($xml->children() as $key => $value) {
		$child = xml2php($value);
		/* Copy the attributes into the child array */
		foreach ($value->attributes() as $ak => $av) {
			$child[$ak] = (string)$av;
		}
		/* Check whether this key is already in the array */
		if ($tab == false && in_array($key,array_keys($array))) {
			/* The key already exists, so turn its entry into an indexed array */
			$tmp = $array[$key];
			$array[$key] = NULL;
			$array[$key][] = $tmp;
			$array[$key][] = $child;
			$tab = true;
		} elseif ($tab == true) {
			/* Append an element to an existing indexed array */
			$array[$key][] = $child;
		} else {
			/* Add a simple element */
			$array[$key] = $child;
		}
		$fils++;
	}
	return $array;
}


Expected result:
----------------
Just the converted XML structure. It worked with files below 66 MB.

Actual result:
--------------
The error occurred on this line:
in_array($key,array_keys($array))


Fatal error: Out of memory (allocated 217317376) (tried to allocate 16 bytes) in parse_ric_simplexml_initial.php5 on line 38
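
A side note on the line named above: array_keys() builds a fresh temporary array of all keys on every iteration, and in_array() then scans it linearly. The sketch below is not part of the original script. It assumes, as in the script, that the keys are XML element names, and it uses a made-up sample array. Under that assumption array_key_exists() gives the same answer without the temporary copy. This only removes one source of allocations. It does not reduce the memory held by the parsed document itself.

```php
<?php
// Hypothetical sample standing in for the partially built result array.
$array = array('title' => array(), 'author' => array());
$key = 'author';

// Check as written in the script: copies all keys, then scans the copy.
$viaKeys = in_array($key, array_keys($array));

// Direct lookup in the array's hash table: no temporary array is created.
$viaLookup = array_key_exists($key, $array);

// A key that is not present yields false.
$missing = array_key_exists('publisher', $array);
```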


All other programs and servers on the FreeBSD machine were stopped, so 1.4 GB of RAM is available to the script.

last pid: 79586;  load averages:  0.65,  0.26,  0.18
26 processes:  2 running, 24 sleeping
CPU states: 42.2% user,  0.0% nice,  8.1% system,  0.0% interrupt, 49.7% idle
Mem: 506M Active, 383M Inact, 204M Wired, 75M Cache, 112M Buf, 332M Free
Swap: 3000M Total, 192K Used, 3000M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
79585 root        1 107    0   490M   487M CPU1   0   0:33 98.23% php5
  513 root        1  96    0  3552K  1400K select 0   0:32  0.00% nmbd
  434 root        1  96    0  3420K  2044K select 0   0:28  0.00% sendmail
37085 root        1   4    0  5440K  1836K sbwait 0   0:10  0.00% sshd


 [2007-05-05 14:57 UTC] iliaa@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

simplexml_load_file() uses a DOM parsing mechanism, which requires parsing
the entire file and building a DOM tree. On a large XML file this can take
a lot of memory. I would recommend using XMLReader, which is a pull parser,
or the xml extension, which uses SAX. Neither requires the entire file to
be loaded into memory.
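
As a rough illustration of the pull-parser approach recommended above, here is a minimal sketch using XMLReader. The element name "item", the sample document, and the helper stream_elements() are all hypothetical, since the report does not show the real file's structure. The sketch also assumes PHP 5.3 or later (for closures) and that record elements are not nested inside one another.

```php
<?php
// Hypothetical sample data written to a temporary file for the demonstration.
$path = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents(
    $path,
    '<root><item id="1"><name>a</name></item><item id="2"><name>b</name></item></root>'
);

/**
 * Stream the file node by node and call $callback once per <$tag> element.
 * Only the subtree of the current element is turned into a SimpleXMLElement,
 * so the whole document is never held in memory as a tree.
 */
function stream_elements($path, $tag, $callback) {
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new RuntimeException("Cannot open $path");
    }
    $count = 0;
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === $tag) {
            // readOuterXml() returns the element's markup without moving the cursor.
            $callback(new SimpleXMLElement($reader->readOuterXml()));
            $count++;
        }
    }
    $reader->close();
    return $count;
}

// Usage: collect the id attribute of every <item>.
$ids = array();
$count = stream_elements($path, 'item', function ($item) use (&$ids) {
    $ids[] = (string)$item['id'];
});
unlink($path);
```

Each record can be converted or stored inside the callback and then discarded, so peak memory depends on the largest single record and not on the size of the file.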
 