Bug #41296 Out of memory when parsing large files
Submitted: 2007-05-05 13:40 UTC
Modified: 2007-05-05 14:57 UTC
From: kiri at swol dot de
Assigned:
Status: Not a bug
Package: Performance problem
PHP Version: 5.2.2
OS: FreeBSD 6.1 i386
Private report: No
CVE-ID: None
 [2007-05-05 13:40 UTC] kiri at swol dot de
Description:
------------
I have a 73 MB XML file that I parse from the command line with the following script.

Reproduce code:
---------------
set_time_limit(600);
error_reporting(E_ALL);
ini_set('memory_limit', -1);

$ricxml2 = simplexml_load_file('path/to/file.xml');
$ricxml = xml2php($ricxml2);

/* Recursively convert a SimpleXMLElement into nested PHP arrays. */
function xml2php($xml) {
	$fils = 0;    /* number of children processed */
	$tab = false; /* true once a repeated key has been turned into an indexed array */
	$array = array();
	foreach ($xml->children() as $key => $value) {
		$child = xml2php($value);
		/* Copy the attributes into the child array */
		foreach ($value->attributes() as $ak => $av) {
			$child[$ak] = (string)$av;
		}
		/* Check whether this key is already in the array */
		if ($tab == false && in_array($key,array_keys($array))) {
			/* The key already exists: convert its entry into an indexed array */
			$tmp = $array[$key];
			$array[$key] = NULL;
			$array[$key][] = $tmp;
			$array[$key][] = $child;
			$tab = true;
		} elseif ($tab == true) {
			/* Append to an existing indexed array */
			$array[$key][] = $child;
		} else {
			/* Add a simple element */
			$array[$key] = $child;
		}
		$fils++;
	}
	return $array;
}


Expected result:
----------------
The XML structure converted to a PHP array. This worked with files smaller than 66 MB.

Actual result:
--------------
The error occurred on this line:
in_array($key,array_keys($array))


Fatal error: Out of memory (allocated 217317376) (tried to allocate 16 bytes) in parse_ric_simplexml_initial.php5 on line 38


All other programs and servers on the FreeBSD machine are stopped, so that 1.4 GB of RAM is available to the script.

last pid: 79586;  load averages:  0.65,  0.26,  0.18
26 processes:  2 running, 24 sleeping
CPU states: 42.2% user,  0.0% nice,  8.1% system,  0.0% interrupt, 49.7% idle
Mem: 506M Active, 383M Inact, 204M Wired, 75M Cache, 112M Buf, 332M Free
Swap: 3000M Total, 192K Used, 3000M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
79585 root        1 107    0   490M   487M CPU1   0   0:33 98.23% php5
  513 root        1  96    0  3552K  1400K select 0   0:32  0.00% nmbd
  434 root        1  96    0  3420K  2044K select 0   0:28  0.00% sendmail
37085 root        1   4    0  5440K  1836K sbwait 0   0:10  0.00% sshd


 [2007-05-05 14:57 UTC] iliaa@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

simplexml_load_file() uses a DOM parsing mechanism, which requires parsing
the entire file and building a DOM tree in memory. On a large XML file this
can take a lot of memory. I would recommend using XMLReader, which is a
pull parser, or the xml extension, which uses SAX. Neither requires the
entire file to be loaded into memory.
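
As a rough illustration of the XMLReader suggestion, the sketch below walks the file node by node and hands one record at a time to a callback, so only that record is expanded into a SimpleXMLElement. The helper name stream_records(), the callback name show_record() and the element name 'item' are assumptions made for this example; they do not come from the report, and the real record element depends on the layout of the XML file.

```php
<?php
// Hypothetical helper (not part of PHP): calls $callback once for every
// element named $recordName and returns how many records were seen.
function stream_records($path, $recordName, $callback)
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        throw new Exception("Cannot open $path");
    }
    $count = 0;
    // read() advances one node at a time instead of building the whole tree.
    while ($reader->read()) {
        if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == $recordName) {
            // readOuterXml() returns the markup of the current element only,
            // so just this one record is loaded into SimpleXML.
            $record = simplexml_load_string($reader->readOuterXml());
            call_user_func($callback, $record);
            $count++;
        }
    }
    $reader->close();
    return $count;
}

// Example callback (hypothetical): print the element name of each record.
function show_record($record)
{
    echo $record->getName(), "\n";
}

// Usage, with the placeholder path from the report and an assumed element name:
// stream_records('path/to/file.xml', 'item', 'show_record');
```

Each record could also be passed to xml2php() from the report inside the callback, which keeps the conversion logic but limits it to one small subtree at a time. Memory use then depends on the size of the largest record rather than on the size of the file, provided the callback does not accumulate the results in a single array.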
 
Last updated: Sun Aug 09 09:01:24 2020 UTC