Bug #17114 xml_parse_into_struct takes too long, sometimes crashes IIS
Submitted: 2002-05-09 07:24 UTC Modified: 2005-09-22 21:43 UTC
From: bbisgod at hotmail dot com Assigned:
Status: Not a bug Package: XML related
PHP Version: 4.2.0 OS: Windows NT4
Private report: No CVE-ID: None
 [2002-05-09 07:24 UTC] bbisgod at hotmail dot com
Running PHP 4.2.0 on IIS on an NT4 server.

I am trying to parse large XML files with PHP. When I run the code on a smaller 66 KB file, it runs with the desired effect. But when I run it on a large file (around 2 MB), it spends a very long time in xml_parse_into_struct. Sometimes the PHP script also makes the memory usage of IIS grow uncontrollably while CPU usage is maxed out.

Please help, or tell me what's wrong.

Thanks in advance

INDEX.PHP
---------
<?
include("func.php"); 
$time = getmicrotime();
?>
<HTML>
<BODY>
<FORM method=post action="index.php" enctype="multipart/form-data">
<INPUT type=file name="XML"><BR>
<SELECT size=10 name="folderfile">
<?
$handle=opendir("../xml/");
while ($file = readdir($handle)) {
  if ($file != "." && $file != "..") {
    echo "<option>".htmlspecialchars($file)."</option>";
  }
}
closedir($handle);
?>
</SELECT>
<INPUT type=submit value='Convert'>
</FORM>
<?
$XML = "";
if (isset($_FILES['XML']) && is_uploaded_file($_FILES['XML']['tmp_name'])) {
  $fp = fopen($_FILES['XML']['tmp_name'],"r") or DIE("Error opening XML");
  $XML .= fread($fp,filesize($_FILES['XML']['tmp_name']));
  fclose($fp);
} else {
  if (isset($_POST['folderfile']) && $_POST['folderfile'] != "") {
    // basename() keeps the posted name from pointing outside ../xml/
    $src = "../xml/".basename($_POST['folderfile']);
    $fp = fopen($src,"r") or DIE("Error opening XML");
    $XML .= fread($fp,filesize($src));
    fclose($fp);
  }
}
if ($XML != "") {
  echo "Time to read: ".round(getmicrotime()-$time,2)."<BR>";
  $time = getmicrotime();
  if (preg_match("/^<\?xml/i",$XML)) {
    // Non-greedy match so only the XML declaration itself is stripped
    $XML = preg_replace("/^<\?xml.+?\?>/","",$XML);
  } else {
    die("Invalid XML format");
  }
  $tree = GetXMLTree($XML);
  unset($XML);
  echo "Time to array: ".round(getmicrotime()-$time,2)."<BR>";
  $time = getmicrotime();
  if (!$tree) {
    die("Invalid XML (Err: Error parsing XML)");
  }
  $path = array_shift($tree);
  array_shift($tree);
  array_shift($tree);
  array_shift($tree);
  foreach($tree as $key => $val) {
    if (!is_array($tree[$key])) {
      DIE("Invalid XML (Err: Loop page)");
    }
    parse_page($tree[$key], $path);
  }
  unset($tree);
  echo "Time to parse: ".round(getmicrotime()-$time,2)."<BR>";
}
?>
</BODY>
</HTML>

--EOF--

FUNC.PHP
--------
<?
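// Recursively converts the flat $vals array from xml_parse_into_struct()
// into a nested array, starting at the "open" element at index $i.
function GetChildren($vals, &$i) {
  $children = array();
  // 'value' is absent when the element has no text, so test with empty()
  if (!empty($vals[$i]['value']))
    array_push($children, $vals[$i]['value']);
  $count = count($vals);
  while (++$i < $count) {
    switch ($vals[$i]['type']) {
      case 'cdata':
        array_push($children, $vals[$i]['value']);
      break;
      
      case 'complete':
        // Empty elements such as <a/> carry no 'value' key
        $children[$vals[$i]['tag']] = isset($vals[$i]['value']) ? $vals[$i]['value'] : null;
      break;

      case 'open':
        $children[] = GetChildren($vals,$i);
      break;

      case 'close':
        return $children;
    }
  }
  // Reached only when a closing tag is missing (malformed input)
  return null;
}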

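function GetXMLTree($XML) {
  echo "Creating parser... ";
  $p = xml_parser_create();
  echo "Setting enctype... ";
  xml_parser_set_option($p, XML_OPTION_TARGET_ENCODING, "UTF-8");
  echo "Parsing... ";
  // xml_parse_into_struct() returns 0 on failure; report why instead of continuing
  if (!xml_parse_into_struct($p, stripslashes($XML), $vals, $index)) {
    echo "XML error: ".xml_error_string(xml_get_error_code($p))." at line ".xml_get_current_line_number($p);
    xml_parser_free($p);
    return false;
  }
  echo "Closing parser... ";
  xml_parser_free($p);
  unset($index);
  $i = 0;
  return GetChildren($vals, $i);
}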

function getmicrotime() {
  list($usec, $sec) = explode(" ",microtime()); 
  return ((float)$usec + (float)$sec); 
}

function parse_page($page, $folder) {
  $file = $folder."\\".array_shift($page).".LDF<BR>";
  echo "<PRE>";
  print_r($page);
  echo "</PRE>";
}
?>

--EOF--
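
For inputs in the 2 MB range described above, one alternative to building the full array with xml_parse_into_struct is to feed the file to xml_parse in chunks and react to elements as they arrive, so the whole document is never held in memory at once. The following is only a minimal sketch under that assumption: the names count_elements, on_start and on_end and the 8192-byte chunk size are hypothetical and not part of the original report, and it merely counts elements rather than building the tree the scripts above need.

```php
<?php
// Hypothetical example: tally element names while streaming the file
// through xml_parse() instead of loading it in one piece.
$counts = array();

// Start-tag handler: names arrive upper-cased (case folding is on by default).
function on_start($parser, $name, $attrs) {
    global $counts;
    $counts[$name] = isset($counts[$name]) ? $counts[$name] + 1 : 1;
}

// End-tag handler: nothing to do for a simple count.
function on_end($parser, $name) {
}

// Returns an array of name => count, or false on I/O or parse failure.
function count_elements($path) {
    global $counts;
    $counts = array();
    $fp = fopen($path, "rb");
    if (!$fp) {
        return false;
    }
    $p = xml_parser_create();
    xml_set_element_handler($p, "on_start", "on_end");
    $ok = true;
    while ($ok && !feof($fp)) {
        $chunk = fread($fp, 8192);            // assumed chunk size
        $ok = xml_parse($p, $chunk, feof($fp)); // third argument marks the final chunk
    }
    fclose($fp);
    xml_parser_free($p);
    return $ok ? $counts : false;
}
```

Whether this helps depends on what the page needs from the document; if the complete tree is required, the memory cost of the nested array remains.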

 [2002-05-09 08:03 UTC] bbisgod at hotmail dot com
Apologies, I withdraw my bug report.

The problem turned out to be two things:
a) script timeouts
b) invalid XML
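
Both causes can be checked for in the script itself. The sketch below is an illustration, not the reporter's actual fix: parse_or_explain is a hypothetical helper, and set_time_limit(0) assumes the host allows the limit to be changed (it has no effect under safe mode).

```php
<?php
// Hypothetical helper: parses $xml and returns array($vals, null) on
// success, or array(null, $message) when the document is not well-formed.
function parse_or_explain($xml) {
    $p = xml_parser_create();
    $ok = xml_parse_into_struct($p, $xml, $vals, $index);
    $error = null;
    if (!$ok) {
        $error = sprintf("%s at line %d",
            xml_error_string(xml_get_error_code($p)),
            xml_get_current_line_number($p));
    }
    xml_parser_free($p);
    return array($ok ? $vals : null, $error);
}

// Cause (a): lift the default 30-second execution limit before a long parse.
set_time_limit(0);

// Cause (b): a well-formed document parses, a mismatched tag is reported.
list($vals, $error) = parse_or_explain("<a><b>text</b></a>");
list($bad, $badError) = parse_or_explain("<a><b>text</a>");
```

Here $error stays null for the first document, while $badError carries the parser's message and line number for the second.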
 