Bug #17114 xml_parse_into_struct takes too long, sometimes crashes IIS
Submitted: 2002-05-09 07:24 UTC
Modified: 2005-09-22 21:43 UTC
From: bbisgod at hotmail dot com
Assigned:
Status: Not a bug
Package: XML related
PHP Version: 4.2.0
OS: Windows NT4
Private report: No
CVE-ID: None
 [2002-05-09 07:24 UTC] bbisgod at hotmail dot com
Running PHP 4.2.0 on IIS on an NT4 server.

I am trying to parse large XML files with PHP. When I run the code on a smaller 66 KB file, it runs with the desired effect. But when I run it on a large file (around 2 MB), xml_parse_into_struct takes a very long time. Also, the PHP script sometimes makes the memory usage of IIS spiral out of control, and the CPU usage maxes out.

Please help, or tell me what's wrong.

Thanks in advance.

INDEX.PHP
---------
<?
include("func.php"); 
$time = getmicrotime();
?>
<HTML>
<BODY>
<FORM method=post action="index.php" enctype="multipart/form-data">
<INPUT type=file name="XML"><BR>
<SELECT size=10 name="folderfile">
<?
$handle=opendir("../xml/");
while ($file = readdir($handle)) {
  if ($file != "." && $file != "..") {
    echo "<option>$file";
  }
}
closedir($handle);
?>
</SELECT>
<INPUT type=submit value='Convert'>
</FORM>
<?
$XML = "";
if (is_uploaded_file($_FILES['XML']['tmp_name'])) {
  $fp = fopen($_FILES['XML']['tmp_name'],"r") or DIE("Error opening XML");
  $XML .= fread($fp,filesize($_FILES['XML']['tmp_name']));
  fclose($fp);
} else {
  if ($_POST['folderfile'] != "") {
    $fp = fopen("../xml/".$_POST['folderfile'],"r") or DIE("Error opening XML");
    $XML .= fread($fp,filesize("../xml/".$_POST['folderfile']));
    fclose($fp);
  }
}
if ($XML != "") {
  echo "Time to read: ".round(getmicrotime()-$time,2)."<BR>";
  $time = getmicrotime();
  if (preg_match("/^<\?xml/i",$XML)) {
    // Lazy match, so only the XML declaration itself is stripped
    $XML = preg_replace("/^<\?xml.+?\?>/","",$XML);
  } else {
    die("Invalid XML format");
  }
  $tree = GetXMLTree($XML);
  unset($XML);
  echo "Time to array: ".round(getmicrotime()-$time,2)."<BR>";
  $time = getmicrotime();
  if (!$tree) {
    die("Invalid XML (Err: Error parsing XML)");
  }
  $path = array_shift($tree);
  array_shift($tree);
  array_shift($tree);
  array_shift($tree);
  foreach($tree as $key => $val) {
    if (!is_array($tree[$key])) {
      DIE("Invalid XML (Err: Loop page)");
    }
    parse_page($tree[$key], $path);
  }
  unset($tree);
  echo "Time to parse: ".round(getmicrotime()-$time,2)."<BR>";
}
?>
</BODY>
</HTML>

--EOF--

FUNC.PHP
--------
<?
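// Recursively builds a nested array from the flat $vals list produced by
// xml_parse_into_struct(). $i is advanced by reference as entries are consumed.
function GetChildren($vals, &$i) {
  $children = array();
  // 'value' is absent on elements without text, so test with empty()
  if (!empty($vals[$i]['value']))
    array_push($children, $vals[$i]['value']);
  $total = count($vals);
  while (++$i < $total) {
    switch ($vals[$i]['type']) {
      case 'cdata':
        array_push($children, $vals[$i]['value']);
      break;

      case 'complete':
        // Empty elements such as <tag/> carry no 'value' key
        $children[$vals[$i]['tag']] = isset($vals[$i]['value']) ? $vals[$i]['value'] : null;
      break;

      case 'open':
        $children[] = GetChildren($vals,$i);
      break;

      case 'close':
        return $children;
    }
  }
  // Reached only when the list ends without a matching 'close' entry,
  // e.g. for invalid XML; the caller treats this as a parse failure.
  return null;
}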

function GetXMLTree($XML) {
  echo "Creating parser... ";
  $p = xml_parser_create();
  echo "Setting enctype... ";
  xml_parser_set_option($p, XML_OPTION_TARGET_ENCODING, "UTF-8");
  echo "Parsing... ";
  xml_parse_into_struct($p, stripslashes($XML), $vals, $index);
  echo "Closing parser... ";
  xml_parser_free($p);
  unset($index);
  $i = 0;
  return GetChildren($vals, $i);
}

function getmicrotime() {
  list($usec, $sec) = explode(" ",microtime()); 
  return ((float)$usec + (float)$sec); 
}

function parse_page($page, $folder) {
  $file = $folder."\\".array_shift($page).".LDF<BR>";
  echo "<PRE>";
  print_r($page);
  echo "</PRE>";
}
?>

--EOF--
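
For files in the megabyte range, one option that may help with the memory growth described above is the event-based side of the same XML extension, which avoids holding the whole document and the full xml_parse_into_struct array in memory at once. The following is only a sketch under assumptions: it targets PHP 7.1 or later (not the 4.2.0 in this report), and `countElementsStreaming`, its parameters, and the 8192-byte chunk size are hypothetical names and values chosen for illustration. It feeds the file to xml_parse() in chunks and simply counts element names.

```php
<?php
// Hypothetical helper (not part of the original func.php): streams an XML
// file through the event-based parser and counts how often each element occurs.
function countElementsStreaming(string $path, int $chunkBytes = 8192): array
{
    $counts = [];
    $parser = xml_parser_create();

    // Start-tag handler increments the counter; the end-tag handler is a no-op.
    // Tag names arrive upper-cased because case folding is on by default.
    xml_set_element_handler(
        $parser,
        function ($parser, string $name, array $attrs) use (&$counts): void {
            $counts[$name] = ($counts[$name] ?? 0) + 1;
        },
        function ($parser, string $name): void {
        }
    );

    $fp = fopen($path, 'rb');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path");
    }

    try {
        // Feed the document piece by piece; the parser keeps state between calls.
        while (!feof($fp)) {
            $chunk = fread($fp, $chunkBytes);
            if ($chunk === false) {
                throw new RuntimeException("Read error on $path");
            }
            if ($chunk !== '' && !xml_parse($parser, $chunk, false)) {
                throw new RuntimeException(
                    xml_error_string(xml_get_error_code($parser))
                    . ' at line ' . xml_get_current_line_number($parser)
                );
            }
        }
        // An empty final call tells the parser that the document is complete.
        if (!xml_parse($parser, '', true)) {
            throw new RuntimeException(
                xml_error_string(xml_get_error_code($parser))
                . ' at line ' . xml_get_current_line_number($parser)
            );
        }
    } finally {
        fclose($fp);
    }

    return $counts;
}
```

A real replacement for GetXMLTree would build its tree inside the handlers instead of counting, but the chunked read loop would stay the same.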

 [2002-05-09 08:03 UTC] bbisgod at hotmail dot com
Apologies, I withdraw my bug report.

The problem was two things:
a) script timeouts
b) invalid XML
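
Both causes can be made visible in the script itself rather than showing up as a hang. A minimal sketch, assuming PHP 7.1 or later and a host that permits set_time_limit(); `parseXmlOrExplain` and the 120-second default are hypothetical, not part of the code posted above. It raises the execution time limit before parsing and checks the return value of xml_parse_into_struct(), which the original GetXMLTree ignores.

```php
<?php
// Hypothetical helper: parses a string and reports why parsing failed, if it did.
function parseXmlOrExplain(string $xml, int $timeLimitSeconds = 120): array
{
    // Assumption: 120 seconds is an arbitrary illustrative limit for large files.
    set_time_limit($timeLimitSeconds);

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');

    $values = [];
    $index = [];
    // Returns 0 on failure and 1 on success.
    $ok = xml_parse_into_struct($parser, $xml, $values, $index);

    if (!$ok) {
        // Turn the parser's error code and position into a readable message.
        $message = sprintf(
            '%s at line %d',
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)
        );
        return ['values' => [], 'error' => $message];
    }

    return ['values' => $values, 'error' => null];
}
```

A caller would test `$result['error'] !== null` and stop with that message before walking `$result['values']`, so that invalid XML is reported immediately instead of producing a partial tree.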
 