Bug #17114 xml_parse_into_struct takes too long, sometimes crashes IIS
Submitted: 2002-05-09 07:24 UTC Modified: 2005-09-22 21:43 UTC
From: bbisgod at hotmail dot com Assigned:
Status: Not a bug Package: XML related
PHP Version: 4.2.0 OS: Windows NT4
Private report: No CVE-ID: None
 [2002-05-09 07:24 UTC] bbisgod at hotmail dot com
Running PHP4.2.0 on IIS on NT4 server.

I am trying to parse large XML files with PHP.  When I run the code on a smaller 66 KB file, it runs with the desired effect.  But when I try to run it on a large file (around 2 MB), xml_parse_into_struct takes a very long time.  Also, the PHP script sometimes makes the memory usage of IIS spiral uncontrollably and maxes out the CPU.

Please help, or tell me what's wrong.

Thanks in advance

INDEX.PHP
---------
<?php
include("func.php"); 
$time = getmicrotime();
?>
<HTML>
<BODY>
<FORM method=post action="index.php" enctype="multipart/form-data">
<INPUT type=file name="XML"><BR>
<SELECT size=10 name="folderfile">
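<?php
// List the available XML files; skip the "." and ".." entries.
$handle = opendir("../xml/") or die("Error opening XML folder");
// Compare with !== false so a file named "0" does not end the loop early.
while (($file = readdir($handle)) !== false) {
  if ($file != "." && $file != "..") {
    echo "<option>".htmlspecialchars($file)."</option>";
  }
}
closedir($handle);
?>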
</SELECT>
<INPUT type=submit value='Convert'>
</FORM>
<?php
$XML = "";
if (isset($_FILES['XML']) && is_uploaded_file($_FILES['XML']['tmp_name'])) {
  $fp = fopen($_FILES['XML']['tmp_name'],"r") or DIE("Error opening XML");
  $XML .= fread($fp,filesize($_FILES['XML']['tmp_name']));
  fclose($fp);
} else {
  if (isset($_POST['folderfile']) && $_POST['folderfile'] != "") {
    // basename() keeps the request from escaping ../xml/ via "../" segments.
    $name = "../xml/".basename($_POST['folderfile']);
    $fp = fopen($name,"r") or DIE("Error opening XML");
    $XML .= fread($fp,filesize($name));
    fclose($fp);
  }
}
if ($XML != "") {
  echo "Time to read: ".round(getmicrotime()-$time,2)."<BR>";
  $time = getmicrotime();
  if (preg_match("/^<\?xml/i",$XML)) {
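    // Non-greedy match: strip only the XML declaration, not everything
    // up to the last "?>" on the first line.
    $XML = preg_replace("/^<\?xml.+?\?>/i","",$XML);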
  } else {
    die("Invalid XML format");
  }
  $tree = GetXMLTree($XML);
  unset($XML);
  echo "Time to array: ".round(getmicrotime()-$time,2)."<BR>";
  $time = getmicrotime();
  if (!$tree) {
    die("Invalid XML (Err: Error parsing XML)");
  }
  $path = array_shift($tree);
  array_shift($tree);
  array_shift($tree);
  array_shift($tree);
  foreach($tree as $key => $val) {
    if (!is_array($tree[$key])) {
      DIE("Invalid XML (Err: Loop page)");
    }
    parse_page($tree[$key], $path);
  }
  unset($tree);
  echo "Time to parse: ".round(getmicrotime()-$time,2)."<BR>";
}
?>
</BODY>
</HTML>

--EOF--

FUNC.PHP
--------
<?php
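// Builds a nested array from the flat list produced by xml_parse_into_struct().
// $i is passed by reference so recursive calls advance the shared position.
function GetChildren($vals, &$i) {
  $children = array();
  // isset() avoids an undefined-index notice and keeps values such as "0".
  if (isset($vals[$i]['value']) && $vals[$i]['value'] !== "")
    array_push($children, $vals[$i]['value']);
  $total = count($vals);
  while (++$i < $total) {
    switch ($vals[$i]['type']) {
      case 'cdata':
        array_push($children, $vals[$i]['value']);
      break;

      case 'complete':
        // Empty elements have no 'value' key.
        $children[$vals[$i]['tag']] = isset($vals[$i]['value']) ? $vals[$i]['value'] : null;
      break;

      case 'open':
        $children[] = GetChildren($vals,$i);
      break;

      case 'close':
        return $children;
    }
  }
  // Reached only when the input ends without a matching close tag.
  return $children;
}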

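// Parses $XML and returns the nested array, or false if the XML is not well formed.
function GetXMLTree($XML) {
  echo "Creating parser... ";
  $p = xml_parser_create();
  echo "Setting enctype... ";
  xml_parser_set_option($p, XML_OPTION_TARGET_ENCODING, "UTF-8");
  echo "Parsing... ";
  // No stripslashes(): data read with fread() is not slash-escaped by
  // default, and stripping backslashes would alter the document.
  $ok = xml_parse_into_struct($p, $XML, $vals, $index);
  if (!$ok) {
    echo "Parse error: ".xml_error_string(xml_get_error_code($p))
        ." at line ".xml_get_current_line_number($p)." ";
  }
  echo "Closing parser... ";
  xml_parser_free($p);
  if (!$ok) return false;
  unset($index);
  $i = 0;
  return GetChildren($vals, $i);
}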

function getmicrotime() {
  list($usec, $sec) = explode(" ",microtime()); 
  return ((float)$usec + (float)$sec); 
}

function parse_page($page, $folder) {
  $file = $folder."\\".array_shift($page).".LDF<BR>";
  echo "<PRE>";
  print_r($page);
  echo "</PRE>";
}
?>

--EOF--
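
For files in the 2 MB range, one possible alternative to building the whole structure with xml_parse_into_struct is to feed the parser in chunks with xml_parse and react to elements as they arrive. The sketch below is only an illustration under assumptions: the function name count_elements_streaming and the 8192-byte default chunk size are hypothetical, and the closures need PHP 5.3 or later, so it would not run unchanged on PHP 4.2.0.

```php
<?php
// Hypothetical helper (not part of the original report): count the
// elements in an XML file without loading the whole file into memory.
// Returns the element count, or false if the file cannot be read or
// the XML is not well formed.
function count_elements_streaming($path, $chunkSize = 8192) {
    $count = 0;
    $p = xml_parser_create();
    xml_set_element_handler(
        $p,
        // Start-tag handler: called once per opening (or empty) element.
        function ($parser, $name, $attrs) use (&$count) { $count++; },
        // End-tag handler: nothing to do in this sketch.
        function ($parser, $name) {}
    );

    $fp = fopen($path, 'rb');
    if (!$fp) {
        return false;
    }

    $ok = true;
    while ($ok && !feof($fp)) {
        $chunk = fread($fp, $chunkSize);
        if ($chunk === false) {
            $ok = false;
            break;
        }
        // The third argument tells the parser whether this is the last chunk.
        $ok = (xml_parse($p, $chunk, feof($fp)) === 1);
    }
    fclose($fp);

    return $ok ? $count : false;
}
```

Because only one chunk is held at a time, memory use should stay roughly flat regardless of file size; whether that helps with the IIS behaviour described above is untested here.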

 [2002-05-09 08:03 UTC] bbisgod at hotmail dot com
Apologies, I withdraw my bug report.

The problem came down to two things:
a) script timeouts
b) invalid XML
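
Both causes can be made visible in the script itself. The following is a minimal sketch, not the reporter's actual fix: parse_xml_checked is a hypothetical name, and the 120-second limit is an arbitrary assumption. It raises the execution limit with set_time_limit and checks the return value of xml_parse_into_struct, so that invalid XML produces an error message instead of a partial structure.

```php
<?php
// Hypothetical helper (not from the original report). Returns an array
// with 'ok' (bool), 'vals' (the parsed structure) and 'error'
// (message string, or null on success).
function parse_xml_checked($xml, $timeLimit = 120) {
    // Cause (a): give long parses more time than the default limit.
    // 120 seconds is an assumed value; adjust for the real workload.
    set_time_limit($timeLimit);

    $p = xml_parser_create();
    $vals = array();
    $index = array();

    // Cause (b): xml_parse_into_struct() returns 1 on success and 0 on
    // failure, so the result has to be checked explicitly.
    $ok = xml_parse_into_struct($p, $xml, $vals, $index);

    $result = array('ok' => ($ok === 1), 'vals' => $vals, 'error' => null);
    if ($ok !== 1) {
        $result['error'] = sprintf(
            '%s at line %d',
            xml_error_string(xml_get_error_code($p)),
            xml_get_current_line_number($p)
        );
    }
    // The parser is released when $p goes out of scope.
    return $result;
}

// Usage: a mismatched closing tag is reported rather than ignored.
$r = parse_xml_checked('<a><b>1</a>');
if (!$r['ok']) {
    echo "Invalid XML: ", $r['error'], "\n";
}
```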
 