Bug #24002 Nested calls to xml_parse no longer work
Submitted: 2003-06-03 16:14 UTC Modified: 2003-09-23 16:44 UTC
Votes:2
Avg. Score:4.5 ± 0.5
Reproduced:2 of 2 (100.0%)
Same Version:2 (100.0%)
Same OS:1 (50.0%)
From: derek at hostopia dot com Assigned:
Status: Not a bug Package: XML related
PHP Version: 4.3.3RC5-dev, 5.0.0b2-dev OS: Linux
Private report: No CVE-ID: None
 [2003-06-03 16:14 UTC] derek at hostopia dot com
PHP versions up to and including 4.2.2 supported calling xml_parse from within an xml element/data handler, but when tested with version 4.3.2, this functionality produces unexpected results.

Sometimes the error 'xml processing instruction not at start of external entity' occurs, but most of the time the xml parser will get stuck in an endless loop.

A rather massive PHP application makes use of this feature, and we currently do not have a work-around.

Basically we need XML elements to be able to give dynamic XML content to the XML parser.

This was working fine up until now, and is quite important.

Is there a "better way" to accomplish this if in fact this use of xml_parse is unorthodox?

For example, this XML-based code:

<SCREEN>
  <INFO>This will render a random surprise shape</INFO>
  <RANDOM shapes="SQUARE, TRIANGLE, CIRCLE"/>
</SCREEN>

Here the element handler for "RANDOM" gives a random XML element to the parser, for example <SQUARE width="5" height="5"/>

 [2003-06-03 22:16 UTC] sniper@php.net
Please provide a short but complete example script.

 [2003-06-04 13:43 UTC] derek at hostopia dot com
Here as requested is an example which works fine under 4.2.2, and causes an endless loop in 4.3.2:


<!-- BEGIN XML FILE:  shapes.xml -->

<SCREEN>
  <INFO>This will render a random surprise shape</INFO>
  <RANDOM shapes="SQUARE TRIANGLE CIRCLE"/>
</SCREEN>

<!-- END XML FILE -->


### CUT HERE ###


<!-- BEGIN PHP FILE:  shapes.php -->

<?php
$file = "shapes.xml";

function startElement($parser, $name, $attribs) {
    switch ($name)
    {
        case "RANDOM":
            $list = explode(" ", $attribs["SHAPES"]);
            $num = count($list);
            $rnd = rand(1, $num) - 1;
            $xml = "<" . $list[$rnd] . "/>";
            if ( !xml_parse($parser, $xml) )
            {
                print xml_error_string(xml_get_error_code($parser));
            }
            break;
        case "SQUARE":
            print "\n       ################\n";
            print "       ################\n";
            print "       ################\n";
            print "       ################\n";
            print "       ################\n";
            print "       ################\n";
            print "       ################\n";
            print "       ################\n";
            break;
        case "TRIANGLE":
            print "\n              ##       \n";
            print "             ####      \n";
            print "            ######     \n";
            print "           ########    \n";
            print "          ##########   \n";
            print "         ############  \n";
            print "        ############## \n";
            print "       ################\n";
            break;
        case "CIRCLE":
            print "\n           ########    \n";
            print "         ############  \n";
            print "        ############## \n";
            print "        ############## \n";
            print "        ############## \n";
            print "        ############## \n";
            print "         ############  \n";
            print "           ########    \n";
            break;
    }
}
function endElement($parser, $name) {
}

function characterData($parser, $data) {
    print "<b>$data</b>";
}

function defaultHandler($parser, $data) {
    if (substr($data, 0, 1) == "&" && substr($data, -1, 1) == ";") {
        printf('<font color="#aa00aa">%s</font>',
            htmlspecialchars($data));
    } else {
        printf('<font size="-1">%s</font>',
            htmlspecialchars($data));
    }
}

function new_xml_parser($file) {
    global $parser_file;

    $xml_parser = xml_parser_create();
    xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, 1);
    xml_set_element_handler($xml_parser, "startElement", "endElement");
    xml_set_character_data_handler($xml_parser, "characterData");
    xml_set_default_handler($xml_parser, "defaultHandler");

    if (!($fp = @fopen($file, "r"))) {
        return false;
    }
    if (!is_array($parser_file)) {
        settype($parser_file, "array");
    }
    $parser_file[$xml_parser] = $file;
    return array($xml_parser, $fp);
}

if (!(list($xml_parser, $fp) = new_xml_parser($file))) {
    die("could not open XML input");
}

print "<pre>";
while ($data = fread($fp, 4096)) {
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        die(sprintf("XML error: %s at line %d\n",
            xml_error_string(xml_get_error_code($xml_parser)),
            xml_get_current_line_number($xml_parser)));
    }
}
print "</pre>";
print "parse complete\n";
xml_parser_free($xml_parser);

?>

<!-- END PHP FILE -->
 [2003-06-15 22:35 UTC] sniper@php.net
Works fine with PHP 4.2.3, breaks with 4.3.1, 4.3.2, 4.3.3-dev.

 [2003-06-15 23:08 UTC] sniper@php.net
It also crashes:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1024 (runnable)]
0x40678cd9 in __strtod_internal (nptr=0x8ad63ec "SCREEN", endptr=0xbfe0225c, group=0) at strtod.c:419
(gdb) bt
#0  0x40678cd9 in __strtod_internal (nptr=0x8ad63ec "SCREEN", endptr=0xbfe0225c, group=0) at strtod.c:419
#1  0x4067dc59 in strtod (nptr=0x8ad63ec "SCREEN", endptr=0xbfe0225c) at strtod.c:1425
#2  0x82c5345 in is_numeric_string (str=0x8ad63ec "SCREEN", length=6, lval=0xbfe022c8, dval=0xbfe022bc, 
    allow_errors=0 '\000') at /usr/src/web/php/php4/Zend/zend_operators.h:94
#3  0x82c4ebe in zendi_smart_strcmp (result=0xbfe0242c, s1=0x8ad63ac, s2=0x88a7764)
    at /usr/src/web/php/php4/Zend/zend_operators.c:1670
#4  0x82c3736 in compare_function (result=0xbfe0242c, op1=0x8ad63ac, op2=0x88a7764)
    at /usr/src/web/php/php4/Zend/zend_operators.c:1137
#5  0x82c41a6 in is_equal_function (result=0xbfe0242c, op1=0x8ad63ac, op2=0x88a7764)
    at /usr/src/web/php/php4/Zend/zend_operators.c:1285
#6  0x82dc798 in execute (op_array=0x88a60c8) at /usr/src/web/php/php4/Zend/zend_execute.c:1931
#7  0x82bc741 in call_user_function_ex (function_table=0x85a7cb0, object_pp=0x0, function_name=0x88a1744, 
    retval_ptr_ptr=0xbfe02c44, param_count=3, params=0x8ad6554, no_separation=1, symbol_table=0x0)
    at /usr/src/web/php/php4/Zend/zend_execute_API.c:566
#8  0x82bbee7 in call_user_function (function_table=0x85a7cb0, object_pp=0x88a0a58, function_name=0x88a1744, 
    retval_ptr=0x8ad6514, param_count=3, params=0xbfe02cdc) at /usr/src/web/php/php4/Zend/zend_execute_API.c:408
#9  0x8261550 in xml_call_handler (parser=0x88a0a1c, handler=0x88a1744, argc=3, argv=0xbfe02cdc)
    at /usr/src/web/php/php4/ext/xml/xml.c:377
#10 0x826207c in _xml_startElementHandler (userData=0x88a0a1c, name=0x8ad6326 "SCREEN", attributes=0x88a0d08)
    at /usr/src/web/php/php4/ext/xml/xml.c:661

A diff between the 4.2.3 and 4.3.3-dev ext/xml shows no significant changes, so something else must have changed without a matching update in ext/xml; call_user_function() maybe?

 [2003-09-23 15:08 UTC] rrichards@php.net
xml_parse can't be used like this, as you already found out: http://mail.libexpat.org/pipermail/expat-discuss/2003-June/001039.html
 [2003-09-23 15:23 UTC] derek at hostopia dot com
True, but that doesn't explain why it worked with all versions of PHP prior to 4.3.X, and then stopped working ;)

We've worked around this "issue", but it has definitely added a little overhead to the script.

I still think the XML parser should be more robust and dynamic.
 [2003-09-23 16:44 UTC] rrichards@php.net
If it worked, then you were lucky. Running with the bundled expat library, the behavior I see when testing is consistent in 4.1.2, 4.2.2 and 4.3.x (an infinite loop), which is why they said not to do that and that it won't work.

As for the parser's robustness and dynamic capabilities, you will have to take that up with the expat developers, but I think they have already given you their answer on that one.
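
For readers who land on this report with the same problem: the thread does not say which work-around the reporter used, so the following is only a minimal sketch of one possible approach. It assumes the dynamic fragment is a complete, well-formed document by itself, and it parses that fragment with a separate parser instance instead of calling xml_parse() again on the parser whose handler is currently running. The helper name parseFragment() and the $rendered array are hypothetical and are not part of the original script; the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not from the report): parse one complete fragment
// with its OWN parser, wired to the same element handlers. The parser that
// is currently inside a callback is never re-entered.
function parseFragment(string $xml): bool
{
    $sub = xml_parser_create();
    xml_parser_set_option($sub, XML_OPTION_CASE_FOLDING, 1);
    xml_set_element_handler($sub, 'startElement', 'endElement');
    // The third argument marks this chunk as the final (and only) one.
    // $sub is released automatically when it goes out of scope.
    return xml_parse($sub, $xml, true) === 1;
}

function startElement($parser, $name, $attribs)
{
    global $rendered; // records which shape handler ran (illustration only)
    switch ($name) {
        case 'RANDOM':
            $list = explode(' ', $attribs['SHAPES']);
            $pick = $list[array_rand($list)];
            // Hand the generated element to a fresh parser.
            if (!parseFragment('<' . $pick . '/>')) {
                print "could not parse generated fragment\n";
            }
            break;
        case 'SQUARE':
        case 'TRIANGLE':
        case 'CIRCLE':
            $rendered[] = $name; // the real script would draw the shape here
            break;
    }
}

function endElement($parser, $name)
{
}

$rendered = [];
$main = xml_parser_create();
xml_parser_set_option($main, XML_OPTION_CASE_FOLDING, 1);
xml_set_element_handler($main, 'startElement', 'endElement');

$doc = '<SCREEN><INFO>demo</INFO>'
     . '<RANDOM shapes="SQUARE TRIANGLE CIRCLE"/></SCREEN>';
$ok = xml_parse($main, $doc, true);
```

The cost is one extra parser per generated fragment, and the fragment cannot rely on context from the outer document (open elements, entity declarations), so this only fits cases like the shapes example where each generated element is self-contained.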
