Bug #16952 Bug when parsing XML file using XML_Tree
Submitted: 2002-05-01 16:57 UTC Modified: 2002-05-17 15:58 UTC
From: Assigned: cox
Status: Closed Package: PEAR related
PHP Version: 4.2.0 OS: RH Linux 6.1
Private report: No CVE-ID: None
 [2002-05-01 16:57 UTC]
I was testing XML_Tree by parsing some files from the phpdoc directory (the PHP documentation) and found a really nasty bug that makes this class less than useful.

It seems that XML_Tree (or XML_Parser; I have not isolated the problem yet) simply drops content when parsing the files. I used the following code:

=== testTree.php ===

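<?php
require_once "XML/Tree.php";

// change to wherever you have checked out phpdoc
$phpdoc = "/home/jesus/devel/php/phpdoc/";
$entities = "entities/global.ent";
$file = "en/language/control-structures.xml";

$xmlarr = file($phpdoc.$file);
$entarr = file($phpdoc.$entities);

// The tail of this string was lost from the report. The closing used here
// (entity declarations from $entarr, then "]>") is an assumed reconstruction.
$doctype = '<!DOCTYPE chapter PUBLIC "-//Norman Walsh//DTD DocBk XML V3.1.4/EN"
				"" [
' . implode("", $entarr) . '
]>';

// keep the first line of the file on top, then DOCTYPE, then the rest
$xmltop = $xmlarr[0];
$xmlbody = implode("",array_slice($xmlarr,1));
$xmldoc = $xmltop.$doctype.$xmlbody;
//echo $xmldoc;

$tree = new XML_Tree();
$root = $tree->getTreeFromString($xmldoc);
//$root->dump();
?>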
=== end ===

When I run it with the $root->dump() line uncommented, I get fragments like this:

=== $root->dump() ===
<sect1 id="function.include-once">

    The include_once</function>

=== end ===

When echoing $xmldoc above, that same section looks like:

=== echo $xmldoc; ===
 <sect1 id="function.include-once">

    The <function>include_once</function> statement includes and evaluates
    the specified file during the execution of the script.
    This is a behavior similar to the <function>include</function> statement,
    with the only difference being that if the code from a file has already
    been included, it will not be included again.  As the name suggests, 
    it will be included just once.

=== end ===

Clearly something went terribly wrong while parsing or constructing the tree. Adding the code below assured me that the XML parsing library (expat) was not the one at fault:

=== extra code ===

$xml = xml_parser_create();
// the function itself takes $vals and $index by reference; no call-time "&" is needed
xml_parse_into_struct($xml, $xmldoc, $vals, $index);
=== end ===

That bit of code got the content OK. I think that a thorough revision of XML_Tree and/or XML_Parser is warranted.
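
As a smaller illustration of that cross-check, the sketch below runs xml_parse_into_struct() over a one-line mixed-content sample and concatenates every piece of character data it reports. The sample string and the variable names are made up for this example and are not taken from phpdoc; it only shows that the low-level parser reports text on both sides of a child element.

```php
<?php
// Hypothetical mixed-content sample: text, a child element, then more text.
$sample = '<para>The <function>include_once</function> statement</para>';

$parser = xml_parser_create();
// Keep element names as written instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parse_into_struct($parser, $sample, $vals, $index);

// Concatenate every reported "value" in document order.
$text = '';
foreach ($vals as $node) {
    if (isset($node['value'])) {
        $text .= $node['value'];
    }
}
echo $text, "\n";
```

If the text before and after the child element both show up here but not in the tree, the loss happens in the code that builds the tree rather than in the parser.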


 [2002-05-01 17:01 UTC]
BTW, I am using the latest version of XML_Tree from the CVS tree (pear/XML_Tree), and the XML_Parser included with PHP 4.2.0.
 [2002-05-07 12:46 UTC]
My current implementation does not support mixed XML content (#PCDATA + elements). If someone could tell me how DOM represents that, I could try to fix this issue.

This will be enough to reproduce it:
// the wrapper element was lost from the report; <doc> is a placeholder
$xmlstr = '<doc>This is a <tag bar="a">foo</tag> mixed example</doc>';


Tomas V.V.Cox
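
On the question of how DOM represents mixed content: the sketch below uses PHP's DOM extension (DOMDocument, which arrived in PHP 5 and was not available in 4.2.0) to list the children of an element that holds both text and a child element. The `<doc>` wrapper is a hypothetical root added only to make the sample well-formed.

```php
<?php
// <doc> is a placeholder root element, not part of the original sample.
$xmlstr = '<doc>This is a <tag bar="a">foo</tag> mixed example</doc>';

$dom = new DOMDocument();
$dom->loadXML($xmlstr);

// Text runs and elements are sibling nodes, kept in document order.
$children = [];
foreach ($dom->documentElement->childNodes as $child) {
    if ($child->nodeType === XML_TEXT_NODE) {
        $children[] = 'text: "' . $child->nodeValue . '"';
    } elseif ($child->nodeType === XML_ELEMENT_NODE) {
        $children[] = 'element: ' . $child->nodeName;
    }
}
print_r($children);
// text: "This is a ", element: tag, text: " mixed example"
```

In other words, DOM does not store a single "content" string per element. Each run of text is its own node alongside the element nodes, which is what lets mixed content survive a round trip.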
 [2002-05-13 22:41 UTC]
I am not sure how libdomxml does the traversal, but here is the W3C spec for traversal:
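
For illustration only (this is not text from the W3C spec, and not how libdomxml is implemented): document order corresponds to a depth-first, pre-order walk, where a node is visited first and then its children from left to right. A minimal sketch with PHP's later DOM extension, using a hypothetical `walk()` helper and a placeholder `<doc>` root:

```php
<?php
// Hypothetical helper: record nodes in depth-first, pre-order sequence.
function walk(DOMNode $node, array &$out): void
{
    if ($node->nodeType === XML_ELEMENT_NODE) {
        $out[] = '<' . $node->nodeName . '>';
    } elseif ($node->nodeType === XML_TEXT_NODE) {
        $out[] = $node->nodeValue;
    }
    if ($node->hasChildNodes()) {
        foreach ($node->childNodes as $child) {
            walk($child, $out);
        }
    }
}

$dom = new DOMDocument();
$dom->loadXML('<doc>This is a <tag bar="a">foo</tag> mixed example</doc>');

$out = [];
walk($dom->documentElement, $out);
print_r($out);
// <doc>, "This is a ", <tag>, "foo", " mixed example"
```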

 [2002-05-17 08:20 UTC]
OK, after some research time and a headache, I committed a fix for that. Please try it and tell me if the support is now correct.

-- Tomas V.V.Cox
 [2002-05-17 15:58 UTC]
The class now works as expected, and it does not choke on <![CDATA[]]>, although the CDATA wrapper is lost when dump()'ing to a string, but that is a minor thing.

Good work Tomas!
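
A note on the CDATA remark, as a hedged sketch rather than a description of XML_Tree's internals: PHP's xml_* parser functions hand CDATA content to the caller as ordinary character data, so the section markers are gone by the time a tree is built. A serializer can still emit equivalent, well-formed XML by escaping the text. The sample string and the hand-written output below are made up for this example.

```php
<?php
// Hypothetical sample with a CDATA section containing a markup character.
$xml = '<doc><![CDATA[a < b]]></doc>';

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parse_into_struct($parser, $xml, $vals);

// The CDATA wrapper is not reported, only its content.
$value = $vals[0]['value'];

// Escape the text when writing it back out so the result stays well-formed.
$out = '<doc>' . htmlspecialchars($value, ENT_NOQUOTES | ENT_XML1, 'UTF-8') . '</doc>';
echo $out, "\n";
```

The escaped form carries the same character data as the original CDATA section, which is why losing the wrapper on dump is cosmetic as long as the text is escaped.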