Bug #26188 Section CDATA is not recognized by the parser
Submitted: 2003-11-10 02:57 UTC Modified: 2003-11-10 07:27 UTC
From: sergey at bds dot ru Assigned:
Status: Not a bug Package: XML related
PHP Version: 4.3.2 OS: Windows
Private report: No CVE-ID: None
 [2003-11-10 02:57 UTC] sergey at bds dot ru
Description:
------------
When xml_parse_into_struct parses an XML document, it seems to me that it skips the CDATA section.
The parser treats the CDATA content as an ordinary value of the enclosing node.
It should instead be parsed into a new child node, a CDATA section.

Reproduce code:
---------------
test.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<PlaceHolderList>
<![CDATA[
function fnShow()
{

}
]]>
</PlaceHolderList>

test.php:
<?php
// Read the whole file into one string.
$strXml      = implode("", file("test.xml"));
$objParser   = xml_parser_create();
xml_parser_set_option($objParser, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($objParser, XML_OPTION_SKIP_WHITE,   1);
xml_parse_into_struct($objParser, $strXml, $arrParserValues, $arrParserIndexes);
print_r($arrParserValues);

Expected result:
----------------
Array
(
    [0] => Array
        (
            [tag] => PlaceHolderList
            [type] => complete
            [level] => 1
            [value] => 
        )

    [1] => Array
        (
            [tag] => PlaceHolderList
            [type] => cdata
            [level] => 2
            [value] => function fnShow(){    var aAll = document.all;    var e = new Enumerator(aAll);    while (!e.atEnd())    {>        alert (e.item());        e.moveNext();    }}
        )

)

Actual result:
--------------
Array
(
    [0] => Array
        (
            [tag] => PlaceHolderList
            [type] => complete
            [level] => 1
            [value] => function fnShow(){    var aAll = document.all;    var e = new Enumerator(aAll);    while (!e.atEnd())    {>        alert (e.item());        e.moveNext();    }}
        )

)

 [2003-11-10 03:04 UTC] sergey at bds dot ru
In the "Expected result" section, the second array (the supposed CDATA section) should have "#cdata-section" as the value of its "tag" attribute.
 [2003-11-10 06:36 UTC] moriyoshi@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

The CDATA section and the ordinary text node are no different semantically... How come you think the CDATA section should be represented as a separate node...
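
The equivalence described above can be checked with a small sketch. It assumes two made-up sample documents and a hypothetical helper named parseToStruct (neither is from the original report), and compares what xml_parse_into_struct returns for a CDATA section and for the same text written with an escaped entity:

```php
<?php
// Hypothetical helper (not part of the report): parse an XML string
// and return the values array built by xml_parse_into_struct.
function parseToStruct(string $xml): array
{
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_parse_into_struct($parser, $xml, $values);
    return $values;
}

// Same character data, written once as CDATA and once with an entity.
$withCdata  = parseToStruct('<Root><![CDATA[a < b]]></Root>');
$withEscape = parseToStruct('<Root>a &lt; b</Root>');

// Both forms should collapse into one "complete" entry with the same value.
var_dump($withCdata === $withEscape);
echo $withCdata[0]['value'], "\n";
```

If the two arrays compare equal, the struct output carries no trace of which notation the source document used.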

 [2003-11-10 07:03 UTC] sergey at bds dot ru
OK, I see what you mean.

1) I don't really know the standard regarding CDATA sections, but MSXML
appears to handle such sections as nodes,
so the structure
<Root>
<![CDATA[
function fnShow()
{

}
]]>
</Root>

will produce
Root [XMLElement] (Node that has a child)
  CDATA [CDATA Section] (Node of the type CDATA)

2) How can you think the operations of parsing and serializing XML
should not be symmetric?

If we parse the XML listed above with xml_parse_into_struct, we
will not know afterwards that it really was a CDATA section,
so we cannot restore the XML!

This is not right, in my opinion.

3) I am writing a wrapper for xml_parse..., a module like the one you have
under the name DOMXML functions.
So I was really surprised by this situation after I spent 5 days
writing an MSXML-like interface... And it is a problem...
I don't know what to do next...
 [2003-11-10 07:27 UTC] moriyoshi@php.net
Such operations are just irreversible because the behaviour is quite dependent on the backend XML parsing implementation. Not a PHP developer issue.
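
For code that does need to tell CDATA apart from ordinary text, one possible route is a tree-based parser rather than xml_parse_into_struct. The following is only a sketch, assuming the DOM extension (DOMDocument, available in PHP 5 and later) is an acceptable substitute and using a made-up sample document. By default it keeps CDATA sections as their own node type, named "#cdata-section", much like the MSXML behaviour described earlier in this thread:

```php
<?php
// Sample document (an assumption for illustration, not from the report).
$xml = '<Root><![CDATA[function fnShow() { return 1 < 2; }]]></Root>';

$doc = new DOMDocument();
// No LIBXML_NOCDATA flag is passed, so CDATA sections are not merged into text nodes.
$doc->loadXML($xml);

$child = $doc->documentElement->firstChild;

$isCdata   = $child instanceof DOMCdataSection;      // separate node class for CDATA
$nodeName  = $child->nodeName;                       // DOM name of the node
$roundTrip = $doc->saveXML($doc->documentElement);   // serialize the element back to XML

var_dump($isCdata, $nodeName);
echo $roundTrip, "\n";
```

Because the node type survives parsing, serializing the element again can emit the CDATA markers, which is the round trip the struct-based output cannot provide.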

 
PHP Copyright © 2001-2020 The PHP Group
All rights reserved.
Last updated: Tue Nov 24 14:01:23 2020 UTC