Request #43542 simpleXML thinks that comment is node
Submitted: 2007-12-09 11:02 UTC Modified: 2022-04-20 12:59 UTC
Votes:8
Avg. Score:4.8 ± 0.4
Reproduced:6 of 6 (100.0%)
Same Version:2 (33.3%)
Same OS:3 (50.0%)
From: 007not at gmail dot com Assigned:
Status: Open Package: SimpleXML related
PHP Version: 5.2.5 OS: *
Private report: No CVE-ID: None
 [2007-12-09 11:02 UTC] 007not at gmail dot com
Description:
------------
also see http://bugs.php.net/43392
>jani@php.net comment:
>This is just normal and expected behaviour.
><foo><!-- comment --></foo> is not same as <foo></foo>.
>(try var_dump($xml); to see what happens)

I made some new tests for you; try to var_dump() this:
var_dump(array(/*'comment' => 'value'*/));
<foo><!-- comment --></foo> === <foo></foo> && array(/*'comment' => 'value'*/) === array('comment' => 'value')
Is it still the same? ;)

Reproduce code:
---------------
$string = <<<XML
<?xml version='1.0'?>
<document>
 <node><!-- comment --></node>
 <otherNode></otherNode>
 <comment>value</comment>
</document>
XML;
$xml = simplexml_load_string($string);

//note: xdebug used

//first test
var_dump($xml->node);
var_dump($xml->otherNode);

/*
Expected result:
----------------
null

object(SimpleXMLElement)[2]
  public 'comment' => string 'value' (length=5)

Actual result:
--------------
object(SimpleXMLElement)[2]
  public 'comment' =>
    object(SimpleXMLElement)[4]

object(SimpleXMLElement)[2]
  public 'comment' => string 'value' (length=5)
*/



//second test
$i = 0;
foreach ($xml->node as $node)
{
	$i++;
}
echo $i . "\n";

$i = 0;
foreach ($xml->otherNode as $node)
{
	$i++;
}
echo $i . "\n";

/*
Expected result:
----------------
0
1

Actual result:
--------------
1
1
*/

//third test
var_dump($xml->node->comment);
var_dump($xml->otherNode->comment);

//check magic
echo "node:\n";
if (is_object($xml->node->comment))
{
	echo "is_object === TRUE \n";
}
if (isset($xml->node->comment))
{
	echo "isset === TRUE \n";
}
//but
if (strlen($xml->node->comment) > 0)
{
	echo "strlen > 0\n";
}
if (strlen($xml->node->comment) == 0)
{
	echo "strlen == 0\n";
}

echo "otherNode:\n";
if (is_object($xml->otherNode->comment))
{
	echo "is_object === TRUE \n";
}
if (isset($xml->otherNode->comment))
{
	echo "isset === TRUE \n";
}

/*
Expected result:
----------------
node:
is_object === TRUE
isset === TRUE
strlen == 0
otherNode:
is_object === TRUE

Actual result:
--------------
node:
is_object === TRUE
strlen == 0
otherNode:
is_object === TRUE
*/

Expected result:
----------------
see code

Actual result:
--------------
see code

 [2007-12-09 14:39 UTC] hubert dot roksor at gmail dot com
Regarding your first test, I wouldn't consider that a bug. var_dump() is a debugging tool, so it may expose some of the behind-the-scenes magic. Actually, that "comment" property might have been intentionally created as a way to indicate whether the node has a comment. That would explain isset()'s behaviour in your third test, but in that case I would recommend replacing the magical property with a method such as $node->hasComment(). I guess Rob Richards will be able to shed some light here.

As for your second test, I'm afraid it is incorrect: both $xml->node and $xml->otherNode should return 1 element and I don't see why having a comment as a child would change that.
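
The $node->hasComment() idea mentioned above does not exist in SimpleXML. A rough userland approximation, assuming PHP 7+ and the DOM extension, could inspect the underlying node through dom_import_simplexml(). The function name has_comment() is hypothetical and only illustrates the idea:

```php
<?php
// Hypothetical helper, not part of SimpleXML: reports whether an element
// has at least one comment node as a direct child.
function has_comment(SimpleXMLElement $element): bool
{
    // dom_import_simplexml() exposes the same underlying node as a DOMElement.
    $domNode = dom_import_simplexml($element);
    foreach ($domNode->childNodes as $child) {
        if ($child->nodeType === XML_COMMENT_NODE) {
            return true;
        }
    }
    return false;
}

$xml = simplexml_load_string(
    '<document><node><!-- comment --></node><otherNode></otherNode></document>'
);

var_dump(has_comment($xml->node));      // bool(true)
var_dump(has_comment($xml->otherNode)); // bool(false)
```

This sidesteps the magic "comment" property entirely, so an actual <comment> element cannot be confused with a comment node.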
 [2007-12-10 20:29 UTC] 007not at gmail dot com
Hubert, you are right about the first test; I made a mistake when posting this bug.
The correct first test looks like this:
//first test
echo "node:";
var_dump($xml->node);
echo "notExitsNode:";
var_dump($xml->notExitsNode);
echo "otherNode:";
var_dump($xml->otherNode);

Expected result:
----------------
node:
object(SimpleXMLElement)[2]
  public 'comment' => 
    object(SimpleXMLElement)[4]

notExitsNode:
null

otherNode:
object(SimpleXMLElement)[2]

Actual result:
--------------
node:
object(SimpleXMLElement)[2]
  public 'comment' => 
    object(SimpleXMLElement)[4]

notExitsNode:
object(SimpleXMLElement)[2]

otherNode:
object(SimpleXMLElement)[2]

--------------------------------------------------------

>Actually, that "comment" property might have been intentionally created
as a way to indicate whether the node has a comment.
But that is illogical, and we will end up with a "comment hell" like:
array(/*'comment' => 'value'*/) === array('comment' => 'value')

--------------------------------------------------------
>As for your second test, I'm afraid it is incorrect
Maybe (I was trying to count nodes, and the result should indeed be 1 1), but when we cast to arrays we have problems with fake comments:
$i = 0;
foreach ((array) $xml->node as $node)
{
	$i++;
}
echo $i . "\n";

$i = 0;
foreach ((array) $xml->otherNode as $node)
{
	$i++;
}
echo $i . "\n";


Expected result:
----------------
0
0
Actual result:
--------------
1
0

#####################################################################
updated test:
<?php
$string = <<<XML
<?xml version='1.0'?>
<document>
 <node><!-- comment --></node>
 <otherNode></otherNode>
 <comment>value</comment>
</document>
XML;
$xml = simplexml_load_string($string);

//first test
echo "node:";
var_dump($xml->node);
echo "notExitsNode:";
var_dump($xml->notExitsNode);
echo "otherNode:";
var_dump($xml->otherNode);

/*
Expected result:
----------------
node:
object(SimpleXMLElement)[2]
  public 'comment' => 
    object(SimpleXMLElement)[4]

notExitsNode:
null

otherNode:
object(SimpleXMLElement)[2]

Actual result:
--------------
node:
object(SimpleXMLElement)[2]
  public 'comment' => 
    object(SimpleXMLElement)[4]

notExitsNode:
object(SimpleXMLElement)[2]

otherNode:
object(SimpleXMLElement)[2]
*/



//second test
$i = 0;
foreach ($xml->node as $node)
{
	$i++;
}
echo $i . "\n";

$i = 0;
foreach ($xml->otherNode as $node)
{
	$i++;
}
echo $i . "\n";

$i = 0;
foreach ((array) $xml->node as $node)
{
	$i++;
}
echo $i . "\n";

$i = 0;
foreach ((array) $xml->otherNode as $node)
{
	$i++;
}
echo $i . "\n";

/*
Expected result:
----------------
1
1
0
0

Actual result:
--------------
1
1
1
0
*/

//third test
var_dump($xml->node->comment);
var_dump($xml->otherNode->comment);

//check magic
echo "node:\n";
if (is_object($xml->node->comment))
{
	echo "is_object === TRUE \n";
}
if (isset($xml->node->comment))
{
	echo "isset === TRUE \n";
}
//but
if (strlen($xml->node->comment) > 0)
{
	echo "strlen > 0\n";
}
if (strlen($xml->node->comment) == 0)
{
	echo "strlen == 0\n";
}

echo "otherNode:\n";
if (is_object($xml->otherNode->comment))
{
	echo "is_object === TRUE \n";
}
if (isset($xml->otherNode->comment))
{
	echo "isset === TRUE \n";
}

/*
Expected result:
----------------
node:
is_object === TRUE
isset === TRUE
strlen == 0
otherNode:
is_object === TRUE

Actual result:
--------------
node:
is_object === TRUE
strlen == 0
otherNode:
is_object === TRUE
*/
 [2007-12-20 14:39 UTC] helly@php.net
The comment is a node. What we actually need is a way to figure out the xml type of a SimpleXMLElement instance (Element, Comment,...). This will also have to return the internal SXE state (iterator for something or direct value).
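
SimpleXML itself offers no such type query. As a sketch of what is possible today, assuming the DOM extension is available, the node types of an element's children can be read through dom_import_simplexml(). The helper names node_type_name() and child_types() are made up for this example, and it does not cover the internal SXE iterator state mentioned above:

```php
<?php
// Map a few libxml node-type constants to readable names.
function node_type_name(DOMNode $node): string
{
    switch ($node->nodeType) {
        case XML_ELEMENT_NODE:
            return 'Element';
        case XML_TEXT_NODE:
            return 'Text';
        case XML_COMMENT_NODE:
            return 'Comment';
        case XML_CDATA_SECTION_NODE:
            return 'CDATA';
        default:
            return 'Other (' . $node->nodeType . ')';
    }
}

// Hypothetical helper: list the XML types of an element's direct children.
function child_types(SimpleXMLElement $element): array
{
    $types = [];
    foreach (dom_import_simplexml($element)->childNodes as $child) {
        $types[] = node_type_name($child);
    }
    return $types;
}

$xml = simplexml_load_string(
    '<document><node><!-- comment --><a/>text</node></document>'
);

print_r(child_types($xml->node)); // Comment, Element, Text
```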
 [2007-12-23 20:07 UTC] 007NOT at gmail dot com
>The comment is a node. 
A comment isn't a node, it's nothing.
Example:
<node></node> - node with an empty value
<node><!-- comment --></node> - node with an empty value
<node><!-- comment --> </node> - not an empty node: it has the value "space"

In:
>$i = 0;
>foreach ((array) $xml->node as $node)
>{
>	$i++;
>}
>echo $i . "\n";
here we check the child nodes, but we do not check that the node exists.

see one more test:
<?php
$string = <<<XML
<?xml version='1.0'?>
<document>
 <node><!-- comment --></node>
 <otherNode></otherNode>
 <comment>value</comment>
</document>
XML;
$xml = simplexml_load_string($string);

$i = 0;
foreach ((array) $xml->node as $node)
{
	$i++;
}
echo $i . "\n";

$dom= new DOMDocument();
$dom->loadXML($string);
$Xpath = new DOMXPath($dom);
$NodeList = $Xpath->query('/document');
foreach ($NodeList as $Node)
{
	foreach ($Node->childNodes as $cNode)
	{
		echo 'Name: ' . $cNode->nodeName . ' Value: ' . $cNode->nodeValue . PHP_EOL;
		if ($cNode->childNodes->length > 1) echo 'Has childNodes' . PHP_EOL;
	}
}

>What we actually need is a way to figure out the xml type of a SimpleXMLElement instance (Element, Comment,...). 
Maybe you could add a new method.
 [2008-01-11 17:51 UTC] 007NOT at gmail dot com
helly wrote:
>The comment is a node.
SimpleXMLElement->children() does not think so! So which one is right?

Example:
<?php
$string = <<<XML
<?xml version='1.0'?>
<document>
 <node>
  <!-- comment -->
  <a />
 </node>
</document>
XML;
$xml = simplexml_load_string($string);

var_dump($xml->node);
$i = 0;
foreach ($xml->node as $node)
{
	$i++;
}
echo $i . "\n";

var_dump((array) $xml->node);
$i = 0;
foreach ((array) $xml->node as $node)
{
	$i++;
}
echo $i . "\n";

var_dump($xml->node->children());
$i = 0;
foreach ((array) $xml->node->children() as $node)
{
	$i++;
}
echo $i . "\n";
 [2009-06-13 06:57 UTC] jbeauwalker at gmail dot com
It seems that the individual who reported this does not like this behaviour but I actually want access to comments. [I'm managing Google Earth kml files and...well the detailed explanation is lengthy]. 

Unfortunately the current behaviour doesn't seem to satisfy either of us because while comments appear as nodes I can't seem to get access to their content. At the risk of disturbing the peace of the purists, might I suggest something along the lines that if a comment is placed and formatted like a node on a separate line -- eg:

    <node>
        <!-- comment -->
        <a />
    </node>
That it appear as a node (->node->comment) with a value with accessible content, but if placed on another line it is simply ignored. 

        <node>   <!-- this is ignored -->

I can hear the complaints already but what I'm really trying to suggest is some solution with a bit of imagination where a comment can be identified as such and its value obtained.

   <node <!--
 [2011-04-08 21:08 UTC] jani@php.net
-Package: Feature/Change Request +Package: SimpleXML related
 [2011-09-28 13:40 UTC] aapocketz at gmail dot com
In my $xml SimpleXML object I can see the comment blocks, but how do I
access and read them? Can I modify or insert comments into the XML somehow?

This is a section of the print_r output:

[comment] => Array
        (
            [0] => SimpleXMLElement Object
                (
                )

            [1] => SimpleXMLElement Object
                (
                )

            [2] => SimpleXMLElement Object
                (
                )

            [3] => SimpleXMLElement Object
                (
                )

            [4] => SimpleXMLElement Object
                (
                )

            [5] => SimpleXMLElement Object
                (
                )

            [6] => SimpleXMLElement Object
                (
                )

            [7] => SimpleXMLElement Object
                (
                )
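
SimpleXML does not expose the comment text through these empty objects. One possible workaround, sketched here under the assumption that the DOM extension is loaded, is to drop down to DOM with dom_import_simplexml(): comments can then be read with the XPath comment() node test and added with DOMDocument::createComment(). The sample XML is invented for the illustration:

```php
<?php
$xml = simplexml_load_string(
    '<document><node><!-- first --><a/><!-- second --></node></document>'
);

// Both views share the same underlying document.
$element = dom_import_simplexml($xml->node);
$xpath = new DOMXPath($element->ownerDocument);

// Read: comment() selects comment nodes; $element is the context node.
$texts = [];
foreach ($xpath->query('comment()', $element) as $comment) {
    $texts[] = trim($comment->nodeValue);
}
print_r($texts); // first, second

// Insert: create the comment via the owner document and append it.
$element->appendChild($element->ownerDocument->createComment(' added later '));

// The change is visible from the SimpleXML side as well.
echo $xml->asXML();
```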
 [2014-03-11 18:06 UTC] nick dot strupat at gmail dot com
The best solution I can think of is to put comments into the pseudo-array returned by SimpleXMLElement::children() with the same convention as attributes (with the key "@attributes"). Comments should be represented by the key "@comments" in the parent's children.
 [2022-04-20 12:59 UTC] cmb@php.net
> simpleXML thinks that comment is node

Well, it depends, and this looks wrong to me.  Consider the given
XML in the OP.  Then

    var_dump($xml->node->count());           // int(1)
    var_dump(count($xml->node->children())); // int(0)

So if we count the children of an element, we get 1, but if we
count the children of an element, we get 0.  At the very least,
the documentation would need to be clarified; although to me the
current behavior is clearly a bug: either comment nodes are
(accessible) children, or not.
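
For anyone who wants to check this on their own PHP build, the following sketch prints both SimpleXML counts next to the number of child nodes DOM reports for the same element. The first two values are the ones reported above and may vary between versions, so no output is claimed for them here:

```php
<?php
$string = <<<XML
<?xml version='1.0'?>
<document>
 <node><!-- comment --></node>
 <otherNode></otherNode>
 <comment>value</comment>
</document>
XML;
$xml = simplexml_load_string($string);

$simpleCount   = $xml->node->count();
$childrenCount = count($xml->node->children());
// DOM counts every child node, including the comment.
$domChildren   = dom_import_simplexml($xml->node)->childNodes->length;

printf("count():           %d\n", $simpleCount);
printf("count(children()): %d\n", $childrenCount);
printf("DOM childNodes:    %d\n", $domChildren); // 1 (the comment node)
```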

> At the risk of disturbing the peace of the purists, might I
> suggest something along the lines that if a comment is placed and
> formatted like a node on a separate line

I'm strongly against making the behavior depend on the XML
formatting.

> The best solution I can think of is to put comments into the
> pseudo-array returned by SimpleXMLElement::children() with the
> same convention as attributes (with the key "@attributes").

That wouldn't work in all cases.  Consider

  <root>
   <!-- 1st comment -->
   <foo/>
   <!-- 2nd comment -->
  </root>
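
One way to read this objection (an interpretation, not something stated above) is that a single "@comments" bucket would lose where each comment sits relative to its sibling elements. Walking the same structure through DOM shows the interleaved order that would need to be preserved:

```php
<?php
$root = simplexml_load_string(
    '<root><!-- 1st comment --><foo/><!-- 2nd comment --></root>'
);

// Record comments and elements in document order.
$sequence = [];
foreach (dom_import_simplexml($root)->childNodes as $child) {
    if ($child->nodeType === XML_COMMENT_NODE) {
        $sequence[] = 'comment:' . trim($child->nodeValue);
    } elseif ($child->nodeType === XML_ELEMENT_NODE) {
        $sequence[] = 'element:' . $child->nodeName;
    }
}

print_r($sequence);
// comment:1st comment, element:foo, comment:2nd comment
```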

It might make sense to introduce a new option that defines
whether comment nodes should be ignored or not.
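
No such option exists at the time of writing. Until one does, the "ignore comments" behavior can be approximated in userland by stripping comment nodes with DOM before handing the document to SimpleXML. This is only a sketch; the function name simplexml_load_string_without_comments() is hypothetical:

```php
<?php
// Hypothetical helper emulating an "ignore comments" option.
// Returns a SimpleXMLElement, or false if the XML cannot be parsed.
function simplexml_load_string_without_comments(string $source)
{
    $dom = new DOMDocument();
    if (!$dom->loadXML($source)) {
        return false;
    }

    $xpath = new DOMXPath($dom);
    // Copy the matches into an array first so removal cannot disturb iteration.
    foreach (iterator_to_array($xpath->query('//comment()')) as $comment) {
        $comment->parentNode->removeChild($comment);
    }

    return simplexml_import_dom($dom);
}

$xml = simplexml_load_string_without_comments(
    '<document><node><!-- comment --></node><otherNode></otherNode></document>'
);

// The serialized document no longer contains any comment.
echo $xml->asXML();
```

With the comments gone, <node> and <otherNode> are both plain empty elements, so the (array) casts from the earlier tests should no longer differ between them.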
 