Doc Bug #61972 addchild treats text as a tag
Submitted: 2012-05-07 21:09 UTC
Modified: 2018-08-07 15:20 UTC
From: crashyn at op dot pl
Assigned:
Status: Analyzed
Package: SimpleXML related
PHP Version: 5.4.2
OS: Windows XP
Private report: No
CVE-ID: None

 [2012-05-07 21:09 UTC] crashyn at op dot pl
Description:
------------
---
From manual page: http://www.php.net/simplexmlelement.addchild
---
addChild treats <my_string> as a tag and removes it completely


 [2012-05-16 20:55 UTC] riptide dot tempora at opinehub dot com
Can you provide a test script and its actual vs. expected output to show exactly what you mean?
 [2012-05-20 21:54 UTC] crashyn at op dot pl
<?php
$xml_header = "<?xml version='1.0' encoding='utf-8'?><xml/>";

$xml = new SimpleXMLElement($xml_header);
$xml->addChild("first_string","this is &lt;mystring&gt;");
$xml->addChild("second_string","this is &lt; mystring&gt;");
$xml->asXML("test.xml");
echo "<pre>" . $xml->first_string . "<br />";	// 'this is '
echo $xml->second_string . "</pre>";			// 'this is < mystring>'
?>
 [2012-05-25 11:54 UTC] sjon at hortensius dot net
Shouldn't the values passed to xmlNewChild in addChild go through BAD_CAST, as
in all the other XML-related methods?
 [2012-05-25 12:04 UTC] arjen at react dot com
Something is wrong here.
The tag is not removed; it is just no longer encoded. But &entity; references are removed.
See http://3v4l.org/EJGuL
 [2018-08-07 15:20 UTC] cmb@php.net
-Status: Open
+Status: Analyzed
-Type: Feature/Change Request
+Type: Documentation Problem
 [2018-08-07 15:20 UTC] cmb@php.net
> Something is wrong here.

No, everything works as expected, albeit the behavior is not
documented.  Firstly, I doubt that the tag has ever been removed
as reported originally; almost certainly the tag was simply not
shown by the browser.

Anyhow, SimpleXMLElement::addChild() uses xmlNewChild() under the
hood, and the relevant documentation[1] states:

| If @content is non NULL, a child list containing the TEXTs and
| ENTITY_REFs node will be created. NOTE: @content is supposed to be
| a piece of XML CDATA, so it allows entity references. XML special
| chars must be escaped first by using xmlEncodeEntitiesReentrant(),
| or xmlNewTextChild() should be used.
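
To illustrate the escaping requirement quoted above, here is a minimal
sketch that escapes the text in userland before handing it to addChild().
Using htmlspecialchars() with ENT_XML1 for this is an assumption of the
example (the libxml documentation only names xmlEncodeEntitiesReentrant()
and xmlNewTextChild()), and the element name and sample string are made up:

```php
<?php
$xml = new SimpleXMLElement('<root/>');
$raw = 'this is <mystring> & more';

// addChild() hands its value to xmlNewChild(), which parses it as XML
// content, so the special characters are escaped first.
$xml->addChild('escaped', htmlspecialchars($raw, ENT_XML1 | ENT_QUOTES, 'UTF-8'));

// The string cast resolves the predefined entity refs again.
$text = (string) $xml->escaped;
echo $text, "\n";

// The serialized document carries the escaped form.
echo $xml->asXML();
```

With this approach the string cast should give back the original text,
markup characters included, while the serialized XML stays well-formed.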

When SimpleXMLElement::__toString() is called, it uses
xmlNodeListGetString() under the hood, whose documentation[2]
states:

| Build the string equivalent to the text contained in the Node
| list made of TEXTs and ENTITY_REFs

So *known* entity refs are resolved, while unresolvable entity
refs are skipped (&euro; is not predefined).

Note that creating empty child nodes and setting their value
afterwards via assignment gives different results[3], because the
assignment applies xmlEncodeEntitiesReentrant() automatically.
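
The difference between the two code paths can be sketched as follows.
This is only an illustration of the behavior described above, not the
script behind [3]; the element names via_add and via_assign are
hypothetical:

```php
<?php
$xml = new SimpleXMLElement('<root/>');

// Path 1: the value is parsed as XML content, so &lt; is an entity ref.
$xml->addChild('via_add', 'a &lt; b');

// Path 2: create an empty child, then assign; the value is escaped
// automatically, so the ampersand is kept as literal text.
$xml->addChild('via_assign');
$xml->via_assign = 'a &lt; b';

$added    = (string) $xml->via_add;     // entity ref resolved to '<'
$assigned = (string) $xml->via_assign;  // literal '&lt;' preserved

echo $added, "\n", $assigned, "\n";
```

So the same input string ends up as different text depending on whether
it is passed to addChild() or assigned to an existing child.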

Changing to documentation issue.

[1] <http://www.xmlsoft.org/html/libxml-tree.html#xmlNewChild>
[2] <http://www.xmlsoft.org/html/libxml-tree.html#xmlNodeListGetString>
[3] <https://3v4l.org/JMd3W>
 
Last updated: Tue May 21 00:01:27 2019 UTC