[2006-03-20 02:05 UTC] john at carney dot id dot au
Description:
------------
While it is not specifically mentioned in the documentation, DOMElement->setAttribute automatically escapes XML special characters in the value parameter. Yet, as of PHP 5.1.2 it will throw an "unterminated entity reference" warning if the supplied value contains an ampersand - even if it is escaped.
As well as fixing the actual bug, the documentation needs to clarify *exactly* how special characters in the inputs to this and other DOM functions are treated. If you are going to silently escape input text, you need to tell people so that they don't end up with stuff being double-escaped.
Reproduce code:
---------------
$element->setAttribute ("anattr", "jack & jill") ;
$element->setAttribute ("anattr", "jack &amp; jill") ;
Expected result:
----------------
No warnings should be thrown.
Actual result:
--------------
BOTH calls to setAttribute throw an "unterminated entity reference" warning.
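
For comparison, here is a minimal sketch of the behaviour the report expects, assuming a PHP build where the warning no longer occurs. The names $doc and $element and the element name "root" are illustrative and not taken from the report. The raw value is passed unescaped, read back unchanged, and escaped only when the node is serialized.

```php
<?php
// Sketch only: $doc, $element and the "root" element name are illustrative.
$doc = new DOMDocument('1.0', 'UTF-8');
$element = $doc->createElement('root');
$doc->appendChild($element);

// Pass the raw, unescaped value; escaping happens at serialization time.
$element->setAttribute('anattr', 'jack & jill');

// Reading the attribute back returns the raw value.
$roundTrip = $element->getAttribute('anattr');

// Serializing the element shows the escaped form written to the document.
$serialized = $doc->saveXML($element);

echo $roundTrip, "\n", $serialized, "\n";
```

Passing an already escaped value such as "jack &amp;amp; jill" to the same call would be escaped again on output, which is the double-escaping risk the description warns about.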
Copyright © 2001-2025 The PHP Group. All rights reserved.
Last updated: Fri Oct 24 11:00:02 2025 UTC
Hi, I have the same problem. My config is:
- PHP 5.2
- libxml version 2.6.16
---------
<?php
$xmlStr = "<?xml version='1.0' encoding='UTF-8'?><root></root>";
$xml = new SimpleXMLElement($xmlStr);
$xml->addChild("foo", utf8_encode("start < > end"));
echo "foo tag added ok";
$xml->addChild("bar", utf8_encode("start & end"));
echo "error on bar tag because of &";
$result = $xml->asXML();
echo "<pre>".htmlentities($result)."</pre>";
?>
-------------
You can run this script at: http://www.kitpages.fr/test/bugSimpleXml.php

I tried the workaround below and it seems to work:
$xml->addChild('element', '');
$xml->element = str_replace("&", "&amp;", "value of the element");

PHP 5.2.4
It looks like the problem appears when a node that already exists is being overwritten:
// works ok, doesn't require encoding:
$a = simplexml_load_string('<a/>');
$a->b = "& < ' ";
// doesn't work, requires encoding:
$a = simplexml_load_string('<a><b>test</b></a>');
$a->b = "& < ' ";
// doesn't work, always requires encoding
$a->addChild('b', "& < '");
$a->addAttribute('b', "& < '");
// works ok, never requires encoding
$a['b'] = "& < '";

A little hack to get around this bug:
function &safe_add_child(&$sxml, $name, $value) {
    $safe_value = preg_replace('/&(?!\w+;)/', '&amp;', $value);
    return $sxml->addChild($name, $safe_value);
}

I'm running PHP 5.2.9 on Linux and this bug is still alive and well, making SimpleXML absolutely inappropriate for XML communications between systems.
<code>
$safe_value = preg_replace('/&(?!\w+;)/', '&amp;', $value);
return $sxml->addChild($name, $safe_value);
</code>
is just plain wrong. I'm communicating user input directly to a bank, and I can't know how the third party will parse their XML.

Still seeing this issue...
$order_x->addChild('location', '1st & 52nd');
gives "Warning: SimpleXMLElement::addChild(): unterminated entity reference".
If I run it as
$order_x->addChild('location', htmlspecialchars('1st & 52nd'));
I have no problems.
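
The htmlspecialchars() workaround from the last comment can be wrapped in a small helper. This is a sketch, not an official fix: the function name add_child_escaped is hypothetical, and the type declarations assume PHP 7 or later. Unlike the regex hack quoted earlier, it escapes every ampersand, so user input that already looks like an entity is stored literally instead of being passed through as markup.

```php
<?php
// Hypothetical helper (the name is illustrative): escape text before addChild().
function add_child_escaped(SimpleXMLElement $parent, string $name, string $value): SimpleXMLElement
{
    // ENT_XML1 | ENT_NOQUOTES escapes only &, < and >; quotes are left alone.
    $escaped = htmlspecialchars($value, ENT_XML1 | ENT_NOQUOTES, 'UTF-8');
    return $parent->addChild($name, $escaped);
}

$xml = new SimpleXMLElement("<?xml version='1.0' encoding='UTF-8'?><root></root>");
$child = add_child_escaped($xml, 'location', '1st & 52nd');

// The text content reads back as the original, unescaped string.
$text = (string) $child;

// The serialized document carries the escaped form.
$out = $xml->asXML();

echo $text, "\n", $out, "\n";
```

The helper only covers addChild(). The comments above suggest addAttribute() needs the same treatment, while array-style attribute assignment ($a['b'] = ...) does not.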