Bug #21213 invalid entities handling into set_attribute() and set_content()
Submitted: 2002-12-27 10:05 UTC Modified: 2003-04-09 06:36 UTC
From: flying at dom dot natm dot ru Assigned:
Status: No Feedback Package: DOM XML related
PHP Version: 4.3.0RC4 OS: All
Private report: No CVE-ID: None
 [2002-12-27 10:05 UTC] flying at dom dot natm dot ru
Please take a look at the following example:
$xml = domxml_open_mem('<'.'?xml version="1.0"?'.'><root/>');
$root = $xml->root();
$value = $root->set_attribute('a','a&amp;b');
$value = $root->set_content('a&amp;b');
echo $xml->dump_mem();

 It produces the following result:

<?xml version="1.0"?>
<root a="a&amp;amp;b">a&amp;b</root>

 As you can see, the &amp; entity is treated as literal text when it is set as an attribute value, while the same entity is treated as an entity reference when it is set as node content.
 I have checked the source of PHP's DOMXML extension and of libxml2, and discussed this behaviour with Daniel Veillard (the libxml2 maintainer) and with other people on public XML-related forums. Here is some information about this issue:

1. This behaviour is not a libxml2 bug; it is expected behaviour. Moreover, it is the more correct behaviour according to the specifications.
2. There should be a way to access the Attr DOM object, as specified in the DOM Level 1 specification.
3. There should be a way to control how entities in passed values are handled.

 As a way forward, I propose adding one additional argument, $options, to set_attribute(), set_content() and maybe some other functions.
 For now there would be 2 options:
XML_KEEP_ENTITIES - treat all entities as entities and create them as EntityReference DOM objects
XML_QUOTE_ENTITIES - treat all entities as literals and hence quote all special symbols in them (such as the '&' char).

For compatibility reasons, $options could default to XML_QUOTE_ENTITIES for set_attribute() and to XML_KEEP_ENTITIES for set_content().

 Internally, you should probably replace the xmlSetProp() call in domxml_elem_set_attribute() with xmlNodeSetContentLen() when $options is XML_KEEP_ENTITIES.


 [2003-01-01 14:59 UTC]

I will most certainly not add these options, since I prefer to stick to the W3C standard at

In my interpretation of it, PHP's domxml behaves correctly. The only thing missing is setAttributeNode, which I may implement if I find some time.

Once setAttributeNode is available, you can do it the way the W3C suggests.

 [2003-01-03 04:01 UTC] flying at dom dot natm dot ru
Yes, I agree with you on this point, but it also means that you should provide a way to parse a given text value and build a NodeList of Text and EntityReference nodes. libxml2 already has such a function.
 [2003-01-03 09:35 UTC]
As said before, we need set_attribute_node($attrNode) and append_child() et al. working on attribute nodes; then it should work. I don't know when I will have the time to do it. If someone else wants to take over this part, feel free ;)

 [2003-04-03 09:49 UTC]
Please try using this CVS snapshot:
For Windows:

set_attribute_node is now in CVS. Can you check whether what you intended to do is now possible?

 [2003-04-09 06:36 UTC]
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Open". Thank you.

