Bug #40105 DOMDocument->createElement() unescapes numeric character references in values
Submitted: 2007-01-11 23:26 UTC
Modified: 2007-01-19 06:17 UTC
From: fletch at pobox dot com
Assigned:
Status: Not a bug
Package: DOM XML related
PHP Version: 5.2.0
OS: linux
Private report: No
CVE-ID: None
 [2007-01-11 23:26 UTC] fletch at pobox dot com
DOMDocument->createElement() unescapes any numeric character references (NCRs) contained in the value passed to its optional second parameter.  It should preserve the value as passed.

Reproduce code:
$dom = new DOMDocument( "1.0", 'UTF-8' );
$dom->appendChild( $dom->createElement( 'root', '&amp;&#0160;' ) );
var_dump( trim( $dom->saveXML() ) );

Expected result:
string(59) "<?xml version="1.0" encoding="UTF-8"?>
<root>&amp;&#0160;</root>"
Actual result:
string(59) "<?xml version="1.0" encoding="UTF-8"?>
<root>&amp; </root>"


 [2007-01-11 23:28 UTC] fletch at pobox dot com
There's a trivial error in the expected result text: the string length produced by var_dump() should be 65, not 59.
 [2007-01-12 07:44 UTC]
It was decided to leave it as it is (for backwards
compatibility reasons).

Use $dom->createTextNode() for the behaviour you want.
 [2007-01-18 06:01 UTC] fletch at pobox dot com
I'm aware of the differences in escaping between createElement() and createTextNode() and neither one does what I describe.  createElement() unescapes NCRs (as I described in the original description) while createTextNode() escapes values.

In a nutshell, using createElement() I end up with "<root> </root>", and using createTextNode() I end up with "<root>&amp;#0160;</root>".

There is no way to generate the literal XML "<root>&#0160;</root>" using either createElement() OR createTextNode()
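The difference between the two methods can be checked with a minimal sketch like the one below. The element names "parsed" and "literal" and the variable names are made up for illustration; only createElement(), createTextNode(), appendChild(), textContent and saveXML() come from the DOM extension.

```php
<?php
// Sketch: compare how createElement() and createTextNode() treat references.
$dom  = new DOMDocument('1.0', 'UTF-8');
$root = $dom->appendChild($dom->createElement('root'));

// createElement() parses the value, so both references are decoded.
$parsed = $root->appendChild($dom->createElement('parsed', '&amp;&#0160;'));

// createTextNode() takes the string literally; the "&" is escaped on output.
$literal = $root->appendChild($dom->createElement('literal'));
$literal->appendChild($dom->createTextNode('&#0160;'));

$parsedText  = $parsed->textContent;    // "&" followed by U+00A0
$literalText = $literal->textContent;   // the seven characters "&#0160;"
$literalXml  = $dom->saveXML($literal); // serialises only the <literal> element

var_dump($parsedText === "&\xC2\xA0");
var_dump($literalText === '&#0160;');
echo $literalXml, "\n";
```

Both var_dump() calls should report true, and the last line should show the escaped form <literal>&amp;#0160;</literal>, which matches the two results described above.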
 [2007-01-18 11:05 UTC]
<root>&#0160;</root> and <root> </root> (the space here being
a non-breaking space) are the same in the XML context. It
doesn't matter whether you write a character as a numeric
character reference or as the actual character. You have to live with that.

And please don't just assign bugs...
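The equivalence claimed above can be checked by parsing both spellings and comparing the resulting text. This is only a sketch; the variable names are illustrative and the non-breaking space is written as its UTF-8 byte sequence "\xC2\xA0".

```php
<?php
// Sketch: an NCR and the literal character parse to the same text node.
$withNcr = new DOMDocument();
$withNcr->loadXML('<root>&#0160;</root>');

$withChar = new DOMDocument();
$withChar->loadXML("<root>\xC2\xA0</root>"); // U+00A0 as raw UTF-8 bytes

$ncrText  = $withNcr->documentElement->textContent;
$charText = $withChar->documentElement->textContent;

// Both documents hold the single character U+00A0.
$sameText = ($ncrText === $charText);
var_dump($sameText);
```

Once the parser has run, the DOM tree no longer records which spelling the source used, so the serialiser is free to pick either one.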
 [2007-01-18 17:04 UTC] fletch at pobox dot com
Thanks for the quick response. I did change the bug status back to "open" but I didn't assign it, did I?

I believe you're right that in this particular example both XML strings are the same, but only because the NCR I used represents a printable character. A non-printable character must be represented as its NCR. There's no way to craft a node whose value contains an NCR and, as such, no way to create an element whose value contains non-printable characters.
 [2007-01-19 06:17 UTC]
$dom = new DOMDocument( "1.0", 'UTF-8' );
$dom->appendChild( $dom->createElement( 'root','&#13;' ));
var_dump( trim( $dom->saveXML() ) );

prints out

string(57) "<?xml version="1.0" encoding="UTF-8"?>
<root>&#13;</root>"
Anyway, this is really not a PHP issue (if it is one at
all), but a libxml2 issue. If you still think this is a bug,
report it there.
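As a rough sketch of the same point from the other direction: a raw carriage return passed to createTextNode() should serialise as the reference &#13; and parse back to the original character. The variable names are illustrative. Most other control characters (U+0001, for example) are not allowed in XML 1.0 at all, even as references, so this only covers characters the specification permits.

```php
<?php
// Sketch: a raw carriage return in a text node is written out as an NCR.
$dom  = new DOMDocument('1.0', 'UTF-8');
$root = $dom->appendChild($dom->createElement('root'));
$root->appendChild($dom->createTextNode("\r"));

$xml = trim($dom->saveXML());
echo $xml, "\n";

// Parse the serialised form again; the reference decodes back to "\r".
$reparsed = new DOMDocument();
$reparsed->loadXML($xml);
$roundTrip = $reparsed->documentElement->textContent;

var_dump($roundTrip === "\r");
```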
