Bug #40105 DOMDocument->createElement() unescapes numeric character references in values
Submitted: 2007-01-11 23:26 UTC Modified: 2007-01-19 06:17 UTC
From: fletch at pobox dot com Assigned:
Status: Not a bug Package: DOM XML related
PHP Version: 5.2.0 OS: linux
Private report: No CVE-ID: None
 [2007-01-11 23:26 UTC] fletch at pobox dot com
Description:
------------
DOMDocument->createElement() unescapes any numeric character references (NCRs) contained in the value passed to its optional second parameter.  It should preserve the value as passed.

Reproduce code:
---------------
<?php
$dom = new DOMDocument( "1.0", 'UTF-8' );
$dom->appendChild( $dom->createElement( 'root', '&amp;&#0160;' ) );
var_dump( trim( $dom->saveXML() ) );
?>

Expected result:
----------------
string(59) "<?xml version="1.0" encoding="UTF-8"?>
<root>&amp;&#0160;</root>"

Actual result:
--------------
string(59) "<?xml version="1.0" encoding="UTF-8"?>
<root>&amp; </root>"

 [2007-01-11 23:28 UTC] fletch at pobox dot com
There's a trivial error in the expected result text: the string length produced by var_dump() should be 65, not 59.
 [2007-01-12 07:44 UTC] chregu@php.net
It was decided to leave it as it is (for backwards-compatibility reasons).

Use $dom->createTextNode() for the behaviour you want.
 [2007-01-18 06:01 UTC] fletch at pobox dot com
I'm aware of the differences in escaping between createElement() and createTextNode() and neither one does what I describe.  createElement() unescapes NCRs (as I described in the original description) while createTextNode() escapes values.

In a nutshell, using createElement() I end up with "<root> </root>", and using createTextNode() I end up with "<root>&amp;#0160;</root>".

There is no way to generate the literal XML "<root>&#0160;</root>" using either createElement() OR createTextNode()
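
The difference described above can be reproduced side by side. This is a minimal sketch using only the calls already named in this report; the variable names are illustrative, and the comments restate the behaviour reported in this thread rather than a guarantee for every PHP/libxml2 version.

```php
<?php
// createElement() with a value: references in the value are interpreted,
// so the NCR becomes the actual U+00A0 character in the tree.
$a = new DOMDocument('1.0', 'UTF-8');
$a->appendChild($a->createElement('root', '&#0160;'));
$viaElement = trim($a->saveXML());

// createTextNode(): the string is taken as literal text,
// so the "&" is escaped to "&amp;" when the document is serialized.
$b = new DOMDocument('1.0', 'UTF-8');
$root = $b->appendChild($b->createElement('root'));
$root->appendChild($b->createTextNode('&#0160;'));
$viaTextNode = trim($b->saveXML());

var_dump($viaElement, $viaTextNode);
```

Neither call leaves the literal sequence `&#0160;` in the serialized output, which is the gap this comment points at.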
 [2007-01-18 11:05 UTC] chregu@php.net
<root>&#0160;</root> and <root> </root> (the space here being 
a non-breaking space) are the same in the XML context. It 
doesn't matter whether you write a character as a numeric 
character reference or as the actual character. It's the same. 
You have to live with that.
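
A quick way to check this equivalence is to parse both spellings and compare the resulting text values. The sketch below assumes UTF-8 input (the default when there is no XML declaration) and uses illustrative variable names.

```php
<?php
// One document spells U+00A0 as a numeric character reference...
$withNcr = new DOMDocument();
$withNcr->loadXML('<root>&#0160;</root>');

// ...the other contains the raw UTF-8 bytes for the same character.
$withChar = new DOMDocument();
$withChar->loadXML("<root>\xC2\xA0</root>");

// After parsing, both text values hold the same two-byte string.
$same = $withNcr->documentElement->textContent
    === $withChar->documentElement->textContent;
var_dump($same); // bool(true)
```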

And please don't just assign bugs...
 [2007-01-18 17:04 UTC] fletch at pobox dot com
Thanks for the quick response.  I did change the bug status back to "open" but I didn't assign it, tony2001@php.net did.

I believe you're right that in this particular example both XML strings are the same, but only because the NCR I used represents a printable character.  A non-printable character must be represented as its NCR.  There's no way to craft a node whose value contains an NCR, and as such, no way to create an element whose value contains non-printable characters.
 [2007-01-19 06:17 UTC] chregu@php.net
$dom = new DOMDocument( "1.0", 'UTF-8' );
$dom->appendChild( $dom->createElement( 'root','&#13;' ));
var_dump( trim( $dom->saveXML() ) );

prints out

string(57) "<?xml version="1.0" encoding="UTF-8"?>
<root>&#13;</root>"


Anyway, this is really not a PHP issue (if it is one at 
all), but a libxml2 issue. If you still think this is a bug, 
report it there.



 