Bug #45824 loadXML moves entity references from attribute value to before element
Submitted: 2008-08-14 17:59 UTC Modified: 2008-08-16 02:38 UTC
From: marc at mongenet dot ch Assigned:
Status: Not a bug Package: DOM XML related
PHP Version: 5.2.6 OS: Linux
Private report: No CVE-ID: None
 [2008-08-14 17:59 UTC] marc at mongenet dot ch
Description:
------------
When an attribute value contains an entity reference (like a="&eacute;"), loadXML() moves the reference out of the attribute and places it just before the owner element.

Reproduce code:
---------------
$doc = new DOMDocument();
$xml = '<!DOCTYPE e PUBLIC "1" "2">'."\n". # DOCTYPE just to appear well-formed
       '<e><e a="&eacute;"/></e>'."\n";
$doc->loadXML($xml);
echo '<pre>', htmlspecialchars($doc->saveXML()), '</pre>';


Expected result:
----------------
<?xml version="1.0"?>
<!DOCTYPE e PUBLIC "1" "2">
<e><e a="&eacute;"/></e>


Actual result:
--------------
<?xml version="1.0"?>
<!DOCTYPE e PUBLIC "1" "2">
<e>&eacute;<e a=""/></e>


 [2008-08-15 06:34 UTC] chregu@php.net
You should read the warning that it produces:

Warning: DOMDocument::loadXML(): Entity 'eacute' not defined in Entity, 
line: 2 in /Users/chregu/tmp/foo.php on line 6

eacute is not an entity that is defined by default (there are only five 
predefined entities: &lt;, &gt;, &amp;, &apos; and &quot;)
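
The warning above is easy to miss when display of warnings is turned off. As a minimal sketch, assuming the same input as the reproduce code, libxml's internal error buffer can be used to collect the parser diagnostics and inspect them in code. The variable names ($previous, $messages) are illustrative only, and the exact message text depends on the libxml2 version in use.

```php
<?php
// Collect libxml diagnostics in a buffer instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$xml = '<!DOCTYPE e PUBLIC "1" "2">' . "\n" .   // same input as the reproduce code
       '<e><e a="&eacute;"/></e>' . "\n";
$doc->loadXML($xml);

// Each buffered entry is a LibXMLError with level, code, line and message.
$messages = [];
foreach (libxml_get_errors() as $error) {
    $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
}

// Empty the buffer and restore the previous error-handling mode.
libxml_clear_errors();
libxml_use_internal_errors($previous);

print_r($messages);
```

Checking this buffer after every loadXML() call is one way to decide whether the resulting DOM should be trusted at all.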
 [2008-08-16 02:38 UTC] marc at mongenet dot ch
In the Extensible Markup Language (XML) 1.0 (Fourth Edition) W3C Recommendation, chapter 4.1 Character and Entity References, it is written:
"Note that non-validating processors are not obligated to read and process entity declarations occurring in parameter entities or in the external subset; for such documents, the rule that an entity must be declared is a well-formedness constraint only if standalone='yes'."

What we have in this example is "Entity 'eacute' not defined in Entity", i.e. it is not defined in the parsed data. That is because it would be defined in the external subset. Okay, I admit I didn't write an external subset, but it makes no difference: the XML processor does not try to read it, because I haven't set $doc->resolveExternals = TRUE.

The XML processor should either stop on a fatal error or produce a correct DOM (that's a general rule for XML processors). But producing a wrong DOM is a no-no. BTW, if the &eacute; entity reference appears in text content instead of an attribute value, then the DOM is built correctly.
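
For readers who hit the same behaviour, two workarounds follow from the discussion above: declare the entity in the internal subset so the parser knows it, or avoid the named entity and use a numeric character reference. The sketch below is a hedged illustration of both, not an official fix; the internal-subset declaration and the variable names are assumptions introduced for the example.

```php
<?php
$doc = new DOMDocument();

// Variant 1: declare eacute in the internal subset so the reference resolves.
$declared = '<!DOCTYPE e [<!ENTITY eacute "&#233;">]>' . "\n" .
            '<e><e a="&eacute;"/></e>';
$doc->loadXML($declared);
$valueDeclared = $doc->documentElement->firstChild->getAttribute('a');

// Variant 2: use a numeric character reference, which needs no declaration.
$numeric = '<e><e a="&#233;"/></e>';
$doc->loadXML($numeric);
$valueNumeric = $doc->documentElement->firstChild->getAttribute('a');

// Compare both attribute values with U+00E9 encoded as UTF-8.
var_dump($valueDeclared === "\u{E9}", $valueNumeric === "\u{E9}");
```

In both variants the attribute value stays on the inner element, because the parser never meets an undeclared entity.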