Bug #45824 loadXML moves entity references from attribute value to before element
Submitted: 2008-08-14 17:59 UTC
Modified: 2008-08-16 02:38 UTC
From: marc at mongenet dot ch
Assigned:
Status: Not a bug
Package: DOM XML related
PHP Version: 5.2.6
OS: Linux
Private report: No
CVE-ID: None
 [2008-08-14 17:59 UTC] marc at mongenet dot ch
Description:
------------
When an attribute value contains an entity reference (like a="&eacute;"), loadXML() moves the entity reference out of the attribute and places it just before the owner element.

Reproduce code:
---------------
$doc = new DOMDocument();
$xml = '<!DOCTYPE e PUBLIC "1" "2">'."\n". # DOCTYPE just to appear well-formed
       '<e><e a="&eacute;"/></e>'."\n";
$doc->loadXML($xml);
echo '<pre>', htmlspecialchars($doc->saveXML()), '</pre>';


Expected result:
----------------
<?xml version="1.0"?>
<!DOCTYPE e PUBLIC "1" "2">
<e><e a="&eacute;"/></e>


Actual result:
--------------
<?xml version="1.0"?>
<!DOCTYPE e PUBLIC "1" "2">
<e>&eacute;<e a=""/></e>


 [2008-08-15 06:34 UTC] chregu@php.net
You should read the warning, it produces:

Warning: DOMDocument::loadXML(): Entity 'eacute' not defined in Entity, 
line: 2 in /Users/chregu/tmp/foo.php on line 6

eacute is not an entity that is defined by default (there are only 5 of 
them: lt, gt, amp, apos and quot)
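
To make the point about predefined entities concrete, here is a minimal sketch that collects libxml's diagnostics instead of letting them surface as PHP warnings. The helper name collectLibxmlMessages() is hypothetical and not part of the report. Note that the inputs below carry no DOCTYPE, so the undeclared reference is a plain well-formedness error here, which is a simpler situation than the one in the reproduce code.

```php
<?php
// Hypothetical helper (not from the report): parse $xml and return the
// messages libxml recorded, rather than emitting them as PHP warnings.
function collectLibxmlMessages(string $xml): array
{
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    // Restore the previous error-handling mode for the rest of the script.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

// The five predefined entities need no declaration.
$predefined = collectLibxmlMessages('<e a="&lt;&gt;&amp;&apos;&quot;"/>');

// &eacute; is not predefined, and nothing declares it in this input.
$undeclared = collectLibxmlMessages('<e a="&eacute;"/>');

print_r($predefined);
print_r($undeclared);
```

Checking the collected messages after every loadXML() call is one way to avoid overlooking a warning like the one quoted above.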
 [2008-08-16 02:38 UTC] marc at mongenet dot ch
In the Extensible Markup Language (XML) 1.0 (Fourth Edition) W3C Recommendation, chapter 4.1 Character and Entity References, it is written:
"Note that non-validating processors are not obligated to to read and process entity declarations occurring in parameter entities or in the external subset; for such documents, the rule that an entity must be declared is a well-formedness constraint only if standalone='yes'."

What we have in this example is "Entity 'eacute' not defined in Entity", i.e. it is not defined in the parsed data. That's because it is defined in the external subset. Okay, I admit I didn't write an external subset, but it makes no difference, because the XML processor does not try to read it: I haven't set $doc->resolveExternals = TRUE.

The XML processor should either stop on a fatal error or produce a correct DOM (that's a general rule for XML processors). But producing a wrong DOM is a no-no. BTW, if the &eacute; entity reference appears in text content instead of an attribute value, then the DOM is built correctly.
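
For readers who hit the same behaviour, one possible way to sidestep the question is to declare the entity in the internal subset, so the parser never has to guess about an unread external subset. This is only a sketch of a workaround under that assumption, not a fix for the reported behaviour; the replacement text "&#233;" (the character é) and the variable names are choices made for this example.

```php
<?php
// Workaround sketch: declare eacute in the internal subset so the
// reference in the attribute value can be resolved by the parser.
$xml = '<!DOCTYPE e [<!ENTITY eacute "&#233;">]>' . "\n" .
       '<e><e a="&eacute;"/></e>' . "\n";

$previous = libxml_use_internal_errors(true);
libxml_clear_errors();

$doc = new DOMDocument();
$loaded = $doc->loadXML($xml);

$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

// The inner <e> is the first child of the root; read its attribute value.
$inner = $doc->documentElement->firstChild;
$value = $inner->getAttribute('a');

echo htmlspecialchars($doc->saveXML()), "\n";
```

Writing the attribute as a="&#233;" in the source is another option when the entity name itself does not need to survive a round trip.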
 