Bug #41204 DOMDocument::loadXML() fails when entities included in string
Submitted: 2007-04-26 16:42 UTC
Modified: 2007-04-28 21:28 UTC
From: jazzslider at gmail dot com
Assigned: rrichards
Status: Not a bug
Package: DOM XML related
PHP Version: 5.2.1
OS: Linux
Private report: No
CVE-ID: None
 [2007-04-26 16:42 UTC] jazzslider at gmail dot com
Description:
------------
When loadXML() is used to populate a DOMDocument object from a pre-existing XML string, any entity reference in that string causes the method to fail. There does not seem to be a way to explicitly define such entity references in the document before calling loadXML(), and no definition seems to be necessary when the same document is created directly via DOMDocument methods.

Also note: I am using PHP 5.2.0-8, and cannot upgrade to 5.2.1 since my site is on a shared server.

Reproduce code:
---------------
<h1>DOMDocument Entity Test</h1>
<h2>Created Via DOM Methods</h2>
<div>
<?php
$dd1 = new DOMDocument();
$dd1_root = $dd1->createElement('span', 'This is the sample span.');
$dd1_root = $dd1->appendChild($dd1_root);
$dd1_nbsp = $dd1->createEntityReference('nbsp');
$dd1_nbsp = $dd1_root->appendChild($dd1_nbsp);
$str = $dd1->saveXML();
echo $str;
?>
</div>
<h2>Loaded From String</h2>
<?php
$dd2 = new DOMDocument();
$dd2->loadXML($str);
echo $dd2->saveXML();
?>

Expected result:
----------------
The content of the "Created Via DOM Methods" section should be identical to the content of the "Loaded From String" section.

Actual result:
--------------
With warnings enabled, the following message appears:

"Warning: DOMDocument::loadXML() [function.DOMDocument-loadXML]: Entity 'nbsp' not defined in Entity, line: 2 in ..."

 [2007-04-27 14:24 UTC] jazzslider at gmail dot com
Further observation:

The problem is solved partially by adding a custom DTD reference to the beginning of the string (after the <?xml...?> processing instruction) containing mainly entity declarations.
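The workaround described above might look like the following minimal sketch. It assumes an internal DTD subset that declares only nbsp (mapped to the character reference &#160;, as it is conventionally understood in HTML); the variable names are illustrative and do not come from the original report.

```php
<?php
// Assumption: an internal DTD subset declaring only the nbsp entity.
$xml = <<<'XML'
<?xml version="1.0"?>
<!DOCTYPE span [
  <!ENTITY nbsp "&#160;">
]>
<span>This is the sample span.&nbsp;</span>
XML;

$doc = new DOMDocument();

// With the entity declared, the parser can resolve the reference,
// so loadXML() no longer reports "Entity 'nbsp' not defined".
$ok = $doc->loadXML($xml);

// No LIBXML_NOENT flag is passed, so the reference is kept as an
// entity reference node and is written back out as &nbsp;.
$out = $doc->saveXML();
echo $out;
```

Passing LIBXML_NOENT as the second argument to loadXML() would instead substitute the declared entities while parsing, which is the opposite of the behavior wanted here.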

When this is done, the only unexpected behavior left is that the &quot; entity, if the XML string contains it, WILL be substituted. For example, the XML string:

===============================================================
<...PI and DOCTYPE...><span>&quot;Hello!&quot; This &amp; that are&nbsp;my very favorite things.</span>
===============================================================

will, after loadXML()ing it into a new DOMDocument(), cause saveXML() to produce the following output:

===============================================================
<...PI and DOCTYPE...><span>"Hello!" This &amp; that are&nbsp;my very favorite things.</span>
===============================================================

(this is all assuming, of course, that the DOCTYPE declaration includes all of the entities in the string in keeping with how they are conventionally understood in HTML)

It is strange that the &quot; entity should be automatically substituted while none of the others are.  It would make more sense to me if NONE of the entities were substituted; that's the behavior I'm looking for.
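A likely explanation for the difference: &quot; is one of the five entities predefined by XML itself (&lt;, &gt;, &amp;, &apos; and &quot;), so the parser resolves it to a plain character without consulting any DTD, and the serializer only re-escapes characters that must be escaped in text content. The sketch below appears to show this without any DOCTYPE at all; the variable names are illustrative.

```php
<?php
$doc = new DOMDocument();

// No DOCTYPE is needed: &quot; and &amp; are predefined by XML.
$doc->loadXML('<span>&quot;Hello!&quot; This &amp; that</span>');

// Both references have already become ordinary characters in the text node.
$text = $doc->documentElement->textContent;

// On output, & must be escaped again inside text content, but " need not be,
// so only &amp; reappears in the serialized element.
$out = $doc->saveXML($doc->documentElement);

echo $text, "\n";
echo $out, "\n";
```

In other words, &amp; is not being preserved either; it is resolved on load and escaped again on save, which only looks like preservation.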
 [2007-04-27 15:41 UTC] jazzslider at gmail dot com
Also...the fact that adding a DOCTYPE makes it work better when loading XML from a string doesn't explain why no DOCTYPE is needed when the entity reference is inserted using createEntityReference() and appendChild().
 [2007-04-28 21:28 UTC] rrichards@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

createEntityReference() does not verify that the entity has already been defined in a DTD. Loading (parsing) an XML document, however, does perform that check, so a DTD (either internal or external) is required. As a result, it is possible to create a document that cannot be loaded correctly. The behavior is correct and has been verified against other DOM implementations to be sure.
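The asymmetry described above can be observed with a short sketch along the lines of the original reproduce code. It uses libxml_use_internal_errors() to collect the parser message instead of emitting a warning; the variable names are illustrative.

```php
<?php
// Collect libxml errors instead of raising PHP warnings.
libxml_use_internal_errors(true);

// Building the tree: createEntityReference() accepts 'nbsp' without
// checking whether any DTD declares it.
$built = new DOMDocument();
$root  = $built->appendChild($built->createElement('span', 'This is the sample span.'));
$root->appendChild($built->createEntityReference('nbsp'));
$str = $built->saveXML();

// Parsing the serialized string: the parser does check the reference,
// and the string contains no DTD that declares nbsp.
$loaded = new DOMDocument();
$ok     = $loaded->loadXML($str);
$errors = libxml_get_errors();
libxml_clear_errors();

echo $str;
var_dump($ok);
foreach ($errors as $error) {
    echo trim($error->message), "\n";
}
```

The first document serializes with &nbsp; in place, while parsing that same string is expected to report the undefined entity, matching the warning quoted in the report.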
 