Bug #42391 DOM ignores several valid dtd entities.
Submitted: 2007-08-22 21:45 UTC
Modified: 2007-08-23 18:49 UTC
From: linus dot martensson at elplan-gm dot se
Assigned:
Status: Not a bug
Package: DOM XML related
PHP Version: 5.2.3
OS: Linux - Ubuntu Feisty Fawn
Private report: No
CVE-ID: None
 [2007-08-22 21:45 UTC] linus dot martensson at elplan-gm dot se
Description:
------------
The DOM parser fails to parse SEVERAL valid xhtml entities, such as &raquo; and &rArr;, even though both are specified in http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent and http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent.
These two files (among others) are referred to by the specified doctype definition, xhtml1-strict.dtd.
The parser is obviously not taking all valid xhtml entities into account, which is a serious problem.

Reproduce code:
---------------
<?php 
$d = new DOMDocument();
if(!$d->loadXML('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html><head></head><body>&#8658;&rArr;</body></html>')) var_dump(libxml_get_last_error());

Expected result:
----------------
No output; the document should parse correctly and both entities should be stored in the DOMDocument.

Actual result:
--------------
When the libxml error is retrieved, it reads:
Line 1: Entity 'rArr' not defined. The parse is aborted.

 [2007-08-22 22:01 UTC] linus dot martensson at elplan-gm dot se
Retouched the summary.
 [2007-08-22 22:49 UTC] rrichards@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

You need to pass the LIBXML_DTDLOAD option to loadXML, as external subsets are not loaded by default.
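
A minimal sketch of that suggestion, under some assumptions. To avoid fetching the W3C DTD over the network, it writes a tiny stand-in DTD (hypothetical, declaring only rArr) to a temporary file and points the DOCTYPE at it with a SYSTEM identifier. The variable names are illustrative. With the real XHTML doctype from the report the same flag should apply, but loadXML would then have to download the DTD and the entity files it references.

```php
<?php
// Hypothetical stand-in for the external DTD: it declares only the rArr entity.
$dtdPath = tempnam(sys_get_temp_dir(), 'dtd');
file_put_contents($dtdPath, '<!ENTITY rArr "&#8658;">');

$xml = '<!DOCTYPE html SYSTEM "' . $dtdPath . '">'
     . '<html><head></head><body>&#8658;&rArr;</body></html>';

// Collect libxml errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);
libxml_clear_errors();

$d = new DOMDocument();
// LIBXML_DTDLOAD asks libxml to read the external subset named in the DOCTYPE.
$loaded = $d->loadXML($xml, LIBXML_DTDLOAD);
$errors = libxml_get_errors();

// Print whatever libxml recorded, in the same "Line N: message" shape as the report.
foreach ($errors as $error) {
    printf("Line %d: %s\n", $error->line, trim($error->message));
}

// Remove the temporary stand-in DTD.
unlink($dtdPath);
```

Dropping the second argument to loadXML leaves the external subset unread, which is the situation that produced the "Entity 'rArr' not defined" error in the original report.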
 [2007-08-23 18:49 UTC] linus dot martensson at elplan-gm dot se
Then, may I recommend clarifying the documentation? The &nbsp; entity was, for example, handled without that extra option, and it is in the SAME file as one of the entities that fail to load! *This* is why I found the behaviour strange and decided to report the bug. Would you please clarify such behaviour?


Linus
 