Request #43260 DOM doesn't produce valid XHTML.
Submitted: 2007-11-12 13:22 UTC Modified: 2021-03-21 04:22 UTC
From: dhopkins at mutinydesign dot co dot uk
Assigned: cmb
Status: No Feedback
Package: DOM XML related
PHP Version: 5.2CVS-2007-11-12 (snap)
OS: Ubuntu Server
Private report: No
CVE-ID: None
 [2007-11-12 13:22 UTC] dhopkins at mutinydesign dot co dot uk
Description:
------------
This issue is similar to one posted earlier ( http://bugs.php.net/bug.php?id=31130 ).

Basically, I want to use the DOM functions to return valid XHTML - with trailing slashes - but the saveHTML method returns HTML 4.0 loose. As was pointed out in the above post, you can just use the saveXML method to return valid XHTML. However, there is a big problem with this.

If you have an XHTML document that starts with the XML declaration <?xml?>, Internet Explorer 6 doesn't read the HTML DTD. The result is a skewed document: various CSS attributes are not supported, etc. You are basically trying to display an HTML document with an XML doctype.

Currently, I am just trimming the XML declaration off the document, which I am sure you can appreciate is not a very good solution, but it's the only option.
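
For reference, the trimming approach described above could look roughly like the following. This is only a sketch: `stripXmlDeclaration()` is a hypothetical helper name, not something provided by the DOM extension, and the regular expression assumes the declaration is the first thing in the serialized string.

```php
<?php
// Hypothetical helper (not part of the DOM extension): removes a leading
// XML declaration, and the whitespace after it, from a serialized document.
function stripXmlDeclaration(string $xml): string
{
    return preg_replace('/^\s*<\?xml\b[^>]*\?>\s*/', '', $xml, 1);
}

// Same document as in the reproduce code of this report.
$doc = new DOMDocument();
$input = $doc->createElement('input');
$input->setAttribute('type', 'checkbox');
$input->setAttribute('checked', 'checked');
$doc->appendChild($input);

// saveXML() with no arguments serializes the whole document,
// including the XML declaration, which is then trimmed off.
$trimmed = stripXmlDeclaration($doc->saveXML());
echo $trimmed;
```

String surgery like this works on the serialized output only, so it stays outside the DOM API entirely, which is why it feels like a stopgap.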

Is it possible to add a feature that will return XML without the XML declaration? Or is this really something I need to take up with LibXML? There are quite a few people moaning about this issue, but so far no solution has been provided.

Reproduce code:
---------------
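// Build a document containing a single checked checkbox.
$doc = new DOMDocument();
$input = $doc->createElement('input');
$input->setAttribute('type', 'checkbox');
$input->setAttribute('checked', 'checked');
$doc->appendChild($input);
// Serialize the whole document; this includes the XML declaration.
echo $doc->saveXML();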

Expected result:
----------------
<input type="checkbox" checked="checked" />

Actual result:
--------------
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<input type="checkbox" checked />

 [2007-11-12 14:49 UTC] rrichards@php.net
IE6 isn't fully XHTML compliant, and the XML declaration is perfectly valid.

Assigning to self, as support for save options (including the ability to suppress the declaration) is currently in progress. It is not yet fully integrated into the DOM extension, as this requires a fairly recent libxml2 to work.
 [2007-11-18 17:31 UTC] missingno at ifrance dot com
In fact, there's already a constant (both in PHP & libxml >= 2.6.21) to drop the XML declaration, but it seems it can't be used in any of the save*() methods of the DOM extension.

Definition of the PHP constant (LIBXML_NOXMLDECL):
http://cvs.php.net/viewvc.cgi/php-src/ext/libxml/libxml.c?annotate=1.65#l630

Implementation of DOM's SaveXML (dom_document_savexml)
http://cvs.php.net/viewvc.cgi/php-src/ext/dom/document.c?annotate=1.88#l1681

The LIBXML_NOXMLDECL can be used when loading XML data from a file or a string (see DOM Document's load()/loadXML()).
Maybe this constant should also be available in the options passed to save()/saveXML().

However, you might be able to achieve the same effect by:
- loading your HTML content in a DOMDocument,
- loading a dummy XML file/string (such as "<html />", passing LIBXML_NOXMLDECL as an option to the load method) in a second DOMDocument (we will refer to it as "the XML DOMDocument")
- using importNode($node, TRUE) to deep-copy & import the HTML content in the XML DOMDocument
- use save()/saveXML() on the XML DOMDocument to dump the resulting tree
I didn't test this workaround (yet), though.

Also, please note that serving XML as HTML is often considered harmful for the Internet. There are also many concerns regarding Appendix C of the XHTML 1.0 Spec (dealing with "HTML-Compatible XHTML markup").

Hope this helps
 [2007-11-18 17:35 UTC] missingno at ifrance dot com
Yet another possible workaround (also untested):

When you're done preparing your markup:
- use DOMDocument->saveXML() to dump the tree to a string,
- create another DOMDocument, use loadXML() with LIBXML_NOXMLDECL on that string
- finally, dump this new DOMDocument's tree to a file/string using the usual save/saveXML methods.
 [2010-12-20 14:23 UTC] jani@php.net
-Package: Feature/Change Request
+Package: DOM XML related
 [2013-10-10 07:47 UTC] datibbaw@php.net
If you want to skip the XML declaration, just pass a node as the argument to `->saveXML`, e.g.

 echo $doc->saveXML($doc->firstChild);
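
A slightly fuller sketch of this suggestion, using the document from the original report. It uses `documentElement` instead of `firstChild`; that is a choice made for this example, on the assumption that the document may carry a DOCTYPE or a leading comment, in which case `firstChild` would not be the root element.

```php
<?php
// Same document as in the reproduce code of this report.
$doc = new DOMDocument();
$input = $doc->createElement('input');
$input->setAttribute('type', 'checkbox');
$input->setAttribute('checked', 'checked');
$doc->appendChild($input);

// With no argument, the whole document is serialized,
// starting with the XML declaration.
$whole = $doc->saveXML();

// With a node argument, only that node's subtree is serialized,
// so no XML declaration is emitted.
$fragment = $doc->saveXML($doc->documentElement);
echo $fragment;
```

Note that serializing only the root element also leaves out any DOCTYPE, so a full XHTML page would need that node written separately.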
 [2017-10-24 06:15 UTC] kalle@php.net
-Status: Assigned
+Status: Open
-Assigned To: rrichards
+Assigned To:
 [2021-03-12 12:50 UTC] cmb@php.net
-Status: Open
+Status: Feedback
-Assigned To:
+Assigned To: cmb
 [2021-03-12 12:50 UTC] cmb@php.net
This feature request appears to be obsolete, doesn't it?
 [2021-03-21 04:22 UTC] php-bugs at lists dot php dot net
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Re-Opened". Thank you.
 