  [2005-09-27 21:38 UTC] bugs dot php dot net at webdevelopers dot cz
 Description:
------------
HTML imported using DOMDocument::loadHTML() is not saved as valid XML using DOMDocument::saveXML()
Reproduce code:
---------------
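<?php
header('Content-Type: text/plain; charset=UTF-8');
$d=new DOMDocument('1.0', 'UTF-8');
// Import HTML whose <script> body is wrapped in an HTML comment containing "--"
if ($d->loadHTML('<html><body><script><!-- var a=1; a--; --></script></body></html>')) {
  echo "loadHTML: OK\n";
  // Round trip: serialize as XML, then parse that output again
  echo $d->loadXML($d->saveXML());
}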
Expected result:
----------------
loadHTML: OK
1
Actual result:
--------------
loadHTML: OK
<br />
<b>Warning</b>:  DOMDocument::loadXML() [<a href='function.loadXML'>function.loadXML</a>]: Comment not terminated 
<!-- var a=1; a in Entity, line: 3 in <b>test.php</b> on line <b>9</b><br />
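
The warning above appears to come down to an XML rule: the string "--" may not occur inside a comment, so the `a--;` in the script body ends up inside something the XML parser reads as a malformed comment. The following is only a minimal sketch of that rule, not part of the original report. It assumes PHP 7 or later (for the scalar type declarations), and the helper name `isWellFormedXml` is hypothetical.

```php
<?php
// Hypothetical helper (not from the report): true when $xml parses as well-formed XML.
function isWellFormedXml(string $xml): bool
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml);

    // Restore the previous error-handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok;
}

// Comment without "--" in its body.
var_dump(isWellFormedXml('<script><!-- var a=1; --></script>'));
// Comment with "--" in its body, as in the reproduce code.
var_dump(isWellFormedXml('<script><!-- var a=1; a--; --></script>'));
```

The second call is expected to report the document as not well-formed, which matches the "Comment not terminated" warning in the actual result.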
The bug is that I have a DOM document that was imported from HTML (actually, I only have the object that handles the DOM, and I may not know how it was created). When I have a valid DOM document, I expect (or thought I could expect) that $d->saveXML() produces valid XML and that the result can be loaded again with loadXML() without problems. This is not true. The following function may fail under certain circumstances when the DOMDocument was created using loadHTML():

function test(DOMDocument $d) {
  echo DOMDocument::loadXML($d->saveXML());
}

I'm not sure, but isn't it confusing when the programmer cannot rely on the output of saveXML() being compatible with loadXML()?

---

Yesterday, when I was playing with loadHTML() and then using saveXML(), it even produced XML like this:

<html xmlns="someNS" xmlns="someNS">...</html>

It has a duplicate xmlns attribute, which is not well-formed XML, so saveXML() is more like saveSometimesXML(). Apparently, when the HTML is imported, @xmlns is treated as an ordinary attribute, and on saving a namespace declaration is added alongside the existing @xmlns attribute.

It looks like saveXML() does not produce XML despite the name of the function. I think it is a bug. Don't you think? Or is this intended behavior, and the function's name is just confusing me?
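
For code that receives a DOMDocument of unknown origin, as described above, one defensive option is to run the saveXML()/loadXML() round trip with libxml's internal error collection switched on and inspect the messages. This is only a sketch under stated assumptions: it needs PHP 7 or later, the helper name `roundTripErrors` is hypothetical, and it diagnoses the problem rather than fixing the serialization.

```php
<?php
// Hypothetical helper (not from the report): serialize with saveXML(), re-parse
// with loadXML(), and return the libxml error messages that were raised.
// An empty array means the round trip succeeded.
function roundTripErrors(DOMDocument $d): array
{
    $xml = $d->saveXML();
    if ($xml === false) {
        return ['saveXML() failed'];
    }

    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $copy = new DOMDocument();
    $copy->loadXML($xml);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    // Restore the previous error-handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

// Usage with a document built from well-formed XML.
$d = new DOMDocument('1.0', 'UTF-8');
$d->loadXML('<root><item>text</item></root>');
print_r(roundTripErrors($d));
```

Passing a document created with loadHTML() instead is the case where the report sees errors. What the helper returns there will likely depend on the PHP and libxml2 versions in use.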