  [2021-01-25 10:18 UTC] andrey at email dot dp dot ua
Description:
------------
When a DOMDocument is cloned, some properties are not copied correctly, and the saveHTML() method of the clone gives different results from the same method on the original object.

Called with documentElement as its argument, saveHTML() on the clone returns output in which characters are converted to numeric character references. Called without arguments, it returns the correct result.

Properties that are corrupted during cloning
---------------------------------------------
$DOMDocument->nodeType
    If the original object has nodeType XML_HTML_DOCUMENT_NODE, the clone has XML_DOCUMENT_NODE.
$DOMDocument->baseURI
    The value is lost during cloning.
$DOMDocument->version
    If not set on the original object, it is set to 1.0 on the clone.
$DOMDocument->xmlVersion
    If not set on the original object, it is set to 1.0 on the clone.

Methods that give a different result on the cloned object
-------------------------------------------------
Carriage-return characters in the original document are returned unchanged by $DOMDocument->saveHTML(), but are replaced with &#13; when $DOMDocument->saveHTML($DOMDocument->documentElement) is called on a cloned object.

Test script:
---------------
<?php
$html = "<html><head><base href='https://php.net'></head><body>\r</body></html>";
$dom = new DOMDocument();
$dom->loadHTML($html);

$arr = array(
    'DOMDocument' => $dom,
    'cloned by clone' => clone $dom,
    'cloned by cloneNode' => $dom->cloneNode(true)
);

foreach ($arr as $descr=>$obj) {
    echo $descr.":\n";
    echo "--------------------------\n";
    echo "saveHTML:\n";
    echo $obj->saveHTML()."\n\n";
    echo "saveHTML via DOMDocument::documentElement:\n";
    echo $obj->saveHTML($obj->documentElement)."\n\n";
    echo "\$DOMDocument->nodeType = ".$obj->nodeType."\n";
    echo "\$DOMDocument->baseURI = ".$obj->baseURI."\n";
    echo "\$DOMDocument->version = ".$obj->version."\n";
    echo "\$DOMDocument->xmlVersion = ".$obj->xmlVersion."\n\n\n";
}

Expected result:
----------------
[ three times ]

saveHTML:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><head><base href="https://php.net"></head><body> </body></html>

saveHTML(documentElement):
<html><head><base href="https://php.net"></head><body> </body></html>

$DOMDocument->nodeType = 13
$DOMDocument->baseURI = https://php.net
$DOMDocument->version =
$DOMDocument->xmlVersion =

Actual result:
--------------
DOMDocument:
--------------------------
saveHTML:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
</body></html>ase href="https://php.net"></head><body>

saveHTML(documentElement):
</body></html>ase href="https://php.net"></head><body>

$DOMDocument->nodeType = 13
$DOMDocument->baseURI = https://php.net
$DOMDocument->version =
$DOMDocument->xmlVersion =


cloned by clone:
--------------------------
saveHTML:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
</body></html>ase href="https://php.net"></head><body>

saveHTML(documentElement):
<html><head><base href="https://php.net"></head><body>&#13;</body></html>

$DOMDocument->nodeType = 9
$DOMDocument->baseURI =
$DOMDocument->version = 1.0
$DOMDocument->xmlVersion = 1.0


cloned by cloneNode:
--------------------------
saveHTML:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
</body></html>ase href="https://php.net"></head><body>

saveHTML(documentElement):
<html><head><base href="https://php.net"></head><body>&#13;</body></html>

$DOMDocument->nodeType = 9
$DOMDocument->baseURI =
$DOMDocument->version = 1.0
$DOMDocument->xmlVersion = 1.0
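
As long as a clone does not behave like the original, one possible workaround is to build the copy by serializing the original and parsing it again with loadHTML(), instead of using clone or cloneNode(true). The sketch below is not taken from the report or from any upstream fix. The helper name copyHtmlDocument is hypothetical, and it assumes that the document was originally loaded with loadHTML() and that a serialize-and-reparse round trip is acceptable (node identity and any state that saveHTML() does not serialize are not preserved).

```php
<?php
// Hypothetical helper (not part of the DOM extension): copy an HTML
// document by serializing it and parsing the markup again, so that the
// copy is created by the HTML parser rather than by clone.
function copyHtmlDocument(DOMDocument $source): DOMDocument
{
    $copy = new DOMDocument();

    // Collect parser warnings internally instead of emitting them,
    // then restore the caller's previous setting.
    $previous = libxml_use_internal_errors(true);
    $copy->loadHTML($source->saveHTML());
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $copy;
}

$dom = new DOMDocument();
$dom->loadHTML("<html><head><base href='https://php.net'></head><body>\r</body></html>");

$copy = copyHtmlDocument($dom);

// Compare two of the properties the report lists as corrupted by cloning.
var_dump($copy->nodeType === $dom->nodeType);
var_dump($copy->baseURI === $dom->baseURI);
```

The round trip costs a full serialization and parse, so it is only a reasonable trade-off for documents where an exact property match matters more than speed.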
Hello,

I can confirm that the request was merged into the master branch of libxml2.

"does this need to handle the other discrepancies between the original and cloned nodes mentioned in the PHP bug report?"
-------------------------------------------------------------------
@Philip, yes, it does.

"What is it DOMDocument after some document was loaded by loadXML?"
-------------------------------------------------------------------
@Andrey, I know that the XML declaration (<?xml version="1.0"?>) is not part of the DOM document. The purpose of the declaration is to prepare the agent to read the document.

"What is the DOMDocument? What is describes? What it corresponds with?"
-----------------------------------------------------------------------
The implementation of the DOMDocument extension tells us that it is the internal tree structure of libxml. DOMDocument is therefore an XML representation.

For example, here is what DOMDocument is not:

// file "model-verbose.xml"
<?xml version="1.0"?>
<object type="Document">
    <property name="lang">fr</property>
    <property name="author">serge</property>
    <property name="update">09/02/21</property>
    <child>
        <object type="label">
            <property name="lang">en</property>
            <property name="text">Hello World !</property>
        </object>
    </child>
</object>

// file "model.xml"
<document lang="fr" author="serge" update="09/02/21">
    <label>Hello World !</label>
</document>

// file "model.json"
{
    "document": {
        "properties": {
            "lang": "fr",
            "author": "Serge",
            "update": "09/02/21"
        },
        "children": [
            { "label": { "content": "Hello World !" } }
        ]
    }
}

$dom_v = My\Ext\Dom\Document::load("model-verbose.xml", $verbose);
$dom = My\Ext\Dom\Document::load("model.xml");
$dom_j = My\Ext\Dom\Document::loadJson("model.json");

if($dom_v->root["lang"] == $dom->root["lang"]) {
    echo 'Same access, same Object Model' . PHP_EOL;
}

if($dom_v->root->children[0]["text"] == $dom->root->children[0]["text"]) {
    echo 'Same access, same Object Model' . PHP_EOL;
}

It is not the same XML (representation), but it is the same mydocument object model.

Kind regards,
Serge