Bug #66712 Using LIBXML_HTML_NOIMPLIED on DomDocument::loadHTML() gives unexpected results
Submitted: 2014-02-14 01:23 UTC Modified: 2015-04-15 14:46 UTC
Avg. Score:4.1 ± 0.9
Reproduced:7 of 7 (100.0%)
Same Version:1 (14.3%)
Same OS:1 (14.3%)
From: chanson at mesd dot k12 dot or dot us Assigned:
Status: Verified Package: DOM XML related
PHP Version: 5.5.9 OS: Fedora 20 x86/64
Private report: No CVE-ID: None


 [2014-02-14 01:23 UTC] chanson at mesd dot k12 dot or dot us
Using the LIBXML_HTML_NOIMPLIED predefined constant in the DomDocument class has unexpected results.

When the optional LIBXML_HTML_NOIMPLIED predefined constant is passed to the loadHTML() method, the nodeValue of the first item in a DOMNodeList contains the concatenated values of every item in the list. This only happens when that constant is passed.

I am currently running:
- PHP Version 5.5.8
- libxml Version 2.9.1

Test script:
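$html = '<h1>Foo</h1><h2>Bar</h2><p>lorem ipsum</p>';
$dom = new \DomDocument();
// Parse the fragment without adding implied <html>/<body> elements.
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED);
$nodes = $dom->getElementsByTagName('*');
// Print the first item on its own, then every item in the list.
echo $nodes->item(0)->tagName . ' -> ' . $nodes->item(0)->nodeValue . '<br/>';
foreach ($nodes as $node) {
    echo $node->tagName . ' -> ' . $node->nodeValue . '<br/>';
}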

Expected result:
h1 -> FooBarlorem ipsum
h1 -> FooBarlorem ipsum
h2 -> Bar
p -> lorem ipsum

Actual result:
h1 -> Foo
h1 -> Foo
h2 -> Bar
p -> lorem ipsum


 [2015-04-15 14:41 UTC]
-Summary: Using LIBXML_NOHTML_IMPLIED on DomDocument::loadHTML() gives unexpected results
+Summary: Using LIBXML_HTML_NOIMPLIED on DomDocument::loadHTML() gives unexpected results
-Status: Open
+Status: Feedback
-Assigned To:
+Assigned To: cmb
 [2015-04-15 14:41 UTC]
I am not able to reproduce this behavior, see
<>. Can you confirm?
 [2015-04-15 14:46 UTC]
-Status: Feedback
+Status: Verified
-Assigned To: cmb
+Assigned To:
 [2015-04-15 14:46 UTC]
Well, of course the behavior is reproducible -- only the expected
and actual behavior sections in the report are mixed up.
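
One way to inspect the reported behavior is to print each element's parent and serialize the tree after loading with LIBXML_HTML_NOIMPLIED. The following is only a minimal diagnostic sketch using the fragment from the test script; the exact nesting it prints may depend on the libxml version in use.

```php
<?php
$html = '<h1>Foo</h1><h2>Bar</h2><p>lorem ipsum</p>';

$dom = new DOMDocument();
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED);

// With no implied <html>/<body>, the document's root is taken from the fragment itself.
echo 'root: ' . $dom->documentElement->tagName . "\n";

// Print the parent of every element to see where the parser placed it.
foreach ($dom->getElementsByTagName('*') as $node) {
    $parent = $node->parentNode instanceof DOMElement
        ? $node->parentNode->tagName
        : '(document)';
    echo $node->tagName . ' has parent ' . $parent . "\n";
}

// Serializing the root element shows the resulting nesting.
echo $dom->saveHTML($dom->documentElement) . "\n";
```

If the later elements are listed with the first element as their parent, that would account for the first item's nodeValue containing the text of all the others.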
 [2015-11-09 14:47 UTC] dlundgren at syberisle dot net
This may be more of a documentation issue than a code issue. After encountering the same problem recently, I found that the first element becomes the root element of the document. To work around this, I wrapped the HTML fragment in a root element, and I was able to work with it that way.

This is most likely due to our not realizing that DOM Level 2 requires a document to have a single documentElement, with its children under it. I had to look that up to understand what I was doing wrong.
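
The wrapping workaround described in this comment could look roughly like the sketch below. The helper name loadHtmlFragment() and the choice of a div as the wrapper are assumptions made for illustration; neither is part of PHP's DOM API or taken from the report.

```php
<?php
// Hypothetical helper (not part of PHP's DOM API): parse an HTML fragment by
// wrapping it in a single container element, so that the container rather
// than the fragment's first element becomes the documentElement.
function loadHtmlFragment(string $fragment): DOMDocument
{
    $dom = new DOMDocument();

    // Collect libxml warnings internally during the parse, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML(
        '<div>' . $fragment . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $dom;
}

$dom = loadHtmlFragment('<h1>Foo</h1><h2>Bar</h2><p>lorem ipsum</p>');

// Iterate over the wrapper's element children; the wrapper itself is skipped.
foreach ($dom->documentElement->childNodes as $node) {
    if ($node instanceof DOMElement) {
        echo $node->tagName . ' -> ' . $node->nodeValue . "\n";
    }
}
```

Because the wrapper is the single root, the fragment's elements stay siblings under it, and each nodeValue covers only that element's own text.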
PHP Copyright © 2001-2023 The PHP Group
All rights reserved.
Last updated: Tue May 30 02:03:44 2023 UTC