Bug #71993 DOMDocument::loadHTML ignores anything after a null byte
Submitted: 2016-04-08 19:18 UTC
Modified: 2016-04-13 07:29 UTC
From: raul at raulr dot net
Assigned:
Status: Not a bug
Package: DOM XML related
PHP Version: 7.0.5
OS: Linux
Private report: No
CVE-ID: None

 [2016-04-08 19:18 UTC] raul at raulr dot net
Description:
------------
DOMDocument::loadHTML() stops processing the source string after a null byte (U+0000) without issuing any exception or warning.

I have tested that this happens in PHP versions 5.5.34, 5.6.20 and 7.0.5.

Test script:
---------------
<?php

$html = <<<EOD
<!DOCTYPE html>
<html>
<div>Pre NULL byte</div>
\0
<div>Post NULL byte</div>
</html>
EOD;

$dom = new \DOMDocument('1.0');
$dom->validateOnParse = true;
$dom->loadHTML($html);
echo $dom->saveHTML();

Expected result:
----------------
<!DOCTYPE html>
<html><body><div>Pre NULL byte</div>

<div>Post NULL byte</div>
</body></html>

Actual result:
--------------
<!DOCTYPE html>
<html><body><div>Pre NULL byte</div></body></html>

History

 [2016-04-13 07:29 UTC] ab@php.net
-Status: Open +Status: Not a bug
 [2016-04-13 07:29 UTC] ab@php.net
Thanks for the report. The input arrives as a binary-safe string in PHP, but we then re-measure its length with xmlStrlen(), which stops counting at the first null byte. Even without that step, libxml2 would behave the same way and silently stop at the null byte. So this is not a bug in PHP; it should be re-evaluated once libxml2 has better HTML5 support. Please also see these two links:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=574104
https://mail.gnome.org/archives/xml/2008-August/msg00008.html

Thanks.
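
For anyone hitting this, a minimal workaround sketch follows, assuming it is acceptable to discard null bytes from the input before parsing. The helpers cStringLength() and stripNullBytes() are hypothetical names introduced here for illustration; they are not part of PHP or libxml2. The first only mimics what a C-style length check such as xmlStrlen() would report, to detect that part of the input would be ignored.

```php
<?php

// Hypothetical helper (not a PHP or libxml2 API): the number of bytes a
// C-style length check would see, i.e. the offset of the first null byte,
// or the full length when the string contains none.
function cStringLength(string $s): int
{
    $pos = strpos($s, "\0");
    return $pos === false ? strlen($s) : $pos;
}

// Hypothetical helper: remove every null byte so the parser is handed
// the complete input.
function stripNullBytes(string $s): string
{
    return str_replace("\0", '', $s);
}

// Same markup as the test script in the report, with an embedded null byte.
$html = "<!DOCTYPE html>\n<html>\n<div>Pre NULL byte</div>\n\0\n"
      . "<div>Post NULL byte</div>\n</html>";

if (cStringLength($html) < strlen($html)) {
    // Everything after the null byte would be dropped; clean the input first.
    $html = stripNullBytes($html);
}

$dom = new DOMDocument('1.0');
$dom->loadHTML($html);
echo $dom->saveHTML();
```

Removing the null bytes is only one possible policy. Depending on the application, rejecting such input outright may be more appropriate than repairing it silently.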
 