Bug #51484 '--' incorrectly allowed inside comments
Submitted: 2010-04-05 23:48 UTC
Modified: 2010-04-07 14:25 UTC
From: ifland at gmail dot com
Assigned:
Status: Not a bug
Package: XML related
PHP Version: 5.2.13
OS: *
Private report: No
CVE-ID: None
 [2010-04-05 23:48 UTC] ifland at gmail dot com
Description:
------------
According to the XML spec (see http://www.w3.org/TR/2008/REC-xml-20081126/#sec-comments), comments in XML must not contain two hyphens in a row. Such comments can nevertheless end up in a document when poorly formed HTML is processed as input.

The spec does not suggest how to deal with this situation. We can't turn the hyphens into entities (entity references are not recognized inside comments), and Firefox and possibly other browsers will fail to parse XML documents that contain the double hyphen.
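
For illustration, the constraint in the spec's Comment production can be checked in userland before serializing. This is a minimal sketch, not part of PHP or libxml2; the function name `is_valid_xml_comment_text` is hypothetical. It encodes two consequences of the production: the comment text may not contain `--`, and it may not end with `-` (otherwise the closing delimiter would read `--->`).

```php
<?php
// Hypothetical helper (not a PHP or libxml2 API): returns true when $text
// could appear between "<!--" and "-->" in a well-formed XML 1.0 document.
function is_valid_xml_comment_text($text)
{
    // The Comment production forbids two hyphens in a row.
    if (strpos($text, '--') !== false) {
        return false;
    }
    // A trailing hyphen would merge with the closing "-->" into "--->".
    if ($text !== '' && substr($text, -1) === '-') {
        return false;
    }
    return true;
}
```

A check like this only detects the problem; what to do with an offending comment (drop it, rewrite it, raise an error) is left to the caller.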



Test script:
---------------
<?php
$doc = new DOMDocument();
$doc->loadHTML("<html><body><!--comment <!--commented comment--></body>");
header("Content-type: text/plain");
echo $doc->saveXML();
?>

Expected result:
----------------
Either a catchable error or something like this:

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><!--comment <!- -commented comment--></body></html>


Actual result:
--------------
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><!--comment <!--commented comment--></body></html>


 [2010-04-07 14:25 UTC] iliaa@php.net
-Status: Open
+Status: Bogus
 [2010-04-07 14:25 UTC] iliaa@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

This is handled by libxml2, not PHP. Also, you are using loadHTML(),
which is designed to handle non-well-formed HTML.
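
Since loadHTML() accepts such comments and the serializer writes them back unchanged, one possible workaround is to rewrite comment nodes in userland before calling saveXML(). The following is a sketch under assumptions: the function names `sanitize_comment_text` and `sanitize_comments` are hypothetical, and inserting a space between adjacent hyphens is just one policy, chosen to match the form shown under "Expected result".

```php
<?php
// Hypothetical helper: make comment text acceptable to an XML parser by
// separating adjacent hyphens with a space.
function sanitize_comment_text($text)
{
    // Put a space after every hyphen that is directly followed by another
    // hyphen, so "--" becomes "- -" and "---" becomes "- - -".
    $text = preg_replace('/-(?=-)/', '- ', $text);
    // A trailing hyphen would merge with the closing "-->" delimiter.
    if ($text !== '' && substr($text, -1) === '-') {
        $text .= ' ';
    }
    return $text;
}

// Hypothetical helper: rewrite every comment node in $doc in place and
// return the number of comments that were changed.
function sanitize_comments(DOMDocument $doc)
{
    $xpath = new DOMXPath($doc);
    $changed = 0;
    foreach ($xpath->query('//comment()') as $comment) {
        $clean = sanitize_comment_text($comment->data);
        if ($clean !== $comment->data) {
            $comment->data = $clean;
            $changed++;
        }
    }
    return $changed;
}
```

Calling `sanitize_comments($doc)` between loadHTML() and saveXML() in the test script should turn the inner `<!--` into `<!- -`. Note that this changes the comment text, which may matter if anything downstream depends on its exact content.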