Bug #45611 Tidy does not repair HTML correctly
Submitted: 2008-07-23 20:05 UTC
Modified: 2008-07-27 00:32 UTC
From: lwpku95 at gmail dot com
Assigned:
Status: Not a bug
Package: Tidy (PECL)
PHP Version: 5.2.6
OS: Windows XP
Private report: No
CVE-ID: None

 [2008-07-23 20:05 UTC] lwpku95 at gmail dot com
Description:
------------
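I tried to use tidy to repair an HTML string, but the result is not correct: the text "Cell 1", which is missing its opening <td>, is moved out of the table instead of being wrapped in a cell.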


Reproduce code:
---------------
<?php
$html = '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html><head>  <meta content="text/html; charset=ISO-8859-1" http-equiv="content-type"><title></title></head>
<body><table style="text-align: left; width: 100%;" border="1" cellpadding="2" cellspacing="2">
	<tr>
	  Cell 1</td><td>Cell 2</td>	  <td>Cell 3
	</tr>
</table>
<br>
</body>
</html>';
$config = array('indent' => true);
$tidy = new tidy;
$tidy->parseString($html, $config);
$tidy->cleanRepair();
echo $tidy;
?>


Expected result:
----------------
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <meta content="text/html; charset=us-ascii" http-equiv=
    "content-type">
    <title></title>
  </head>
  <body>
    <table style="text-align: left; width: 100%;" border="1"
    cellpadding="2" cellspacing="2">
      <tr>
        <td>
          Cell 1
        </td>
        <td>
          Cell 2
        </td>
        <td>
          Cell 3
        </td>
      </tr>
    </table><br>
  </body>
</html>

Actual result:
--------------
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <meta content="text/html; charset=us-ascii" http-equiv=
    "content-type">
    <title></title>
  </head>
  <body>
    Cell 1
    <table style="text-align: left; width: 100%;" border="1"
    cellpadding="2" cellspacing="2">
      <tr>
        <td>
          Cell 2
        </td>
        <td>
          Cell 3
        </td>
      </tr>
    </table><br>
  </body>
</html>

 [2008-07-27 00:32 UTC] jani@php.net
This is actually a bug in (lib)Tidy itself:

http://sourceforge.net/tracker/index.php?func=detail&aid=2026039&group_id=27659&atid=390963

 