Bug #74628 DOM auto remove HTML closing tag in <script> tag when save
Submitted: 2017-05-22 04:27 UTC Modified: 2017-05-22 16:37 UTC
Votes:9
Avg. Score:4.2 ± 0.9
Reproduced:7 of 7 (100.0%)
Same Version:4 (57.1%)
Same OS:1 (14.3%)
From: tidus2102 at gmail dot com Assigned:
Status: Suspended Package: DOM XML related
PHP Version: 7.1.5 OS: OSX El Capitan
Private report: No CVE-ID: None
 [2017-05-22 04:27 UTC] tidus2102 at gmail dot com
Description:
------------
http://stackoverflow.com/questions/4029341/dom-parser-that-allows-html5-style-in-script-tag

DOMDocument drops an HTML closing tag that appears inside a <script> element when the document is saved. Please check the test script below.


Test script:
---------------
<?php
$html = <<<HTML
<script>
    console.log('<h1>aaa</h1>');
</script>
HTML;

$dom = new DOMDocument();
// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$dom->loadHTML(mb_convert_encoding($html, 'HTML-ENTITIES', "UTF-8"), LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);

echo $dom->saveHTML();
die;

Expected result:
----------------
All content inside the <script> tag should be exactly the same as in the original HTML code.

<script>
    console.log('<h1>aaa</h1>');
</script>

Actual result:
--------------
<script>
    console.log('<h1>aaa');
</script>

 [2017-05-22 16:37 UTC] cmb@php.net
-Status: Open +Status: Suspended
 [2017-05-22 16:37 UTC] cmb@php.net
Unfortunately, libxml isn't really HTML5 compatible yet, and there
isn't much we can do about that. You have to stick to conforming
HTML 4.01, and write <\/h1>, see <https://3v4l.org/ldGmr>.
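
For illustration, the workaround suggested above could look like the following minimal sketch. It assumes the inline script can be edited so that the closing tag inside the string literal is written as <\/h1> (JavaScript reads \/ as a plain slash). The nowdoc syntax and the variable names are choices made for this example, not part of the original report, and the exact serialized output may vary with the bundled libxml2 version.

```php
<?php
// Nowdoc (quoted identifier) keeps the backslash exactly as typed.
$html = <<<'HTML'
<script>
    console.log('<h1>aaa<\/h1>');
</script>
HTML;

$dom = new DOMDocument();
// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);
$output = $dom->saveHTML();
libxml_clear_errors();

// The escaped closing tag is not treated as markup, so it should
// survive the load/save round trip inside the script content.
echo $output;
```

Because "<\/" never forms the "</" sequence that the HTML 4.01 rules treat as the end of script data, the parser should leave the string literal alone.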
 [2021-03-04 11:34 UTC] gijovarghese141 at gmail dot com
Adding LIBXML_SCHEMA_CREATE to loadHTML() as an option will fix the issue

$dom->loadHTML($html, LIBXML_SCHEMA_CREATE);