Bug #79090 DomDocument in recover mode improperly strips amp after error in XML
Submitted: 2020-01-09 18:01 UTC Modified: 2020-01-16 08:45 UTC
From: danielkarp at gmail dot com Assigned:
Status: Wont fix Package: DOM XML related
PHP Version: 7.4.1 OS: CentOS
Private report: No CVE-ID: None

 [2020-01-09 18:01 UTC] danielkarp at gmail dot com
When DOMDocument is used in recover mode to load malformed XML (mismatched tags), any &amp; entity that appears AFTER the repaired mismatch (but not before it) is improperly stripped from the output.
Note: I selected 7.4.1 above, but I have tested this only in 7.4.0. This is a longstanding bug going back to at least 7.2.

You can also see an example here:

Test script:
<?php
// Load deliberately malformed XML (mismatched <foo>/<bar> tags) in recover mode.
$domDocument = new DOMDocument( '1.0', 'utf-8' );
$domDocument->recover = true;
$xml = <<<EOT
<?xml version="1.0" encoding="utf-8"?>
<xml> Amp: &amp;
<foo> baz: <bar>foobar</foo></bar>
No amp? &amp;
</xml>
EOT;
$domDocument->loadXML( $xml );
echo "<br><br>";
// Escape the serialized document so the markup is visible in a browser.
echo htmlspecialchars( $domDocument->saveXML() );

Expected result:
<?xml version="1.0" encoding="utf-8"?> <xml> Amp: &amp; <foo> baz: <bar>foobar</bar></foo> No amp? &amp; </xml>

Actual result:
<?xml version="1.0" encoding="utf-8"?> <xml> Amp: &amp; <foo> baz: <bar>foobar</bar></foo> No amp? </xml>


 [2020-01-16 08:45 UTC]
-Status: Open +Status: Wont fix
 [2020-01-16 08:45 UTC]
I can confirm this bug, but it is not caused by PHP; it is apparently a bug in libxml itself, which you can verify by running `xmllint --recover` on the CLI. It might be a duplicate of
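
Since the stripping happens inside libxml during recovery, one way for calling code to protect itself is to check whether recovery was needed at all and treat the result as possibly lossy. The following is a minimal sketch of that idea, not a fix for the bug. The function name `loadXmlWithRecoveryReport` is hypothetical; the libxml calls it uses (`libxml_use_internal_errors`, `libxml_get_errors`, `libxml_clear_errors`) are standard PHP functions.

```php
<?php
/**
 * Hypothetical helper: load XML in recover mode and also return the
 * libxml errors raised while parsing. A non-empty error list means libxml
 * had to repair the input, so the resulting document may have lost content.
 *
 * @return array{0: ?DOMDocument, 1: LibXMLError[]}
 */
function loadXmlWithRecoveryReport(string $xml): array
{
    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = new DOMDocument('1.0', 'utf-8');
    $doc->recover = true;
    $loaded = $doc->loadXML($xml);

    $errors = libxml_get_errors();

    // Restore the previous error-handling state for the rest of the program.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return [$loaded ? $doc : null, $errors];
}

// Usage: the mismatched <foo>/<bar> tags from the report trigger recovery.
$malformed = "<xml> Amp: &amp;\n<foo> baz: <bar>foobar</foo></bar>\nNo amp? &amp;\n</xml>";
[$doc, $errors] = loadXmlWithRecoveryReport($malformed);

if ($errors !== []) {
    // Recovery took place: do not assume the serialized output is complete.
    foreach ($errors as $error) {
        echo 'line ', $error->line, ': ', trim($error->message), PHP_EOL;
    }
}
```

With this approach the caller can reject or flag recovered documents rather than silently saving output that may be missing entities.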
Last updated: Sun Nov 27 11:05:54 2022 UTC