Bug #71805 XML files can generate UTF-8 error even if they are UTF-8
Submitted: 2016-03-11 21:22 UTC Modified: 2016-05-15 04:38 UTC
Avg. Score:4.0 ± 1.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:0 (0.0%)
From: goozak at gmail dot com Assigned: ab (profile)
Status: Closed Package: XML Reader
PHP Version: 5.5Git-2016-03-11 (snap) OS: Win7 SP1
Private report: No CVE-ID: None
 [2016-03-11 21:22 UTC] goozak at gmail dot com
XMLReader generates the error
"Warning: XMLReader::readOuterXml(): .....:5: parser error : Input is not proper UTF-8, indicate encoding ! Bytes: 0xC3 0xA9 0x78 0x20"
even though the file is properly encoded in UTF-8.

After a small change to the XML content (removing a tag, or even just a few characters), the file is read properly.
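
The bytes quoted in the warning can be checked independently of PHP and libxml2. The following is a minimal sketch in Python (chosen only for illustration; it is not part of the original report) that decodes those four bytes with strict UTF-8 validation. It only shows that this byte sequence is well-formed UTF-8 on its own; it does not reproduce the XMLReader behaviour or explain why the parser rejects it.

```python
# Bytes quoted in the warning: 0xC3 0xA9 0x78 0x20
reported = bytes([0xC3, 0xA9, 0x78, 0x20])

# Strict decoding raises UnicodeDecodeError if the input is malformed UTF-8.
decoded = reported.decode("utf-8", errors="strict")

# 0xC3 0xA9 is the two-byte encoding of U+00E9 (e with acute accent),
# followed by ASCII "x" and a space.
print(repr(decoded))
```

The decode succeeds, which is consistent with the file being valid UTF-8 as stated above.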

Test script:

Two files are used in the demo. The result is the same if they are downloaded and read locally.


 [2016-03-14 15:01 UTC]
-Status: Open +Status: Verified -Assigned To: +Assigned To: ab
 [2016-03-14 15:01 UTC]
Thanks for the report. It looks like we are seeing a regression in libxml 2.9.3. No fix for it is currently available in the libxml2 repo. I'll keep an eye on it and will apply the patch to our dependency base as soon as it's available.

 [2016-05-03 16:10 UTC]
-Status: Verified +Status: Feedback
 [2016-05-03 16:10 UTC]
Dependency packages are upgraded. Please check the upcoming RCs.

 [2016-05-03 18:23 UTC] goozak at gmail dot com
I'm on Windows, so I'll check the 5.6 RC when available.  The 5.5 one seems stuck on 5.5.27RC1 (2015-Jun-24)
 [2016-05-13 14:10 UTC] goozak at gmail dot com
Bug fixed in PHP 5.6 (5.6.22RC1) VC11 x86 Thread Safe (2016-May-12 21:22:39).
No RC for PHP 5.5.
 [2016-05-15 04:22 UTC] php-bugs at lists dot php dot net
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Re-Opened". Thank you.
 [2016-05-15 04:38 UTC]
-Status: No Feedback +Status: Closed
 [2016-05-15 04:38 UTC]
Nothing for 5.5 because it's only receiving security fixes.
 [2016-05-15 09:58 UTC] goozak at gmail dot com
I did provide feedback on [2016-05-13 14:10 UTC] - the bug is fixed in the latest 5.6 RC.

And I do hope that this regression, introduced in 5.5.32 (one of the recent security fix releases), will still get fixed...
 [2016-05-26 15:17 UTC] goozak at gmail dot com
For what it's worth (since the bug was closed for nebulous reasons and I can't reopen it), the bug is also fixed in the new 5.5.36 Windows build.