Bug #50661 DOMDocument::loadXML does not allow UTF-16
Submitted: 2010-01-04 20:58 UTC
Modified: 2010-01-06 17:53 UTC
From: geoffers+phpbugs at gmail dot com
Assigned: rrichards
Status: Closed
Package: DOM XML related
PHP Version: 5.3SVN-2010-01-04 (SVN)
OS: Mac OS 10.5.8
Private report: No
CVE-ID: None

 [2010-01-04 20:58 UTC] geoffers+phpbugs at gmail dot com
Description:
------------
DOMDocument::loadXML() does not support UTF-16 encoded XML. This violates the XML specification, which states: "All XML processors MUST accept the UTF-8 and UTF-16 encodings of Unicode". As such, DOMDocument::loadXML() is not a conformant XML processor.

XMLReader supports this fine, which suggests something is wrong in the use of the libxml2 API.
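
For comparison, the XMLReader behaviour mentioned above can be checked with a minimal sketch like the following. It feeds the same UTF-16BE bytes used in the reproduce code to XMLReader and collects the element names it sees. The variable names ($reader, $names) are illustrative and not taken from the report.

```php
<?php
// Same input as the reproduce code: UTF-16BE byte order mark followed by "<foo/>".
$data = "\xFE\xFF\x00\x3C\x00\x66\x00\x6F\x00\x6F\x00\x2F\x00\x3E";

$reader = new XMLReader();
$reader->XML($data); // parse from an in-memory string

// Walk the document and record the name of every element node.
$names = array();
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT) {
        $names[] = $reader->name;
    }
}
$reader->close();

echo implode(",", $names), "\n";
```

If XMLReader accepts the input as described, the only element name collected should be the root element of the test document.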

Reproduce code:
---------------
<?php
$data = "\xFE\xFF\x00\x3C\x00\x66\x00\x6F\x00\x6F\x00\x2F\x00\x3E"; // UTF-16BE BOM + "<foo/>"

$dom = new DOMDocument();
$dom->loadXML($data);
echo $dom->saveXML();

Expected result:
----------------
<?xml version="1.0"?>
<foo/>

Actual result:
--------------
PHP Warning:  DOMDocument::loadXML(): Start tag expected, '<' not found in Entity, line: 1 in /Users/gsnedders/Desktop/foo.php on line 5

Warning: DOMDocument::loadXML(): Start tag expected, '<' not found in Entity, line: 1 in /Users/gsnedders/Desktop/foo.php on line 5
<?xml version="1.0"?>


History

 [2010-01-04 23:16 UTC] rrichards@php.net
Assign to self
 [2010-01-06 13:13 UTC] svn@php.net
Automatic comment from SVN on behalf of rrichards
Revision: http://svn.php.net/viewvc/?view=revision&revision=293176
Log: fix bug #50661 (DOMDocument::loadXML does not allow UTF-16)
add test
 [2010-01-06 13:16 UTC] rrichards@php.net
This bug has been fixed in SVN.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.


 [2010-01-06 17:53 UTC] geoffers+phpbugs at gmail dot com
Null-terminated strings and UTF-16? fun. :) Thanks for fixing it!
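
The remark about null-terminated strings can be illustrated with a short sketch. UTF-16 encodes ASCII characters with a zero byte in each code unit, so any code that treats the buffer as a C string stops at the first NUL. The report does not show the actual fix, so this is only an assumption about why the parser saw no start tag. The variable names are illustrative.

```php
<?php
// The UTF-16BE test document from the report: BOM followed by "<foo/>".
$data = "\xFE\xFF\x00\x3C\x00\x66\x00\x6F\x00\x6F\x00\x2F\x00\x3E";

$totalBytes = strlen($data);         // PHP strings carry their length, NULs included
$firstNul   = strpos($data, "\0");   // offset where a C-style strlen() would stop
$nulCount   = substr_count($data, "\0");

// What a length-unaware reader would see: only the bytes before the first NUL.
$visibleToC = substr($data, 0, $firstNul);

printf("total bytes: %d\n", $totalBytes);
printf("first NUL at offset: %d\n", $firstNul);
printf("NUL bytes: %d\n", $nulCount);
printf("bytes before first NUL: %s\n", bin2hex($visibleToC));
```

Under that assumption, only the byte order mark would reach the parser, which is consistent with the "Start tag expected, '<' not found" warning in the actual result.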
 [2010-02-03 18:41 UTC] svn@php.net
Automatic comment from SVN on behalf of pajoye
Revision: http://svn.php.net/viewvc/?view=revision&revision=294436
Log: fix bug #50661 (DOMDocument::loadXML does not allow UTF-16)
 