Bug #50661 DOMDocument::loadXML does not allow UTF-16
Submitted: 2010-01-04 20:58 UTC Modified: 2010-01-06 17:53 UTC
From: geoffers+phpbugs at gmail dot com Assigned: rrichards (profile)
Status: Closed Package: DOM XML related
PHP Version: 5.3SVN-2010-01-04 (SVN) OS: Mac OS 10.5.8
Private report: No CVE-ID: None
 [2010-01-04 20:58 UTC] geoffers+phpbugs at gmail dot com
Description:
------------
DOMDocument::loadXML() does not support UTF-16 encoded XML. This violates the XML specification, which states: "All XML processors MUST accept the UTF-8 and UTF-16 encodings of Unicode". As a result, DOMDocument::loadXML() is not a conformant XML processor.

XMLReader handles the same input correctly, which suggests the problem lies in how DOMDocument uses the libxml2 API rather than in libxml2 itself.
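
For comparison, here is a minimal sketch of feeding the same bytes to XMLReader. It assumes the xmlreader extension is loaded; the $names variable is only an illustrative helper and is not part of the original report.

```php
<?php
// Same UTF-16BE input as the reproduce code: BOM (FE FF) followed by "<foo/>".
$data = "\xFE\xFF\x00\x3C\x00\x66\x00\x6F\x00\x6F\x00\x2F\x00\x3E";

$reader = new XMLReader();
$reader->XML($data);

// Collect the names of the elements the reader walks over.
$names = array();
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT) {
        $names[] = $reader->name;
    }
}
$reader->close();

print_r($names);
```

If XMLReader accepts the input as described, $names should end up holding the single element name from the document.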

Reproduce code:
---------------
<?php
$data = "\xFE\xFF\x00\x3C\x00\x66\x00\x6F\x00\x6F\x00\x2F\x00\x3E"; // UTF-16BE: BOM (FE FF) followed by "<foo/>"

$dom = new DOMDocument();
$dom->loadXML($data);
echo $dom->saveXML();

Expected result:
----------------
<?xml version="1.0"?>
<foo/>

Actual result:
--------------
PHP Warning:  DOMDocument::loadXML(): Start tag expected, '<' not found in Entity, line: 1 in /Users/gsnedders/Desktop/foo.php on line 5

Warning: DOMDocument::loadXML(): Start tag expected, '<' not found in Entity, line: 1 in /Users/gsnedders/Desktop/foo.php on line 5
<?xml version="1.0"?>


 [2010-01-04 23:16 UTC] rrichards@php.net
Assign to self
 [2010-01-06 13:13 UTC] svn@php.net
Automatic comment from SVN on behalf of rrichards
Revision: http://svn.php.net/viewvc/?view=revision&revision=293176
Log: fix bug #50661 (DOMDocument::loadXML does not allow UTF-16)
add test
 [2010-01-06 13:16 UTC] rrichards@php.net
This bug has been fixed in SVN.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.


 [2010-01-06 17:53 UTC] geoffers+phpbugs at gmail dot com
Null-terminated strings and UTF-16? fun. :) Thanks for fixing it!
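
The remark about null-terminated strings can be illustrated with a short sketch. The thread does not show the actual fix, so this is only an assumption about the likely cause: UTF-16 text contains 0x00 bytes, and any code that measures the buffer as a NUL-terminated C string would stop right after the byte order mark. The variable names are illustrative.

```php
<?php
// Same UTF-16BE input as the reproduce code: BOM (FE FF) followed by "<foo/>".
$data = "\xFE\xFF\x00\x3C\x00\x66\x00\x6F\x00\x6F\x00\x2F\x00\x3E";

$fullLength = strlen($data);       // PHP strings carry an explicit length
$firstNul   = strpos($data, "\0"); // where a C-style strlen() would stop

// What a NUL-terminated reader would see: only the byte order mark.
$truncated = substr($data, 0, $firstNul);

printf("full length: %d bytes\n", $fullLength);
printf("first NUL at offset: %d\n", $firstNul);
printf("truncated view (hex): %s\n", bin2hex($truncated));
```

A parser handed only the truncated view never reaches the "<" of the start tag, which would be consistent with the "Start tag expected, '<' not found" warning in the report.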
 [2010-02-03 18:41 UTC] svn@php.net
Automatic comment from SVN on behalf of pajoye
Revision: http://svn.php.net/viewvc/?view=revision&revision=294436
Log: fix bug #50661 (DOMDocument::loadXML does not allow UTF-16)
 