Bug #38483 DOM fragments ignore document encoding
Submitted: 2006-08-17 14:04 UTC Modified: 2006-08-17 14:41 UTC
From: webmaster at bison-soft dot de Assigned: rrichards (profile)
Status: Not a bug Package: DOM XML related
PHP Version: 5.1.4 OS: linux 2.4
Private report: No CVE-ID: None
 [2006-08-17 14:04 UTC] webmaster at bison-soft dot de
Description:
------------
An XML string appended to a DOMDocumentFragment with
appendXML() must always be encoded as UTF-8, regardless
of the chosen DOMDocument encoding. Even worse, after
the fragment is appended to the document, the charset of
the fragment remains UTF-8, which leads to a DOMDocument
with mixed encodings inside.

Reproduce code:
---------------
$dom = new DOMDocument('1.0','ISO-8859-1');
$dom->loadXML('<parent />');
$frag = $dom->createDocumentFragment();
$frag->appendXML('<child>???</child>');
$dom->documentElement->appendChild($frag);


Expected result:
----------------
The ISO-8859-1 encoded XML string is imported cleanly into
the document fragment.

Actual result:
--------------
Error:

DOMDocumentFragment::appendXML() Entity: line 1: parser error : Input is not proper UTF-8, indicate encoding !!

When the child XML string includes an XML declaration specifying the ISO-8859-1 encoding:

Entity: line 1: parser error : XML declaration allowed only at the start of the document

When the child XML string is converted to UTF-8 and then appended, the result is a DOMDocument with mixed charsets.

 [2006-08-17 14:41 UTC] rrichards@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

The string passed to appendXML() must be UTF-8 encoded, and the document is not mixed: loadXML() creates a new document that defaults to UTF-8 encoding, since the loaded XML has no XML declaration.
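
A minimal sketch of one way to work with this behavior, assuming the child markup arrives as ISO-8859-1 bytes and the iconv extension is available. The sample bytes "\xE4\xF6\xFC" (the characters "äöü" in ISO-8859-1) and the variable names are illustrative, not taken from the report. The document encoding is declared in the XML passed to loadXML() so that it is not discarded, and the child string is converted to UTF-8 before appendXML():

```php
<?php
// Assumption: the child markup is ISO-8859-1; "\xE4\xF6\xFC" is "äöü" in that charset.
$latin1Xml = "<child>\xE4\xF6\xFC</child>";

$dom = new DOMDocument();
// Declare the encoding in the XML itself; loadXML() takes the encoding
// from the declaration of the loaded markup.
$dom->loadXML('<?xml version="1.0" encoding="ISO-8859-1"?><parent/>');

$frag = $dom->createDocumentFragment();
// appendXML() expects UTF-8 input, so convert the ISO-8859-1 bytes first.
$frag->appendXML(iconv('ISO-8859-1', 'UTF-8', $latin1Xml));
$dom->documentElement->appendChild($frag);

// Strings read back through the DOM API are UTF-8.
$text = $dom->documentElement->firstChild->textContent;

// saveXML() serializes the whole document in the declared encoding.
$out = $dom->saveXML();
```

Under these assumptions the tree holds a single internal representation, and the declared encoding only affects how saveXML() writes the document out.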
 