Bug #34730 saveXML translates UTF-8 characters to entities
Submitted: 2005-10-04 14:46 UTC Modified: 2005-10-04 19:28 UTC
From: pesmail2003 at seznam dot cz Assigned:
Status: Not a bug Package: DOM XML related
PHP Version: 5.0.5 OS: Linux 2.6.10-5-i386
Private report: No CVE-ID: None
 [2005-10-04 14:46 UTC] pesmail2003 at seznam dot cz
Description:
------------
The saveXML function replaces UTF-8 characters with entities. If my loaded XML file contains UTF-8 characters and I save it with saveXML, all of them (for example č ř ž ý á í é) are replaced with numeric entities. I had to write my own saveXML in PHP, which is really bad.

Reproduce code:
---------------
<?php
$xmlData = <<<XMLCODE
<?xml version="1.0" ?>
<test>"ěščřžýáíé</test>
XMLCODE;

$dom = new DomDocument;
$dom->loadXML( $xmlData );
echo $dom->saveXML();
?>

Expected result:
----------------
<?xml version="1.0"?>
<test>"ěščřžýáíé</test>

Actual result:
--------------
<?xml version="1.0"?>
<test>"&#x11B;&#x161;&#x10D;&#x159;&#x17E;&#xFD;&#xE1;&#xED;&#xE9;</test>

 [2005-10-04 14:53 UTC] derick@php.net
You need to specify the charset encoding in your XML, like so:

<?php
$xmlData = <<<XMLCODE
<?xml version="1.0" encoding="utf8"?>
<test>"привате</test>
XMLCODE;

$dom = new DomDocument;
$dom->loadXML( $xmlData );
echo $dom->saveXML();
?>
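
The effect of the encoding declaration can be checked side by side. The following is a minimal sketch, not code from this report: `roundTrip()` is a hypothetical helper, and the third variant (assigning `$dom->encoding` after loading) is an assumption about a common alternative that nobody in this thread suggested.

```php
<?php
// Same document body twice; only the XML declaration differs.
$body = '<test>ěščřžýáíé</test>';
$withoutEncoding = '<?xml version="1.0"?>' . "\n" . $body;
$withEncoding = '<?xml version="1.0" encoding="UTF-8"?>' . "\n" . $body;

// Hypothetical helper: parse a string and serialize the whole document again.
function roundTrip(string $xml): string
{
    $dom = new DOMDocument();
    $dom->loadXML($xml);
    return $dom->saveXML();
}

// No declared encoding: non-ASCII characters come back as numeric references.
$plain = roundTrip($withoutEncoding);

// Declared UTF-8: the characters are written out as they were read.
$declared = roundTrip($withEncoding);

// Assumed alternative: set the output encoding on the document after loading.
$dom = new DOMDocument();
$dom->loadXML($withoutEncoding);
$dom->encoding = 'UTF-8';
$forced = $dom->saveXML();

echo $plain, $declared, $forced;
```

The script file itself has to be saved as UTF-8 for the literal characters to be parsed correctly.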

 [2005-10-04 14:57 UTC] pesmail2003 at seznam dot cz
Thanks derick for fast response. But I found something interesting. When I set encoding in XML with <?xml version="1.0" encoding="UTF-8"?>, it works OK. But there is still no way to disable replacing special characters with entities when I have proper encoding and want to use saveXML with parameter "node" to save only part of the tree. Any solution to this?
 [2005-10-04 15:02 UTC] derick@php.net
Not without an example, but as this is not a user forum, please ask these kinds of questions on the php-general@lists.php.net mailing list.
 [2005-10-04 15:14 UTC] pesmail2003 at seznam dot cz
I'm sending updated test code. Please don't throw this bug report away immediately as "bogus". I really don't want to use this as a support forum, and I don't want to chat pointlessly with the developers. Maybe my first example was a bad one, but I'm doing what I can to help you help me. Maybe we will find that I'm dumb, but maybe we will find and fix a problem together.

Reproduce code
--------------
<?php
$xmlData = <<<XMLCODE
<?xml version="1.0" encoding="utf-8"?>
<test>"ščřžýáíé</test>
XMLCODE;

$dom = new DomDocument;
$dom->loadXML( $xmlData );
echo $dom->saveXML($dom->firstChild);
?>

Expected result
---------------
<test>"ščřžýáíé</test>

Actual result
---------------
<test>"&#x161;&#x10D;&#x159;&#x17E;&#xFD;&#xE1;&#xED;&#xE9;</test>
 [2005-10-04 18:56 UTC] chregu@php.net
Works for me as expected (PHP 5.0.6-dev).

Which libxml2 version are you using?
 [2005-10-04 19:26 UTC] pesmail2003 at seznam dot cz
Thanks a lot, chregu! I thought I had the latest version, but I didn't. I've upgraded to libxml 2.6.21 (from 2.6.17) and now it works as expected. My fault for not testing that before posting the bug report. Thanks everyone, great job.
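
A quick way to see which libxml2 version a PHP build reports is the `LIBXML_DOTTED_VERSION` constant. The sketch below assumes 2.6.21 as the threshold only because that is the version that worked for the reporter. The thread does not establish which release actually changed the behaviour.

```php
<?php
// Threshold taken from this thread: 2.6.17 showed the problem, 2.6.21 did not.
// The exact release that changed the behaviour is not known from the report.
$minimum = '2.6.21';

// libxml2 version as reported by the PHP libxml extension, e.g. "2.6.21".
$reported = LIBXML_DOTTED_VERSION;

if (version_compare($reported, $minimum, '<')) {
    echo "libxml2 $reported is older than $minimum; ",
         "saveXML(\$node) may emit numeric character references.\n";
} else {
    echo "libxml2 $reported is at least $minimum.\n";
}
```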
 [2005-10-04 19:28 UTC] tony2001@php.net
Not PHP problem -> bogus.