Bug #78979 Changing the encoding breaks the XML.
Submitted: 2019-12-17 12:44 UTC
Modified: 2020-06-16 07:13 UTC
Avg. Score: 4.0 ± 0.0
Reproduced: 1 of 1 (100.0%)
Same Version: 1 (100.0%)
Same OS: 0 (0.0%)
From: vyfix at yahoo dot co dot jp
Assigned: cmb
Status: Not a bug
Package: DOM XML related
PHP Version: 7.4.0
OS: Windows, Linux
Private report: No
CVE-ID: None


 [2019-12-17 12:44 UTC] vyfix at yahoo dot co dot jp
Please see test script.

Test script:

const XML = '<?xml version="1.0" encoding="UTF-8"?>

$dom = new DOMDocument();


$dom->encoding = 'SHIFT_JIS';

$test = $xml = $dom->saveXML();

echo $test;

Expected result:
<?xml version="1.0" encoding="SHIFT_JIS"?>

Actual result:
<?xml version="1.0" encoding="SHIFT_JIS"?>


 [2019-12-19 12:22 UTC] drtechno at mail dot com
You should be using mb_convert_encoding:

/* Convert UTF-8 to SHIFT_JIS */
$str = mb_convert_encoding($str, "SJIS", "UTF-8");

Just changing the encoding attribute in the file will not change the encoding of the data itself. Because the binary encoding isn't changed, a backslash character appears and some of the data gets escaped.
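
For readers unfamiliar with the function, here is a minimal sketch of the byte-level conversion this comment refers to. It assumes the mbstring extension is loaded; the sample text is made up for illustration and is not taken from the report. It only demonstrates mb_convert_encoding itself and does not show whether the conversion is needed alongside DOMDocument.

```php
<?php
// Illustrative sample text (an assumption, not from the report): three kanji,
// written as Unicode escapes so the source file's own encoding does not matter.
$utf8 = "\u{65E5}\u{672C}\u{8A9E}";

// Convert the bytes from UTF-8 to Shift_JIS.
$sjis = mb_convert_encoding($utf8, "SJIS", "UTF-8");

echo strlen($utf8), "\n";   // byte length as UTF-8
echo strlen($sjis), "\n";   // byte length as Shift_JIS

// Convert back to UTF-8 to check that the round trip is lossless for this text.
$back = mb_convert_encoding($sjis, "UTF-8", "SJIS");
var_dump($back === $utf8);
```

The byte lengths differ because the same characters occupy three bytes each in UTF-8 and two bytes each in Shift_JIS.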
 [2020-02-28 23:34 UTC]
Could you please add a simpler XML example? The one you gave is a bit unreadable. Would a single tag with much smaller text content suffice?
 [2020-02-29 10:13 UTC]
I can reproduce the reported behavior with PHP 7.3 on Windows
(libxml 2.9.10), but not on Linux (libxml 2.9.4).  This *might* be
an upstream issue.
 [2020-04-30 05:44 UTC] vyfix at yahoo dot co dot jp
A shorter script:

My output on both Windows and Linux is:

>php test.php
            Original size -> 100000
                    BIG-5 -> 4051
                SHIFT-JIS -> 4047
                SHIFT_JIS -> 4047
                Shift_JIS -> 4047
                    UTF-8 -> 100000
                    ASCII -> 100000
               ISO-8859-1 -> 100000
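
The shorter script itself did not survive in this thread. As a rough idea of how such a size comparison could be produced, here is a hypothetical sketch; it is not the reporter's code. The function name payloadSizes, the element name root, the payload character and the way the size is measured are all assumptions. It serializes the same document under each declared encoding and counts how many payload characters reach the output.

```php
<?php
// Hypothetical reconstruction, NOT the reporter's script.
// Counts how many payload characters survive saveXML() for each encoding name.
function payloadSizes(int $length, array $encodings): array
{
    // 'z' is a single byte in every encoding listed and does not occur in the
    // XML declaration or in the tag name, so counting it measures the payload.
    $text = str_repeat('z', $length);
    $sizes = ['Original size' => strlen($text)];

    foreach ($encodings as $encoding) {
        $dom = new DOMDocument();
        $dom->appendChild($dom->createElement('root', $text));
        try {
            // An encoding name libxml does not know may throw or only warn,
            // depending on the PHP version.
            $dom->encoding = $encoding;
        } catch (Throwable $e) {
            $sizes[$encoding] = null;
            continue;
        }
        $xml = $dom->saveXML();
        $sizes[$encoding] = $xml === false ? null : substr_count($xml, 'z');
    }
    return $sizes;
}

$names = ['BIG-5', 'SHIFT-JIS', 'SHIFT_JIS', 'Shift_JIS', 'UTF-8', 'ASCII', 'ISO-8859-1'];
foreach (payloadSizes(100000, $names) as $label => $size) {
    printf("%25s -> %s\n", $label, $size ?? 'n/a');
}
```

On a libxml2 build without the regression, every supported encoding would be expected to report the full payload size; a smaller number means the serialized output was cut short.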
 [2020-06-15 14:49 UTC]
-Status: Open
+Status: Not a bug
-Assigned To:
+Assigned To: cmb
 [2020-06-15 14:49 UTC]
Okay, I tracked that regression down to commit 407b393[1] in
libxml2, so this doesn't look like a bug in PHP.

[1] <>
 [2020-06-16 07:13 UTC]
Upstream bug report: <>.