Bug #60142 XSLTProcessor::transformToXML does not output UTF-8
Submitted: 2011-10-26 16:11 UTC
Modified: 2013-12-02 14:19 UTC
Votes: 4
Avg. Score: 4.2 ± 0.8
Reproduced: 2 of 3 (66.7%)
Same Version: 2 (100.0%)
Same OS: 2 (100.0%)
From: maximeraoust at gmail dot com
Assigned:
Status: Not a bug
Package: XSLT related
PHP Version: 5.3.8
OS: any
Private report: No
CVE-ID: None
 [2011-10-26 16:11 UTC] maximeraoust at gmail dot com
Description:
------------
XSLTProcessor::transformToXML never seems to output UTF-8 data.

This bug was already reported (https://bugs.php.net/bug.php?id=36415 & 
https://bugs.php.net/bug.php?id=36407&edit=2), but not enough information was 
provided and the reports were closed. A complete test is attached here.

I'm running PHP 5.3.8 VC9 + Apache 2.2.21 VC9 on Windows 7 Professional SP1.

I also tested on 5.3.2 under Ubuntu 10.04.3 and it works fine.



Test script:
---------------
<?php
$xslt = '<?xml version="1.0" encoding="utf-8"?><xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"><xsl:output method="text" /></xsl:stylesheet>';
$inputText = 'UTF-8 text with accents: éèàç';

$doc = new DOMDocument('1.0', 'UTF-8');
$doc->loadHTML($inputText);

$xsl = new DOMDocument('1.0', 'UTF-8');
$xsl->loadXML($xslt);

$proc = new XSLTProcessor();
$proc->importStylesheet($xsl);

echo $proc->transformToDoc($doc)->saveXML();
?>

Expected result:
----------------
It should output (in CLI):

<?xml version="1.0"?>
UTF-8 text with accents: &#xE9;&#xE8;&#xE0;&#xE7;

Or if displayed in browser:

UTF-8 text with accents: éèàç

Actual result:
--------------
It outputs:

<?xml version="1.0"?>
UTF-8 text with accents: &#xC3;&#xA9;&#xC3;&#xA8;&#xC3;&#xA0;&#xC3;&#xA7;

Or if displayed in browser:

UTF-8 text with accents: Ã©Ã¨Ã Ã§
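
The entities in the actual result (&#xC3;&#xA9; and so on) are what one would expect if each UTF-8 byte of the input were read as a separate ISO-8859-1 character. A minimal sketch of that effect for a single character, assuming the mbstring extension is available and the script file itself is saved as UTF-8 (the variable names are illustrative only):

```php
<?php
// 'é' is stored as two bytes in UTF-8: 0xC3 0xA9.
$utf8 = 'é';
$hex = bin2hex($utf8);

// Treat those two bytes as ISO-8859-1 characters and re-encode them as UTF-8.
// This mimics a parser that was not told the input is UTF-8.
$misread = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo $hex, "\n";      // c3a9
echo $misread, "\n";  // Ã©
```

The two resulting characters, U+00C3 and U+00A9, match the first pair of entities in the actual result.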

History

 [2011-10-26 16:13 UTC] maximeraoust at gmail dot com
In the test script, you can also use:

echo $proc->transformToXml();

to get the output; the result is the same.
 [2011-10-26 16:16 UTC] maximeraoust at gmail dot com
Sorry, I obviously meant:

echo $proc->transformToXml($doc);
 [2011-10-26 16:32 UTC] maximeraoust at gmail dot com
Fixed title with "Windows"
 [2011-10-26 16:32 UTC] maximeraoust at gmail dot com
-Summary: XSLTProcessor::transformToXML does not output UTF-8
+Summary: XSLTProcessor::transformToXML does not output UTF-8 on Windows
 [2013-12-02 14:19 UTC] mike@php.net
-Summary: XSLTProcessor::transformToXML does not output UTF-8 on Windows
+Summary: XSLTProcessor::transformToXML does not output UTF-8
-Status: Open
+Status: Not a bug
-Operating System: Windows
+Operating System: any
 [2013-12-02 14:19 UTC] mike@php.net
You need a proper META content-type for any encoding to be recognized:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
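
Applied to the test script from the report, that advice could look like the sketch below. It assumes the dom and xsl extensions are loaded and that the file is saved as UTF-8; the $meta and $result variables are introduced here for illustration and are not part of the original script.

```php
<?php
// Same stylesheet and input text as in the original test script.
$xslt = '<?xml version="1.0" encoding="utf-8"?><xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"><xsl:output method="text" /></xsl:stylesheet>';
$inputText = 'UTF-8 text with accents: éèàç';

// Assumption: prefixing the fragment with a META content-type lets the
// HTML parser pick up the charset instead of guessing it.
$meta = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';

$doc = new DOMDocument();
$doc->loadHTML($meta . $inputText);

$xsl = new DOMDocument();
$xsl->loadXML($xslt);

$proc = new XSLTProcessor();
$proc->importStylesheet($xsl);

// With method="text", the accented characters should come through as UTF-8.
$result = $proc->transformToXml($doc);
echo $result;
```

The encoding passed to the DOMDocument constructor in the original script is not consulted by loadHTML() when parsing, which is why the declaration has to be part of the markup itself.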
 