Bug #60142 XSLTProcessor::transformToXML does not output UTF-8
Submitted: 2011-10-26 16:11 UTC
Modified: 2013-12-02 14:19 UTC
Votes: 4
Avg. Score: 4.2 ± 0.8
Reproduced: 2 of 3 (66.7%)
Same Version: 2 (100.0%)
Same OS: 2 (100.0%)
From: maximeraoust at gmail dot com
Assigned:
Status: Not a bug
Package: XSLT related
PHP Version: 5.3.8
OS: any
Private report: No
CVE-ID: None
 [2011-10-26 16:11 UTC] maximeraoust at gmail dot com
Description:
------------
XSLTProcessor::transformToXML never seems to output UTF-8 data.

This bug was already reported (https://bugs.php.net/bug.php?id=36415 & 
https://bugs.php.net/bug.php?id=36407&edit=2), but not enough information was 
provided and the reports were closed. A complete test is attached here.

I'm running PHP 5.3.8 VC9 + Apache 2.2.21 VC9 on Windows 7 Professional SP1.

I also tested on 5.3.2 under Ubuntu 10.04.3 and it works fine.



Test script:
---------------
<?php
$xslt = '<?xml version="1.0" encoding="utf-8"?><xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"><xsl:output method="text" /></xsl:stylesheet>';
$inputText = 'UTF-8 text with accents: éèàç';

$doc = new DOMDocument('1.0', 'UTF-8');
$doc->loadHTML($inputText);

$xsl = new DOMDocument('1.0', 'UTF-8');
$xsl->loadXML($xslt);

$proc = new XSLTProcessor();
$proc->importStylesheet($xsl);

echo $proc->transformToDoc($doc)->saveXML();
?>

Expected result:
----------------
It should output (in CLI):

<?xml version="1.0"?>
UTF-8 text with accents: &#xE9;&#xE8;&#xE0;&#xE7;

Or if displayed in browser:

UTF-8 text with accents: éèàç

Actual result:
--------------
It outputs:

<?xml version="1.0"?>
UTF-8 text with accents: &#xC3;&#xA9;&#xC3;&#xA8;&#xC3;&#xA0;&#xC3;&#xA7;

Or if displayed in browser:

UTF-8 text with accents: Ã©Ã¨Ã Ã§
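
The entities in the actual result are consistent with each UTF-8 byte of the input being read as a separate ISO-8859-1 character: é is the byte pair C3 A9 in UTF-8, which matches &#xC3;&#xA9;. A minimal sketch of that misreading, assuming the mbstring extension is available (variable names are illustrative only and not part of the report):

```php
<?php
// "é" encoded as UTF-8 is the two bytes 0xC3 0xA9.
$utf8 = "\xC3\xA9";

// Treat those bytes as ISO-8859-1 and re-encode them as UTF-8.
// Each input byte becomes its own character: U+00C3 and U+00A9.
// Assumption: mbstring is installed.
$misread = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($misread), "\n";                 // c383c2a9
echo mb_strlen($misread, 'UTF-8'), "\n";      // 2
```

The two resulting code points, U+00C3 and U+00A9, are the same values that appear as numeric character references in the actual result.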

 [2011-10-26 16:13 UTC] maximeraoust at gmail dot com
In the test script, you can also put:

echo $proc->transformToXml(); to get the output, same result
 [2011-10-26 16:16 UTC] maximeraoust at gmail dot com
Sorry, I obviously meant:

echo $proc->transformToXml($doc);
 [2011-10-26 16:32 UTC] maximeraoust at gmail dot com
Fixed title with "Windows"
 [2011-10-26 16:32 UTC] maximeraoust at gmail dot com
-Summary: XSLTProcessor::transformToXML does not output UTF-8
+Summary: XSLTProcessor::transformToXML does not output UTF-8 on Windows
 [2013-12-02 14:19 UTC] mike@php.net
-Summary: XSLTProcessor::transformToXML does not output UTF-8 on Windows
+Summary: XSLTProcessor::transformToXML does not output UTF-8
-Status: Open
+Status: Not a bug
-Operating System: Windows
+Operating System: any
 [2013-12-02 14:19 UTC] mike@php.net
loadHTML() needs a proper META content-type in the input for any encoding to be recognized:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
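
A minimal sketch of how that advice could be applied to the test script, assuming it is acceptable to prepend the META tag to the input string before calling loadHTML(). The $meta and $result variables are introduced here for illustration and are not part of the original report:

```php
<?php
$xslt = '<?xml version="1.0" encoding="utf-8"?><xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"><xsl:output method="text" /></xsl:stylesheet>';
$inputText = 'UTF-8 text with accents: éèàç';

// Hypothetical addition: declare the charset so the HTML parser
// reads the input bytes as UTF-8.
$meta = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';

$doc = new DOMDocument('1.0', 'UTF-8');
$doc->loadHTML($meta . $inputText);

$xsl = new DOMDocument('1.0', 'UTF-8');
$xsl->loadXML($xslt);

$proc = new XSLTProcessor();
$proc->importStylesheet($xsl);

// transformToXml() returns the transformation result as a string.
$result = $proc->transformToXml($doc);
echo $result;
```

With the charset declared, the accented characters should pass through the transformation as UTF-8 rather than as pairs of ISO-8859-1 characters.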
 