Bug #46496 [PATCH] wddx_serialize_value() treats input as ISO-8859-1
Submitted: 2008-11-05 17:30 UTC Modified: 2008-11-19 23:52 UTC
From: mark at hell dot ne dot jp Assigned:
Status: Closed Package: WDDX related
PHP Version: 5.2.6 OS: Linux
Private report: No CVE-ID: None
 [2008-11-05 17:30 UTC] mark at hell dot ne dot jp
Description:
------------
As noted on this manual page:

http://fr.php.net/manual/en/ref.wddx.php

wddx_serialize_value() always treats input as ISO-8859-1.

This behaviour changed in PHP 5.2.5 and has caused a few bugs on our side (and it seems we are not the only ones affected).

For now the workaround is to use utf8_decode() on the resulting XML string.

Reproduce code:
---------------
<?php
header("Content-Type: text/xml; charset=utf-8");
echo wddx_serialize_value("&#50504;&#45397; &#54616;&#49464;&#50836;");
?>

Expected result:
----------------
<wddxPacket version='1.0'><header/><data><string>&#50504;&#45397; &#54616;&#49464;&#50836;</string></data></wddxPacket>


Actual result:
--------------
<wddxPacket version='1.0'><header/><data><string>?&#149;&#136;?&#133;&#149; ?&#149;&#152;?&#132;??&#154;&#148;</string></data></wddxPacket>
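
The garbled output is consistent with UTF-8 input being read byte by byte as ISO-8859-1 and then re-encoded as UTF-8: the first character of the test string (U+C548) is the bytes EC 95 88 in UTF-8, and 0x95 and 0x88 are the 149 and 136 that appear in the actual result. The following is a minimal sketch of that effect and of the decode workaround. It assumes the mbstring extension is available and emulates the conversion with mb_convert_encoding() rather than calling the wddx extension itself, so it illustrates the symptom, not the extension's actual code path.

```php
<?php
// Sketch only: emulates the mis-conversion with mbstring instead of
// calling wddx_serialize_value().

// First two characters of the test string, already valid UTF-8 (6 bytes).
$utf8 = "\u{C548}\u{B155}";

// Assumed behaviour of the affected versions: every input byte is read as
// an ISO-8859-1 character and re-encoded as UTF-8 (double encoding).
$doubleEncoded = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

// The workaround from the report: map the result back to single bytes,
// which is the same conversion utf8_decode() performs.
$restored = mb_convert_encoding($doubleEncoded, 'ISO-8859-1', 'UTF-8');

var_dump(strlen($utf8));          // int(6)
var_dump(strlen($doubleEncoded)); // int(12)
var_dump($restored === $utf8);    // bool(true)
```

Every byte of the original string is above 0x7F, so each one grows to two bytes when double encoded, and the reverse conversion recovers the original bytes exactly.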


 [2008-11-06 01:14 UTC] jani@php.net
http://fi.php.net/manual/en/wddx.examples.php

"Note: If you want to serialize non-ASCII characters you have to 
convert your data to UTF-8 first"
 [2008-11-06 04:36 UTC] mark at hell dot ne dot jp
Hello,

I'd like this bug to be reopened.

The &#xxx; sequences are due to the PHP bug tracker's inability to display Unicode characters. My report was originally written with a Korean string.

Here's a screenshot of a UTF-8 terminal running the same test:

http://beta.magicaltux.net/php5_bug_utf8_terminal.png
 [2008-11-06 04:54 UTC] mark at hell dot ne dot jp
Tested and reproduced with PHP 5.2.7rc2
 [2008-11-06 05:00 UTC] mark at hell dot ne dot jp
Here's a patch against PHP 5.2.7rc2 to fix this issue.

The real problem is that WDDX always assumes its input is ISO-8859-1. This is inconsistent with PHP < 5.2.5, with the XML API, and with the documentation.

http://ookoo.org/svn/snip/php-5.2.7rc2_wddx_utf8_resolved.patch
 [2008-11-07 05:43 UTC] mark at hell dot ne dot jp
Here is an updated version of the patch, with a test fixed (the test for bug #37569 depended on the wddx_* functions accepting ISO-8859-1).

http://ookoo.org/svn/snip/php-5.2.7rc3_wddx_utf8_resolved.patch

It compiles cleanly, make test reports no failures, and the test for bug #37569 now also covers this bug (i.e. it checks that the wddx_* functions handle UTF-8 as expected).
 [2008-11-19 17:04 UTC] pajoye@php.net
Reopened to note that the patch has been committed to 5.2 (reviewed by Andrei). MFB will follow shortly.
 [2008-11-19 23:52 UTC] iliaa@php.net
This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.


 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Thu Nov 21 08:01:29 2024 UTC