Bug #45314 wddx_serialize_value() does not handle unicode properly
Submitted: 2008-06-19 12:48 UTC Modified: 2008-11-20 04:55 UTC
From: mikx at mikx dot de Assigned:
Status: Not a bug Package: WDDX related
PHP Version: 5.2.6 OS: *
Private report: No CVE-ID: None
 [2008-06-19 12:48 UTC] mikx at mikx dot de
Description:
------------
wddx_serialize_value() does not handle Unicode characters as expected in PHP 5.2.x or PHP 5.3.x. The function behaved differently in PHP 5.1.6.

When serializing a string or a more complex object, any Unicode characters are UTF-8 encoded a second time instead of being passed through as is.


Reproduce code:
---------------
<a href="?utf8=%E9%A1%B5">Demo</a> (a Chinese character)
<hr>
<form method="get" action="?">
<input type="text" name="utf8">
<input type="submit">
</form>
<hr>
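<?php
if (isset($_GET["utf8"])) {
    // 1. The raw query parameter, exactly as received
    echo $_GET["utf8"]."<br>";
    // 2. The same value passed through utf8_encode()
    echo utf8_encode($_GET["utf8"])."<br>";
    // 3. The same value serialized as a WDDX packet
    echo wddx_serialize_value($_GET["utf8"])."<br>";
}
?>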

Expected result:
----------------
The demo code is a small script that outputs the given query parameter "utf8" in three ways:

1. Directly as received
2. Passed through utf8_encode()
3. Serialized via wddx_serialize_value()

In <= 5.1.6 the resulting WDDX contained the UTF-8 characters exactly as given. In >= 5.2.0 the string gets UTF-8 encoded again, just as if you had called utf8_encode() on it.
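
To make the double encoding concrete, here is a minimal sketch of what happens to the bytes of the demo character (%E9%A1%B5). The helper latin1_to_utf8() is a hypothetical name, not part of PHP; it is assumed to mirror what utf8_encode() does, namely treat every input byte as an ISO-8859-1 character and re-encode it as UTF-8. It is written by hand so that it does not depend on any extension.

```php
<?php
// Hypothetical helper (not a PHP built-in): treats each input byte as an
// ISO-8859-1 character and re-encodes it as UTF-8, as utf8_encode() does.
function latin1_to_utf8($bytes)
{
    $out = '';
    for ($i = 0, $n = strlen($bytes); $i < $n; $i++) {
        $b = ord($bytes[$i]);
        if ($b < 0x80) {
            // ASCII bytes are identical in both encodings
            $out .= chr($b);
        } else {
            // U+0080..U+00FF become a two-byte UTF-8 sequence
            $out .= chr(0xC0 | ($b >> 6)) . chr(0x80 | ($b & 0x3F));
        }
    }
    return $out;
}

$original = "\xE9\xA1\xB5"; // the demo character, already UTF-8 encoded

echo bin2hex($original), "\n";                 // e9a1b5
echo bin2hex(latin1_to_utf8($original)), "\n"; // c3a9c2a1c2b5
```

The three original bytes turn into six, which is the kind of output the report describes for >= 5.2.0.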



Actual result:
--------------
While the new behavior might make sense for data going forward (although I am not sure what the behavior expected by the WDDX spec is), this breaks backward compatibility with old data.

As we have millions of database rows of Unicode WDDX data, this is a huge issue (at least to us).

Can you please clarify if this is a bug, the expected behavior going forward and how to deal with backward compatibility issues (maybe an additional parameter to control the behavior)?
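
One possible userland workaround for strings that have already been encoded twice is sketched below. It is an assumption on my side, not something WDDX or PHP provides: undo_double_utf8() is a hypothetical helper that reverses one ISO-8859-1 to UTF-8 pass, and only keeps the result if it is itself well-formed UTF-8. The heuristic can misfire on text that legitimately consists of characters such as "Ã" followed by "©", so it should be checked against real data before being applied to stored rows.

```php
<?php
// Hypothetical helper (not a PHP built-in): undoes one round of
// ISO-8859-1 -> UTF-8 encoding when the input looks double encoded.
function undo_double_utf8($s)
{
    // Output of such a pass contains only ASCII bytes and two-byte
    // sequences for U+0080..U+00FF (lead byte C2 or C3).
    if (!preg_match('/\A(?:[\x00-\x7F]|[\xC2\xC3][\x80-\xBF])*+\z/', $s)) {
        return $s;
    }

    // Collapse every two-byte sequence back into the single original byte
    $decoded = preg_replace_callback(
        '/([\xC2\xC3])([\x80-\xBF])/',
        function ($m) {
            return chr(((ord($m[1]) & 0x03) << 6) | (ord($m[2]) & 0x3F));
        },
        $s
    );

    // Keep the result only if it is valid UTF-8; otherwise the input was
    // most likely ordinary, singly encoded text and is returned unchanged.
    return preg_match('//u', $decoded) ? $decoded : $s;
}

echo bin2hex(undo_double_utf8("\xC3\xA9\xC2\xA1\xC2\xB5")), "\n"; // e9a1b5
echo bin2hex(undo_double_utf8("\xC3\xA9")), "\n";                 // c3a9
```

The second call shows the safety check: a singly encoded "é" would decode to a lone 0xE9 byte, which is not valid UTF-8, so the input is left alone.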

This might be related to bug #41722

 [2008-07-22 22:50 UTC] jani@php.net
I guess (!) the fix for bug #37571 caused this problem.
 [2008-11-16 10:20 UTC] mark at hell dot ne dot jp
This bug is related to bug #46496.

Bug #37571 indeed seems to be at the origin of the problem.
 [2008-11-20 04:55 UTC] magicaltux@php.net
This problem has been resolved with bug #46496 in CVS, and will not appear in the next versions of PHP.

Thanks for your interest in PHP.