[2007-07-11 13:08 UTC] jani@php.net
[2007-07-16 08:40 UTC] grzegorz dot nosek at netart dot pl
[2008-07-17 00:55 UTC] jani@php.net
[2008-07-24 01:00 UTC] php-bugs at lists dot php dot net
Description:
------------
wddx gets confused if you try to serialize a string with utf8 characters (like the one below, it contains 'z with dot above', 'o acute', 'l stroke' and 'w' - in case it gets messed up somehow).

serialized string will contain <char code='C5'/><char code='BC'/> ... etc, which will get fed into xml_utf8_decode byte by byte (after decoding the hex value), totally wrecking the output.

Hackish patch which fixes this (whitespace mangled mercilessly, adjust to taste):

--- a/ext/wddx/wddx.c
+++ b/ext/wddx/wddx.c
@@ -1038,8 +1038,13 @@ static void php_wddx_process_data(void *
     if (!wddx_stack_is_empty(stack) && !stack->done) {
         wddx_stack_top(stack, (void**)&ent);
         switch (Z_TYPE_P(ent)) {
-            case ST_STRING:
-                decoded = xml_utf8_decode(s, len, &decoded_len, "ISO-8859-1");
+            case ST_STRING:
+                if (len > 1) {
+                    decoded = xml_utf8_decode(s, len, &decoded_len, "ISO-8859-1");
+                } else {
+                    decoded = estrndup(s, len);
+                    decoded_len = len;
+                }
                 if (Z_STRLEN_P(ent->data) == 0) {
                     Z_STRVAL_P(ent->data) = estrndup(decoded, decoded_len);

Reproduce code:
---------------
--TEST--
wddx_deserialize mangles utf8 characters
--SKIPIF--
<?php if (!extension_loaded("wddx")) print "skip"; ?>
--FILE--
<?php
$zolw = iconv("ISO-8859-2", "UTF-8", "żółw");
$in = array ( $zolw => $zolw );
var_dump(array_diff($in, wddx_deserialize(wddx_serialize_value($in))));
?>
--EXPECT--
array(0) {
}
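
To illustrate why handing a UTF-8 decoder one byte at a time cannot work, here is a minimal standalone sketch in C. It does not reproduce xml_utf8_decode or anything in ext/wddx; utf8_decode_one and check_split_sequence are hypothetical names made up for this example, and the decoder is deliberately simplified (it does not reject overlong forms or surrogates). It only shows that the pair C5 BC from the report decodes to U+017C ('z with dot above') when both bytes arrive together, and that neither byte is decodable on its own.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of PHP): decode one UTF-8 sequence from s,
 * looking at no more than len bytes. On success the code point is stored
 * in *cp and the number of bytes consumed is returned. Returns 0 if the
 * sequence is invalid or cut short.
 */
size_t utf8_decode_one(const unsigned char *s, size_t len, unsigned long *cp)
{
    size_t need, i;
    unsigned long c;

    if (len == 0) {
        return 0;
    }
    if (s[0] < 0x80) {                     /* plain ASCII, one byte */
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {    /* lead byte of a 2-byte sequence */
        need = 2; c = s[0] & 0x1F;
    } else if ((s[0] & 0xF0) == 0xE0) {    /* lead byte of a 3-byte sequence */
        need = 3; c = s[0] & 0x0F;
    } else if ((s[0] & 0xF8) == 0xF0) {    /* lead byte of a 4-byte sequence */
        need = 4; c = s[0] & 0x07;
    } else {
        return 0;                          /* stray continuation or invalid byte */
    }
    if (len < need) {
        return 0;                          /* sequence was split: bytes missing */
    }
    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80) {
            return 0;                      /* not a continuation byte */
        }
        c = (c << 6) | (s[i] & 0x3F);
    }
    *cp = c;
    return need;
}

/* The two bytes from the report, decoded together and then separately. */
void check_split_sequence(void)
{
    const unsigned char z[] = { 0xC5, 0xBC };
    unsigned long cp = 0;

    assert(utf8_decode_one(z, 2, &cp) == 2 && cp == 0x17C); /* whole pair */
    assert(utf8_decode_one(z, 1, &cp) == 0);      /* lead byte alone */
    assert(utf8_decode_one(z + 1, 1, &cp) == 0);  /* continuation byte alone */
}
```

Whatever a real decoder chooses to do with an incomplete sequence (substitute a placeholder, pass the byte through, or read past the buffer), the original character is already lost by that point, which is why the bytes have to be reassembled before decoding.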