 [2008-07-13 15:13 UTC] jani@php.net
  [2008-07-21 01:00 UTC] php-bugs at lists dot php dot net
Description:
------------
wddx gets confused if you try to serialize a string with utf8 characters (like the one below, it contains 'z with dot above', 'o acute', 'l stroke' and 'w' - in case it gets messed up somehow).

serialized string will contain <char code='C5'/><char code='BC'/> ... etc, which will get fed into xml_utf8_decode byte by byte (after decoding the hex value), totally wrecking the output.

Up to this point, it's a duplicate of #38900. However, PHP5 has another bug with variable names (e.g. hash keys) containing UTF8 characters. It seems that the var name is converted down from UTF8 to ISO-8859-1, yielding question marks instead of characters outside latin1.

Another hackish patch (whitespace-mutilated):

--- a/ext/wddx/wddx.c
+++ b/ext/wddx/wddx.c
@@ -814,10 +814,7 @@ static void php_wddx_push_element(void *
  if (atts) for (i = 0; atts[i]; i++) {
   if (!strcmp(atts[i], EL_NAME) && atts[++i] && atts[i][0]) {
-   char *decoded;
-   int decoded_len;
-   decoded = xml_utf8_decode(atts[i], strlen(atts[i]), &decoded_len, "ISO-8859-1");
-   stack->varname = decoded;
+   stack->varname = estrndup(atts[i], strlen(atts[i]));
    break;
   }
  }
@@ -1057,7 +1054,12 @@ static void php_wddx_process_data(void *
  wddx_stack_top(stack, (void**)&ent);
  switch (Z_TYPE_P(ent)) {
   case ST_STRING:
-   decoded = xml_utf8_decode(s, len, &decoded_len, "ISO-8859-1");
+   if (len > 1) {
+    decoded = xml_utf8_decode(s, len, &decoded_len, "ISO-8859-1");
+   } else {
+    decoded = estrndup(s, len);
+    decoded_len = len;
+   }

Reproduce code:
---------------
See http://bugs.php.net/bug.php?id=38900
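
To illustrate why decoding byte by byte wrecks the output, here is a minimal standalone sketch. It does not use PHP's xml_utf8_decode; utf8_to_latin1 and append_chunk are hypothetical helpers written for this note, and the decoder is deliberately simplified (1- and 2-byte sequences only, '?' for anything truncated or outside latin1). It only mirrors the idea of the `len > 1` branch in the patch above.

```c
#include <string.h>

/* Hypothetical, simplified UTF-8 -> ISO-8859-1 decoder (NOT PHP's
 * xml_utf8_decode). Truncated, invalid or non-latin1 input becomes '?'.
 * Returns the number of bytes written to out. */
static size_t utf8_to_latin1(const unsigned char *in, size_t len,
                             unsigned char *out)
{
    size_t i = 0, n = 0;

    while (i < len) {
        unsigned int c = in[i];

        if (c < 0x80) {
            /* Plain ASCII byte, passes through unchanged. */
            i += 1;
        } else if (c >= 0xC0 && c < 0xE0 && i + 1 < len &&
                   (in[i + 1] & 0xC0) == 0x80) {
            /* Complete two-byte sequence: combine 5 + 6 payload bits. */
            c = ((c & 0x1F) << 6) | (in[i + 1] & 0x3F);
            i += 2;
        } else {
            /* Lone lead byte, stray continuation byte, or longer sequence. */
            c = '?';
            i += 1;
        }
        /* Code points above U+00FF do not exist in ISO-8859-1. */
        out[n++] = (c <= 0xFF) ? (unsigned char)c : (unsigned char)'?';
    }
    return n;
}

/* Mirrors the idea of the patched ST_STRING branch: a chunk of length 1
 * is copied verbatim, longer chunks still go through the decoder.
 * Returns the new number of bytes used in dst. */
static size_t append_chunk(unsigned char *dst, size_t used,
                           const unsigned char *chunk, size_t len)
{
    if (len > 1) {
        return used + utf8_to_latin1(chunk, len, dst + used);
    }
    memcpy(dst + used, chunk, len);
    return used + len;
}
```

In this sketch, passing the bytes C5 and BC to utf8_to_latin1 one at a time yields two '?' bytes, while passing them through append_chunk one at a time leaves C5 BC intact in the buffer. Chunks longer than one byte are still converted to latin1, so the result can mix encodings, which matches the author's own description of the patch as hackish.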