Bug #37289 wddx_deserialize fail with iso-8859-1
Submitted: 2006-05-03 11:58 UTC Modified: 2006-05-03 13:23 UTC
From: philippe dot louys at arc-intl dot com Assigned:
Status: Not a bug Package: XML related
PHP Version: 5.1.3 OS: HP-UX 11
Private report: No CVE-ID: None
 [2006-05-03 11:58 UTC] philippe dot louys at arc-intl dot com
Description:
------------
wddx_deserialize() returns NULL when ISO-8859-1 encoding is used and the packet contains a non-ASCII character (e.g. ?).

PHP5 is built with libxml.
With PHP4 and Expat everything works.

I think the problem comes from libxml.

I wrote a script which calls the parser directly and saw that
the character data callback is called once in PHP4 with the entire string, and twice in PHP5.
The first call receives the first part of the string, up to but excluding the first accented character.
The second call receives the remainder of the string.

Ex : aaaabbbb????ccccddd

PHP4 : data callback function called with aaaabbbb????ccccddd

PHP5 : data callback function called with
aaaabbbb
then a second time with
????ccccddd



Reproduce code:
---------------
<?php
$data = "<" . "?" . "xml version=\"1.0\" encoding=\"ISO-8859-1\"" . "?" . ">\n";
$data .= "<data><struct><var name=\"screen\"><string>aaabbb????cccdddeee</string></var></struct></data>";

$depth = array();
function beginElement($parser, $name, $attrs)
{
    global $depth;
    $depth[(int) $parser]++;
}
function endElement($parser, $name)
{
    global $depth;
    $depth[(int) $parser]--;
}
function getDatas($parser, $data)
{
    echo $data . "<br>";
}
$xml_parser = xml_parser_create("ISO-8859-1");
xml_set_element_handler($xml_parser, "beginElement", "endElement");
xml_set_character_data_handler($xml_parser, "getDatas");

if (!xml_parse($xml_parser, $data, TRUE)) {
    die(sprintf("XML error : %s at line %d",
                xml_error_string(xml_get_error_code($xml_parser)),
                xml_get_current_line_number($xml_parser)));
}
xml_parser_free($xml_parser);
?>

Expected result:
----------------
aaaabbbb????ccccddd

Actual result:
--------------
aaaabbbb
????ccccddd


 [2006-05-03 13:22 UTC] iliaa@php.net
It could be a problem with your libxml, since when I tested
your wddx string the deserialize function parsed it properly,
creating an array with key screen and value of
aaabbb????cccdddeee
 [2006-05-03 13:23 UTC] chregu@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

Please read the docs. Text nodes are not guaranteed to come in
one callback.
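
Since a text node may arrive in several pieces, a handler that needs the whole string has to buffer the pieces itself. The following is only a minimal sketch of that idea, not code from this report: the function name collectText() and the variables $buffer, $texts and $packet are hypothetical, the accented characters are written as &#233; character references so the source stays ASCII, and it assumes PHP 7 or later with the xml extension loaded. The buffer is cleared whenever an element opens and stored when the element closes, so it does not matter how many times the data callback fires in between.

```php
<?php
// Hypothetical helper: gathers the text of each element, however many
// times the character data handler is invoked for it.
function collectText(string $xml): array
{
    $buffer = '';   // text seen so far inside the current element
    $texts  = [];   // completed text, grouped by element name

    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    // Deliver character data to the handlers as UTF-8.
    xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');

    xml_set_element_handler(
        $parser,
        // Start tag: begin a fresh buffer.
        function ($p, $name, $attrs) use (&$buffer) {
            $buffer = '';
        },
        // End tag: the buffer now holds the complete text, if any.
        function ($p, $name) use (&$buffer, &$texts) {
            if ($buffer !== '') {
                $texts[$name][] = $buffer;
            }
            $buffer = '';
        }
    );

    // May be called once or several times per text node: always append.
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$buffer) {
            $buffer .= $data;
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(sprintf(
            'XML error: %s at line %d',
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)
        ));
    }

    return $texts;
}

$packet = '<?xml version="1.0" encoding="ISO-8859-1"?>'
        . '<data><struct><var name="screen">'
        . '<string>aaabbb&#233;&#233;cccddd</string>'
        . '</var></struct></data>';

$texts = collectText($packet);
echo $texts['string'][0], "\n";
```

With this approach the result should be the same whether the underlying library is Expat or libxml, because the code no longer depends on where the parser chooses to split the text.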
 