[2004-11-05 15:34 UTC] derick@php.net
Description:
------------
When converting my pages to the PHP5 SAX XML parser, they broke because of an apparent incompatibility. The character data handler is called in a different pattern than in PHP4. Before, it seemed to be called once per character block. Now, it seems, the buffer is flushed before each block of high-bit characters. This is unexpected and (seemingly?) impossible to change.

Reproduce code:
---------------
<?php
function es() {}
function ee() {}
function cd($P, $D) {print "[$D]\n";}

# $str = "UTF:æøå:UTF"; $strenc = "utf-8";
$str = "ISO:???:ISO"; $strenc = "iso-8859-1";
$buffer = "<?xml version=\"1.0\" encoding=\"$strenc\"?><global>$str</global>";

$xml_parser = xml_parser_create();
# xml_set_element_handler($xml_parser, "es", "ee");
xml_set_character_data_handler($xml_parser, "cd");
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, true);
xml_parser_set_option($xml_parser, XML_OPTION_TARGET_ENCODING, "iso-8859-1");

if (xml_parse($xml_parser, $buffer) == false)
    die(sprintf("TV import error: %s at line %d col %d\n%s",
        xml_error_string(xml_get_error_code($xml_parser)),
        xml_get_current_line_number($xml_parser),
        xml_get_current_column_number($xml_parser),
        $buffer));
xml_parser_free($xml_parser);
?>

Expected result:
----------------
expected: [ISO:???:ISO]
php4:     [ISO:???:ISO]

Actual result:
--------------
[ISO:]
[???:ISO]
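
A possible workaround, sketched under assumptions rather than taken from the report: since the character data handler may fire several times for one text node, the handler can append each chunk to a buffer, and the end-element handler can treat the buffer as the complete text. The variable names ($chunks, $text, $elements) are hypothetical, closures as handlers require PHP 5.3 or later, and the input is the UTF-8 variant from the commented-out line of the reproduce code, written with byte escapes so the example does not depend on the encoding of the source file.

```php
<?php
// Minimal sketch: buffer character data across handler calls.
// All variable names here are illustrative, not from the report.

$chunks   = array(); // every raw chunk passed to the character data handler
$text     = '';      // buffer for the element currently being parsed
$elements = array(); // complete text of each closed element

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');

xml_set_element_handler(
    $parser,
    // Start tag: begin with an empty buffer.
    function ($p, $name, $attrs) use (&$text) {
        $text = '';
    },
    // End tag: the buffer now holds the whole text node.
    function ($p, $name) use (&$text, &$elements) {
        $elements[] = $text;
        $text = '';
    }
);

// May be called once or many times per text node; append, never overwrite.
xml_set_character_data_handler(
    $parser,
    function ($p, $data) use (&$text, &$chunks) {
        $chunks[] = $data;
        $text .= $data;
    }
);

// "UTF:æøå:UTF" with the high-bit characters as UTF-8 byte escapes.
$str = "UTF:\xC3\xA6\xC3\xB8\xC3\xA5:UTF";
$buffer = "<?xml version=\"1.0\" encoding=\"utf-8\"?><global>$str</global>";

if (!xml_parse($parser, $buffer, true)) {
    throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
}

printf("handler calls: %d\n", count($chunks));
printf("joined text:   [%s]\n", $elements[0]);
```

The number of handler calls can differ between parser backends and versions, so only the joined text should be relied on. Flat buffering like this suits leaf elements such as <global>; mixed content with nested elements would need a buffer per nesting level.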