Bug #40762 xml_parser failed to parse mixed coding file
Submitted: 2007-03-08 21:56 UTC Modified: 2007-03-09 08:05 UTC
Avg. Score:5.0 ± 0.0
Reproduced:0 of 0 (0.0%)
From: forward at hongyu dot org Assigned:
Status: Not a bug Package: *XML functions
PHP Version: 5.2.1 OS: Linux and Windows
Private report: No CVE-ID: None
 [2007-03-08 21:56 UTC] forward at hongyu dot org
My RSS parser failed after I upgraded PHP on my server from 4.x to 5.2. While debugging the code, I found that the error was caused by xml_parse() failing to parse the UTF-8 encoded RSS message, which was originally converted from a GB18030 string.

The error message looks like: 
"Warning: xml_parse() [function.xml-parse]: input conversion failed due to input error, bytes 0x9B 0xE6 ..."

The original GB-encoded string consists of Chinese characters, and I converted it to UTF-8 with iconv(). The converted string displays correctly in web browsers, so the conversion itself appears to be correct. I therefore believe the failure comes only from xml_parse().

For your testing purpose, an example of the original GB18030 string can be downloaded at


Reproduce code:
// variable $gb contains the GB encoded string, e.g., from
// web address

// variable $utf contains the UTF-8 string converted from 
// the original GB encoded string

        $utf = iconv('GB18030','UTF-8', $gb);

// functions feed_start_element, feed_end_element etc. are from
// the package Magpierss

        xml_set_object( $parser, $this );
        xml_set_element_handler( $parser,
                'feed_start_element', 'feed_end_element' );
        xml_set_character_data_handler( $parser, 'feed_cdata' ); 
        $status = xml_parse( $parser, $utf );

Expected result:
No error message.

Actual result:
Warning: xml_parse() [function.xml-parse]: input conversion failed due to input error, bytes 0x9B 0xE6 ...


 [2007-03-09 06:11 UTC]
You also have to change the encoding declared in this line of the XML, since it still says gb2312 after the conversion to UTF-8:

<?xml version="1.0" encoding="gb2312" ?>
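
A minimal sketch of that change, assuming the conversion is done in PHP as in the reproduce code. The helper name convert_xml_to_utf8 is hypothetical (it is not part of PHP or Magpierss); it converts the bytes with iconv() and then rewrites the encoding named in the XML declaration so that the declaration matches the actual bytes.

```php
<?php
// Hypothetical helper, for illustration only: convert an XML document to
// UTF-8 and make the encoding named in its XML declaration agree.
function convert_xml_to_utf8($xml, $sourceEncoding)
{
    // Convert the bytes; iconv() returns false on failure.
    $utf = iconv($sourceEncoding, 'UTF-8', $xml);
    if ($utf === false) {
        throw new RuntimeException("iconv() could not convert from $sourceEncoding");
    }

    // Rewrite the encoding value in the XML declaration, if one is present.
    // A document without a declaration is returned unchanged.
    return preg_replace(
        '/^(<\?xml\b[^>]*?\bencoding\s*=\s*)(["\'])[^"\']*\2/i',
        '${1}${2}UTF-8${2}',
        $utf,
        1
    );
}

// Sample input with ASCII-only content, so no GB bytes are needed here.
$gb  = '<?xml version="1.0" encoding="gb2312" ?><rss><title>test</title></rss>';
$utf = convert_xml_to_utf8($gb, 'GB18030');
```

The resulting $utf can then be passed to xml_parse() as in the reproduce code. The sketch assumes the source encoding is ASCII-compatible (as GB2312 and GB18030 are), so the declaration is still readable as plain text after the conversion.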

 [2007-03-09 08:05 UTC] forward at hongyu dot org
Exactly what you said. Thanks a lot!