Bug #43347 Big5 RSS Feed crashes xml_parse() function
Submitted: 2007-11-20 14:58 UTC Modified: 2007-11-22 12:35 UTC
From: pschmandra at hotmail dot com Assigned:
Status: Not a bug Package: XML related
PHP Version: 5.1.6 OS: Red Hat Linux Enterprise 5
Private report: No CVE-ID: None
 [2007-11-20 14:58 UTC] pschmandra at hotmail dot com
Description:
------------
After upgrading to PHP 5, xml_parse() throws

Warning: xml_parse() [function.xml-parse]: input conversion failed due to input error, bytes 0xA3 0xEE 0x7D 0x5F

when parsing the BBC's Traditional Chinese news feed:

newsrss.bbc.co.uk/rss/chinese/trad/taiwan_hk/rss.xml.

This function works perfectly under PHP 4. Googling this error returns about 800 similar results.

Reproduce code:
---------------
if (!xml_parse($this->feedReader, $BBC_data)) {
    $this->in_error = true;
    $this->error_msg = sprintf(
        "XML Error: %s at line %d",
        xml_error_string(xml_get_error_code($this->feedReader)),
        xml_get_current_line_number($this->feedReader)
    );
}

Expected result:
----------------
xml_parse() should parse the feed without crashing, as it did under PHP 4.

Actual result:
--------------
The xml_parse() function crashes, throwing

Warning: xml_parse() [function.xml-parse]: input conversion failed due to input error, bytes 0xA3 0xEE 0x7D 0x5F

 [2007-11-20 16:49 UTC] jani@php.net
What was the full configure line used to configure PHP in this case? 
And I don't see any crash there, just a normal error for passing data in an encoding that isn't supported by the XML library in use.
 [2007-11-20 17:56 UTC] chregu@php.net
Please also read http://php.net/manual/en/function.xml-parser-create.php
carefully, especially:
***
The optional encoding specifies the character encoding for the 
input/output in PHP 4. Starting from PHP 5, the input encoding is 
automatically detected, so that the encoding parameter specifies only 
the output encoding. In PHP 4, the default output encoding is the same 
as the input charset. If empty string is passed, the parser attempts to 
identify which encoding the document is encoded in by looking at the 
heading 3 or 4 bytes. In PHP 5.0.0 and 5.0.1, the default output charset 
is ISO-8859-1, while in PHP 5.0.2 and upper is UTF-8. The supported 
encodings are ISO-8859-1, UTF-8 and US-ASCII.
***
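A minimal sketch of what the quoted passage describes for PHP 5 and later: the input encoding is taken from the document itself (here, the XML declaration), and the argument to xml_parser_create() only selects the encoding of the strings handed to the handlers. The helper name collect_text() and the sample document are made up for illustration and are not part of the report.

```php
<?php
// Hypothetical helper: parse $xml and return all character data,
// encoded in $outputEncoding (the argument to xml_parser_create()).
function collect_text(string $xml, string $outputEncoding): ?string
{
    $text = '';
    $parser = xml_parser_create($outputEncoding);
    xml_set_character_data_handler(
        $parser,
        function ($parser, string $chunk) use (&$text): void {
            $text .= $chunk;
        }
    );
    // Third argument true: this is the final (and only) chunk of data.
    $ok = xml_parse($parser, $xml, true);
    return $ok === 1 ? $text : null;
}

// Sample input declared as ISO-8859-1; 0xE9 is "é" in that encoding.
$latin1Doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><t>caf\xE9</t>";

// The parser detects ISO-8859-1 from the declaration and hands the
// handler UTF-8, where "é" is the two bytes 0xC3 0xA9.
$utf8Text = collect_text($latin1Doc, 'UTF-8');
```

This is why none of the supported values passed to xml_parser_create() can change how the input bytes are decoded: they only affect the output side.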
 [2007-11-20 18:49 UTC] pschmandra at hotmail dot com
All supported encodings listed below cause the xml_parse() function to error out parsing any Chinese Traditional RSS Feeds published by the BBC using PHP5.1.6 with a UTF-8 default_charset.

xml_parser_create()
xml_parser_create('')
xml_parser_create('UTF-8')
xml_parser_create('ISO-8859-1')
xml_parser_create('US-ASCII')
 [2007-11-21 06:00 UTC] chregu@php.net
Please show an example of your XML
 [2007-11-21 14:54 UTC] pschmandra at hotmail dot com
Sorry, every time I try to send the link to the BBC feeds or I put in XML I get "Please do not SPAM our bug system".
 [2007-11-21 18:33 UTC] pschmandra at hotmail dot com
List of Traditional Chinese RSS feeds provided by the BBC that make the xml_parse() function error out:

newsrss.bbc.co.uk/rss/chinese/trad/news/rss.xml
newsrss.bbc.co.uk/rss/chinese/trad/world/rss.xml
newsrss.bbc.co.uk/rss/chinese/trad/china_news/rss.xml
newsrss.bbc.co.uk/rss/chinese/trad/taiwan_hk/rss.xml
newsrss.bbc.co.uk/rss/chinese/trad/uk/rss.xml
newsrss.bbc.co.uk/rss/chinese/trad/learn_english/rss.xml
newsrss.bbc.co.uk/rss/chinese/trad/business/rss.xml
newsrss.bbc.co.uk/rss/chinese/trad/sci/tech/rss.xml
newsrss.bbc.co.uk/rss/chinese/trad/press/rss.xml
 [2007-11-22 12:35 UTC] rrichards@php.net
Sorry, but your problem does not imply a bug in PHP itself.  For a
list of more appropriate places to ask for help using PHP, please
visit http://www.php.net/support.php as this bug system is not the
appropriate forum for asking support questions.  Due to the volume
of reports we can not explain in detail here why your report is not
a bug.  The support channels will be able to provide an explanation
for you.

Thank you for your interest in PHP.

The feeds contain characters that are invalid in Big5. You can confirm this by trying to convert a feed directly with iconv, or by validating the feed with any online XML validator:
http://validator.w3.org/
http://feedvalidator.org/
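The iconv check suggested above can be sketched in PHP as follows. This is only an illustration, assuming the iconv extension is loaded and that the underlying iconv library rejects malformed Big5 input; the helper name is_valid_big5() is hypothetical, and the sample byte strings are not taken from the BBC feeds.

```php
<?php
// Hypothetical helper: returns true when $data converts cleanly from
// Big5 to UTF-8. iconv() returns false (and raises a notice, silenced
// here with @) when it meets an illegal or incomplete sequence.
function is_valid_big5(string $data): bool
{
    return @iconv('BIG5', 'UTF-8', $data) !== false;
}

// 0xA4 0xA4 is the Big5 encoding of the character "中".
$good = "\xA4\xA4";

// 0xA4 is a Big5 lead byte, but 0x20 (space) is not a valid trail byte,
// so this sequence is malformed.
$bad = "\xA4\x20";

$goodIsValid = is_valid_big5($good);
$badIsValid  = is_valid_big5($bad);
```

Running the same check on the downloaded feed body, before passing it to xml_parse(), shows whether the warning comes from the data rather than from the parser.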

Last updated: Thu Apr 25 21:01:36 2024 UTC