Bug #31139 "XML Parser Functions" seem to drop & when parsing
Submitted: 2004-12-16 23:06 UTC
Modified: 2004-12-17 13:22 UTC
From: exaton at free dot fr
Assigned:
Status: Closed
Package: XML related
PHP Version: 5.0.3
OS: Windows XP
Private report: No
CVE-ID: None

 [2004-12-16 23:06 UTC] exaton at free dot fr
Description:
------------
I parse my XML with the "XML parser functions" (xml_parser_create() and all that). Up to PHP 5.0.2 I had no problem, but the following has appeared in 5.0.3:

I used to put &amp;eacute; in an XML file to get the &eacute; HTML entity in my page output, and therefore an e with an acute accent in the text. This is perfectly consistent with the XML spec, and it worked fine.

Now the output on the page has simply dropped the "&" that should come from &amp;, so &amp;eacute; is parsed into eacute; -- which of course is not what I want.

&amp;eacute; is of course not the only affected entity: I'm not seeing a single & in any content obtained from parsing XML.

Note: explicitly passing a charset string to xml_parser_create() did not solve this.

Reproduce code:
---------------
$this->parser = xml_parser_create();

xml_parser_set_option($this->parser, XML_OPTION_SKIP_WHITE, 1);
xml_parser_set_option($this->parser, XML_OPTION_CASE_FOLDING, 0);

// $this->contents holds the plaintext contents of an XML file, likely to contain things like &amp;eacute;

xml_parse_into_struct($this->parser, $this->contents, $this->values, $this->index);

xml_parser_free($this->parser);

Expected result:
----------------
When I eventually retrieve the output through my custom architecture, I expect things like &amp;eacute; to have been parsed into &eacute; -- they always were, up to and including PHP 5.0.2.

Actual result:
--------------
In PHP 5.0.3, &amp;eacute; for example is now parsed into eacute;
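
For reference, a self-contained version of the reproduce code might look like the sketch below. The sample document, the element names root and item, and the helper parse_item_text() are hypothetical and not part of the original report; only the xml_* calls and the two option settings are taken from it. On an unaffected build the character data should come back with the ampersand intact, whereas the behaviour described for 5.0.3 would drop it.

```php
<?php
// Hypothetical helper (not from the original report): parses $xml with the
// same options as the reproduce code and returns the text of the first <item>.
function parse_item_text(string $xml): string
{
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    $values = array();
    $index = array();
    xml_parse_into_struct($parser, $xml, $values, $index);

    // The report calls xml_parser_free() here; that call is a no-op since
    // PHP 8.0, so this sketch simply drops the reference instead.
    unset($parser);

    // $index maps each tag name to positions in $values; take the first <item>.
    return $values[$index['item'][0]]['value'];
}

// Assumed sample input: &amp; must be decoded to a literal "&" by the parser,
// leaving the text "caf&eacute;" for the page output.
$contents = '<?xml version="1.0"?><root><item>caf&amp;eacute;</item></root>';
$text = parse_item_text($contents);

echo $text, "\n";
```

If the echoed text reads cafeacute; rather than caf&eacute;, the build shows the problem described in this report.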

History

 [2004-12-16 23:09 UTC] exaton at free dot fr
(gave a more explicit title)
 [2004-12-17 07:53 UTC] chregu@php.net
Not enough information was provided for us to be able
to handle this bug. Please re-read the instructions at
http://bugs.php.net/how-to-report.php

If you can provide more information, feel free to add it
to this bug and change the status back to "Open".

Thank you for your interest in PHP.


Please give a self-running example; yours isn't.
 [2004-12-17 13:22 UTC] rrichards@php.net
This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.


 