Doc Bug #54735 xml_parse_into_struct gives uncomplete array
Submitted: 2011-05-14 13:29 UTC Modified: 2019-08-05 06:34 UTC
Avg. Score:4.3 ± 0.9
Reproduced:2 of 2 (100.0%)
Same Version:2 (100.0%)
Same OS:0 (0.0%)
From: phpnet at phoen dot de Assigned: jbnahan (profile)
Status: Closed Package: *XML functions
PHP Version: 5.3.6 OS: Windows XPSP2
Private report: No CVE-ID: None


 [2011-05-14 13:29 UTC] phpnet at phoen dot de

configuration: Windows XP with XAMPP 1.7.3 (nothing special)

xml_parse_into_struct returns an incomplete array if the input contains an unexpected character (possibly a UTF-8 issue).
Everything after "€" is missing from the array. No error is reported and no garbled character shows up.
Setting parser options does not help.


and afterwards

can fix the problem surprisingly.


Test script:
$myxml="<XML><FELD1>blubb</FELD1><FELD2>dies hier sonderzeichen €</FELD2><FELD3>feld3</FELD3></XML>";
$p = xml_parser_create();
xml_parse_into_struct($p, $myxml, $vals, $index);


Expected result:
Something like this:

Array (
  [0] => Array ( [tag] => XML [type] => open [level] => 1 [value] => )
  [1] => Array ( [tag] => FELD1 [type] => complete [level] => 2 [value] => blubb )
  [2] => Array ( [tag] => XML [value] => [type] => cdata [level] => 1 )
  [3] => Array ( [tag] => FELD2 [type] => complete [level] => 2 [value] => dies hier sonderzeichen € )
  [4] => Array ( [tag] => XML [value] => [type] => cdata [level] => 1 )
  [5] => Array ( [tag] => FELD3 [type] => complete [level] => 2 [value] => feld3 )
)

Actual result:
Array (
  [0] => Array ( [tag] => XML [type] => open [level] => 1 [value] => )
  [1] => Array ( [tag] => FELD1 [type] => complete [level] => 2 [value] => blubb )
  [2] => Array ( [tag] => XML [value] => [type] => cdata [level] => 1 )
  [3] => Array ( [tag] => FELD2 [type] => open [level] => 2 [value] => dies hier sonderzeichen )
)


 [2012-04-26 20:32 UTC] peter dot e dot lind at gmail dot com
Doesn't seem to be a bug on Linux (PHP 5.3.3-7+squeeze8 with Suhosin-Patch 
(cli)) - when I save the test-script in utf-8 the output is as expected. 
However, if I save it as ISO-8859-15, the xml parsing stops, as one would 
expect, when it hits the illegal character - an ISO-8859-15 € would be illegal 
in utf-8 and xml parsers must stop when they reach illegal content.

The docs could be improved though, as
parser-create.php specifies that "The supported encodings are ISO-8859-1, UTF-8 
and US-ASCII." but in the context it appears to only be the case for output. 
However, the behaviour suggests those three character sets also restrict input. 
Some comments in the php code also suggest that this is the case.
 [2015-05-14 15:24 UTC]
-Type: Bug +Type: Documentation Problem -Assigned To: +Assigned To: cmb
 [2015-05-14 15:24 UTC]
Indeed, Peter, this is not a bug in the XML parser, but rather
invalid characters have been passed in. That could have been
detected by calling xml_error_string().

I agree that the docs need improvement, even though I'm not sure
yet what is actually supported by the XML parser.
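
A minimal sketch of how the failure could be detected, assuming the default libxml2 back-end. The helper name parse_struct_checked() is hypothetical and not part of the report. It checks the return value of xml_parse_into_struct() and, on failure, looks up the message with xml_get_error_code() and xml_error_string(). The byte \xA4 stands for "€" as saved in ISO-8859-15, so the example does not depend on the encoding of the script file.

```php
<?php
// Hypothetical helper (illustration only): parse a string and report the
// parser error instead of silently returning a truncated array.
function parse_struct_checked(string $xml): array
{
    $parser = xml_parser_create();
    $values = [];
    $index  = [];

    // xml_parse_into_struct() returns 1 on success and 0 on failure.
    $ok = xml_parse_into_struct($parser, $xml, $values, $index);

    $result = ['ok' => $ok === 1, 'error' => null, 'line' => null, 'values' => $values];
    if ($ok !== 1) {
        // xml_error_string() needs the code from xml_get_error_code().
        $result['error'] = xml_error_string(xml_get_error_code($parser));
        $result['line']  = xml_get_current_line_number($parser);
    }
    // On PHP versions before 8.0, xml_parser_free($parser) would be called here.
    return $result;
}

// "€" as a single ISO-8859-15 byte: not valid UTF-8, so parsing should fail.
$latin9 = "<XML><FELD2>sonderzeichen \xA4</FELD2><FELD3>feld3</FELD3></XML>";
// "€" as a UTF-8 byte sequence: parsing should succeed.
$utf8   = "<XML><FELD2>sonderzeichen \xE2\x82\xAC</FELD2><FELD3>feld3</FELD3></XML>";

foreach (['ISO-8859-15 bytes' => $latin9, 'UTF-8 bytes' => $utf8] as $label => $xml) {
    $r = parse_struct_checked($xml);
    echo $label, ': ', $r['ok'] ? 'parsed' : 'failed (' . $r['error'] . ')', "\n";
}
```

The exact error text depends on the back-end and its version, so code should only rely on the return value being 0 or 1.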
 [2015-05-14 17:10 UTC]
-Status: Assigned +Status: Analyzed -Assigned To: cmb +Assigned To:
 [2015-05-14 17:10 UTC]
Setting the *input* encoding to 'ISO-8859-1' in
xml_parser_create() is already ignored as of PHP 5.0.0[3], see
<>. The input encoding is solely specified by
the encoding attribute of the document's XML declaration (which
defaults to UTF-8 if omitted), see <>. The
$encoding parameter of xml_parser_create() specifies only the
output encoding.

The behavior when expat is used as back-end might be different.
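
A minimal sketch of the behavior described above, assuming the libxml2 back-end: the input encoding is declared in the XML declaration, and the output encoding is set separately on the parser. The sample uses ISO-8859-1 and the byte \xE4 ("ä") rather than "€", because "€" does not exist in ISO-8859-1. Whether other declared encodings such as ISO-8859-15 are accepted is not confirmed here; converting such input to UTF-8 before parsing is one alternative.

```php
<?php
// The input bytes are ISO-8859-1 (0xE4 is "ä"); the XML declaration says so.
$xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
     . "<XML><FELD1>blubb</FELD1><FELD2>sonderzeichen \xE4</FELD2><FELD3>feld3</FELD3></XML>";

$parser = xml_parser_create();
// Set the output (target) encoding explicitly instead of relying on the default.
xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');

$values = [];
$index  = [];
$ok = xml_parse_into_struct($parser, $xml, $values, $index);

if ($ok !== 1) {
    echo 'Parse error: ', xml_error_string(xml_get_error_code($parser)), "\n";
} else {
    // Tag names are upper-cased by default (case folding), so FELD2 is the key.
    $feld2 = $values[$index['FELD2'][0]]['value'];
    // bin2hex() shows the bytes of the value in the target encoding.
    echo bin2hex($feld2), "\n";
    echo $values[$index['FELD3'][0]]['value'], "\n";
}
```

With the declaration in place, the elements after the non-ASCII character are no longer cut off, and the value of FELD2 is delivered in the target encoding.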
 [2019-08-05 06:34 UTC]
-Status: Analyzed +Status: Closed -Assigned To: +Assigned To: jbnahan
 [2019-08-05 06:34 UTC]
The documentation has been updated
Last updated: Thu Sep 24 00:01:24 2020 UTC