Bug #79642 xml_parse() fails with XML_ERROR_NO_MEMORY with a mere 17M file
Submitted: 2020-05-27 13:05 UTC Modified: 2023-09-23 18:06 UTC
Votes:3
Avg. Score:3.7 ± 1.2
Reproduced:3 of 3 (100.0%)
Same Version:1 (33.3%)
Same OS:2 (66.7%)
From: php4fan at gmail dot com Assigned:
Status: Duplicate Package: *XML functions
PHP Version: 7.2.31 OS: debian
Private report: No CVE-ID: None
 [2020-05-27 13:05 UTC] php4fan at gmail dot com
Description:
------------
I'm trying to parse an XML file that is about 17 MB in size with xml_parse(), and I get the XML_ERROR_NO_MEMORY error.

The error persists even after increasing the memory limit to 8 GB.

I can understand a little overhead, but there is no way to justify using several GB of memory to decode an XML document that is just 17 MB in its string form.


 [2020-05-27 14:07 UTC] sjon@php.net
Can you share the XML file and the exact code you are using? I've tested the (chunked) example code on https://www.php.net/xml_parse using a 112M XML file [1] and had no problems running it with a 128M memory_limit.

[1] http://www.ins.cwi.nl/projects/xmark/Assets/standard.gz
 [2020-05-27 14:10 UTC] sjon@php.net
-Status: Open +Status: Feedback
 [2020-05-27 14:50 UTC] cmb@php.net
The problem might be that XML_PARSE_HUGE is not set;
unfortunately, it is currently not possible to pass that option
via xml_parser_set_option() or otherwise.

Anyway, it may be better to parse in chunks.
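
For readers landing here, a minimal sketch of what "parse in chunks" can look like. The function name parse_file_in_chunks(), the 8192-byte default and the element-counting handler are illustrative assumptions, not code from this report; only the xml_* and file functions are real PHP APIs.

```php
<?php
// Sketch: feed a file to xml_parse() piece by piece instead of as one string.
// parse_file_in_chunks() is a hypothetical helper written for this example.
function parse_file_in_chunks(string $path, int $chunkSize = 8192): int
{
    $elements = 0;
    $parser = xml_parser_create();
    // Count start tags so the example has something observable to return.
    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$elements) { $elements++; },
        function ($p, $name) {}
    );

    $fp = fopen($path, 'rb');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (!feof($fp)) {
            $chunk = fread($fp, $chunkSize);
            if ($chunk === false) {
                throw new RuntimeException("Read error on $path");
            }
            // The third argument tells the parser whether this is the last piece.
            if (xml_parse($parser, $chunk, feof($fp)) !== 1) {
                throw new RuntimeException(sprintf(
                    'XML error: %s at line %d',
                    xml_error_string(xml_get_error_code($parser)),
                    xml_get_current_line_number($parser)
                ));
            }
        }
    } finally {
        fclose($fp);
        if (PHP_VERSION_ID < 80000) {
            xml_parser_free($parser); // only needed while parsers are resources
        }
    }
    return $elements;
}
```

Memory use then depends on the chunk size and on what the handlers keep, rather than on the size of the whole document.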
 [2020-05-27 15:08 UTC] php4fan at gmail dot com
-Status: Feedback +Status: Open
 [2020-05-27 15:08 UTC] php4fan at gmail dot com
> The problem might be that XML_PARSE_HUGE is not set;

Yep, found out that's the problem.

Been known for like three years at least, right?

> unfortunately, it is currently not possible to pass that option
> via xml_parser_set_option() or otherwise.

Quite a disgrace.


Yep, I worked around it by parsing in chunks. (Why can't xml_parse() do it internally, by the way?)
 [2020-05-27 16:51 UTC] cmb@php.net
> why can't xml_parse() do it internally, by the way?

Well, it could, but you'd still have to provide the full XML as a
string, which might be wasteful.  In my opinion, xml_parse_stream()
would be nice to have, but on the other hand, that can easily be
written in userland as well.
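
A rough idea of such a userland helper, as a sketch only. xml_parse_stream() does not exist in ext/xml, so the name xml_parse_stream_userland(), its signature and the 65536-byte default are all assumptions made for illustration.

```php
<?php
// Hypothetical userland stand-in for the xml_parse_stream() idea above.
// Takes an already configured parser and any readable stream; returns
// true on success and false on a read or parse error.
function xml_parse_stream_userland($parser, $stream, int $chunkSize = 65536): bool
{
    do {
        $chunk = fread($stream, $chunkSize);
        if ($chunk === false) {
            return false;
        }
        // Mark the chunk as final once the stream reports end-of-file.
        $isFinal = feof($stream);
        if (xml_parse($parser, $chunk, $isFinal) !== 1) {
            return false;
        }
    } while (!$isFinal);

    return true;
}
```

On failure the caller can still query xml_get_error_code() on the same parser, since the helper does not free it.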
 [2022-01-21 14:11 UTC] requinix@php.net
-Block user comment: No +Block user comment: Yes
 [2022-01-22 15:30 UTC] php4fan at gmail dot com
What is "block user comments" supposed to accomplish? I see the spammers are spamming anyway.
 [2022-01-22 16:18 UTC] requinix@php.net
-Block user comment: No +Block user comment: Yes
 [2022-01-22 16:18 UTC] requinix@php.net
I turned it back off a bit later but the history doesn't show that. Going to leave it on this time.
 [2023-09-23 18:06 UTC] nielsdos@php.net
-Status: Open +Status: Duplicate
 [2023-09-23 18:06 UTC] nielsdos@php.net
Duplicate of #68325, and will be addressed by https://wiki.php.net/rfc/xml_option_parse_huge
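
A sketch of how the option from that RFC might be used, assuming it is exposed as the constant XML_OPTION_PARSE_HUGE and accepted by xml_parser_set_option() (as I read the RFC, from PHP 8.4 on; neither detail is stated in this thread). The helper name create_huge_capable_parser() is made up for this example, and the defined() guard keeps it harmless on versions without the constant.

```php
<?php
// Hypothetical helper: create a parser and opt in to huge documents when
// the running PHP version offers the option. Assumes the constant name
// XML_OPTION_PARSE_HUGE from the linked RFC.
function create_huge_capable_parser()
{
    $parser = xml_parser_create();
    if (defined('XML_OPTION_PARSE_HUGE')) {
        // Set the option before the first xml_parse() call on this parser.
        xml_parser_set_option($parser, XML_OPTION_PARSE_HUGE, true);
    }
    return $parser;
}
```

On older versions this falls back to a default parser, so chunked parsing remains the workaround there.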
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Wed Apr 24 08:01:29 2024 UTC