Bug #79642 xml_parse() fails with XML_ERROR_NO_MEMORY with a mere 17M file
Submitted: 2020-05-27 13:05 UTC Modified: 2023-09-23 18:06 UTC
Votes:3
Avg. Score:3.7 ± 1.2
Reproduced:3 of 3 (100.0%)
Same Version:1 (33.3%)
Same OS:2 (66.7%)
From: php4fan at gmail dot com Assigned:
Status: Duplicate Package: *XML functions
PHP Version: 7.2.31 OS: debian
Private report: No CVE-ID: None

Further comment on this bug is unnecessary.

 

 [2020-05-27 13:05 UTC] php4fan at gmail dot com
Description:
------------
I'm trying to parse an XML file that is about 17 megabytes in size with xml_parse(), and I get the XML_ERROR_NO_MEMORY error.

The error persists even after increasing the memory limit to 8 GB.

I can understand a little bit of overhead, but there's no way that using several GB of memory to decode an XML document that is just 17 MB in its string form can be justified.


 [2020-05-27 14:07 UTC] sjon@php.net
Can you share the XML file and the exact code you are using? I've tested the (chunked) example code on https://www.php.net/xml_parse using a 112M XML file [1] and had no problems running it with a 128M memory_limit.

[1] http://www.ins.cwi.nl/projects/xmark/Assets/standard.gz
 [2020-05-27 14:10 UTC] sjon@php.net
-Status: Open +Status: Feedback
 [2020-05-27 14:50 UTC] cmb@php.net
The problem might be that XML_PARSE_HUGE is not set;
unfortunately, it is currently not possible to pass that option
via xml_parser_set_option() or otherwise.

Anyway, it may be better to parse in chunks.
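
For illustration, a minimal sketch of what parsing in chunks could look like when the XML is already held in a string. The helper name parse_in_chunks() and the 64 KiB default chunk size are assumptions made for this example, not part of the XML extension; only xml_parser_create(), xml_set_element_handler() and xml_parse() are real API calls.

```php
<?php
// Hypothetical helper (not part of ext/xml): feeds an in-memory XML string
// to the parser in fixed-size slices instead of one large xml_parse() call.
function parse_in_chunks($parser, string $xml, int $chunkSize = 65536): bool
{
    $length = strlen($xml);
    $offset = 0;
    do {
        $chunk = substr($xml, $offset, $chunkSize);
        $offset += $chunkSize;
        // The third argument of xml_parse() must be true for the last slice.
        $isFinal = $offset >= $length;
        if (xml_parse($parser, $chunk, $isFinal) !== 1) {
            return false; // details via xml_get_error_code($parser)
        }
    } while (!$isFinal);

    return true;
}

// Usage: count start tags in a generated document.
$count = 0;
$parser = xml_parser_create();
xml_set_element_handler(
    $parser,
    function ($p, $name, $attrs) use (&$count) { $count++; },
    function ($p, $name) {}
);
$xml = '<root>' . str_repeat('<item>x</item>', 1000) . '</root>';
$ok = parse_in_chunks($parser, $xml, 4096);
unset($parser);
```

Whether this avoids XML_ERROR_NO_MEMORY for a given file depends on the document and the underlying library; the reporter's own experience further down the thread is the only evidence here.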
 [2020-05-27 15:08 UTC] php4fan at gmail dot com
-Status: Feedback +Status: Open
 [2020-05-27 15:08 UTC] php4fan at gmail dot com
> The problem might be that XML_PARSE_HUGE is not set;

Yep, found out that's the problem.

Been known for like three years at least, right?

> unfortunately, it is currently not possible to pass that option
> via xml_parser_set_option() or otherwise.

Quite a disgrace.


Yep, I worked around it by parsing in chunks. (Why can't xml_parse() do it internally, by the way?)
 [2020-05-27 16:51 UTC] cmb@php.net
> why can't xml_parse() do it internally, by the way?

Well, it could, but you'd still have to provide the full XML as
a string, which might be wasteful.  In my opinion, xml_parse_stream()
would be nice to have, but on the other hand, that can easily be
written in userland as well.
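
A rough userland sketch of that idea, assuming an already opened stream resource. The function name xml_parse_from_stream() and the 8 KiB chunk size are made up for this example; xml_parse_stream() itself does not exist in PHP.

```php
<?php
// Hypothetical userland stand-in for the xml_parse_stream() idea:
// reads the stream piece by piece so the whole document never has
// to be held in a single PHP string.
function xml_parse_from_stream($parser, $stream, int $chunkSize = 8192): bool
{
    while (!feof($stream)) {
        $chunk = fread($stream, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break; // read error or nothing left to read
        }
        if (xml_parse($parser, $chunk, false) !== 1) {
            return false;
        }
    }
    // Signal the end of the document with an empty final chunk.
    return xml_parse($parser, '', true) === 1;
}

// Usage with an in-memory stream; a file opened with fopen() works the same way.
$names = [];
$parser = xml_parser_create();
xml_set_element_handler(
    $parser,
    function ($p, $name, $attrs) use (&$names) { $names[] = $name; },
    function ($p, $name) {}
);
$stream = fopen('php://memory', 'w+');
fwrite($stream, '<root><a/><b/></root>');
rewind($stream);
$ok = xml_parse_from_stream($parser, $stream, 8);
fclose($stream);
```

Element names arrive upper-cased in the handler because case folding is enabled by default.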
 [2022-01-21 14:11 UTC] requinix@php.net
-Block user comment: No +Block user comment: Yes
 [2022-01-22 15:30 UTC] php4fan at gmail dot com
What is "block user comments" supposed to accomplish? I see the spammers are spamming anyway.
 [2022-01-22 16:18 UTC] requinix@php.net
-Block user comment: No +Block user comment: Yes
 [2022-01-22 16:18 UTC] requinix@php.net
I turned it back off a bit later but the history doesn't show that. Going to leave it on this time.
 [2023-09-23 18:06 UTC] nielsdos@php.net
-Status: Open +Status: Duplicate
 [2023-09-23 18:06 UTC] nielsdos@php.net
Duplicate of #68325, and will be addressed by https://wiki.php.net/rfc/xml_option_parse_huge
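
For readers on a PHP version where that RFC has landed, enabling the option would presumably look like the sketch below. It assumes the constant is named XML_OPTION_PARSE_HUGE, as the RFC title suggests, and that it is available from PHP 8.4 on; the defined() guard keeps the snippet runnable on older versions, where it simply falls back to the default limits.

```php
<?php
$parser = xml_parser_create();

// Assumption: XML_OPTION_PARSE_HUGE exists only on newer PHP versions.
// The option has to be set before any data is passed to xml_parse().
$hugeEnabled = false;
if (defined('XML_OPTION_PARSE_HUGE')) {
    $hugeEnabled = xml_parser_set_option($parser, XML_OPTION_PARSE_HUGE, true);
}

$result = xml_parse($parser, '<root><item>x</item></root>', true);
```

On versions without the constant, parsing in chunks remains the workaround discussed above.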
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Sun Dec 22 05:01:30 2024 UTC