Bug #49660 libxml 2.7.3+ limits text nodes to 10MB
Submitted: 2009-09-24 15:37 UTC
Modified: 2011-12-02 12:30 UTC
From: sta at netimage dot dk
Assigned: gooh
Status: Closed
Package: SOAP related
PHP Version: *
OS: FreeBSD 7.1
Private report: No
CVE-ID:
 [2009-09-24 15:37 UTC] sta at netimage dot dk
Since version 2.7.3 libxml limits the maximum size of a single text node to 10MB.
The limit can be removed with a new option, XML_PARSE_HUGE.
PHP has no way to specify this option to libxml.

I found the bug when making a SOAP request where the reply contained a 20MB string.
SoapClient->__call() threw an exception: 'looks like we got no XML document'

Using libxml_use_internal_errors(true) and libxml_get_errors() I could narrow it down to a LibXMLError, code 5, 'Extra content at the end of the document', but the reported line and column were in the middle of a large text node.
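
A minimal sketch of the error-collection step described above, for anyone who wants to reproduce the diagnosis. The helper name collect_xml_errors() is hypothetical and not part of PHP; only libxml_use_internal_errors(), libxml_get_errors(), libxml_clear_errors() and simplexml_load_string() are real functions. The sample input is deliberately malformed and is not the SOAP response from this report.

```php
<?php
// Hypothetical helper: parse a string and return libxml's errors as text
// lines instead of letting them surface as PHP warnings.
function collect_xml_errors($xml)
{
    // Switch to internal error handling and remember the previous setting.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    simplexml_load_string($xml);

    $errors = array();
    foreach (libxml_get_errors() as $e) {
        // Each entry is a LibXMLError with code, line, column and message.
        $errors[] = sprintf(
            'code %d at line %d, column %d: %s',
            $e->code,
            $e->line,
            $e->column,
            trim($e->message)
        );
    }

    // Leave libxml's error state as we found it.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $errors;
}

// Deliberately malformed input: text after the root element.
foreach (collect_xml_errors('<test>A</test>trailing') as $line) {
    echo $line, "\n";
}
```

The reported line and column only say where the parser gave up, which is why they can point into the middle of a text node rather than at the real cause.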

Using SoapClient->__getLastResponse() I saved the response to a file.
The libxml2 program xmllint then revealed the cause:
> xmllint --noout soap_response.txt 
soap_response.txt:111834: error: xmlSAX2Characters: huge text node: out of memory

We need a way to specify the XML_PARSE_HUGE option to libxml - perhaps something like a new function: libxml_parse_huge(true).

Reproduce code:
$xml = "<?xml version='1.0' encoding='utf-8' standalone='yes' ?><test>" . str_repeat('A', 12000000) . "</test>";
file_put_contents('file.xml', $xml);
$sxe = simplexml_load_file('file.xml');
if ($sxe instanceof SimpleXMLElement) {
	echo "OK\n";
} else {
	echo "FAIL\n";
}

Expected result:

Actual result:
PHP Warning:  simplexml_load_file(): file.xml:1: error: xmlSAX2Characters: huge text node: out of memory in /usr/dana/data/developers/holst/mobilmap/cron/xml.php on line 5
PHP Warning:  simplexml_load_file(): AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA in /usr/dana/data/developers/holst/mobilmap/cron/xml.php on line 5
PHP Warning:  simplexml_load_file():                                                                                ^ in /usr/dana/data/developers/holst/mobilmap/cron/xml.php on line 5


 [2009-09-26 12:12 UTC]
I guess we could expose the constant value from ext/libxml if available like:
Index: libxml.c
--- libxml.c	(revision 288659)
+++ libxml.c	(working copy)
@@ -622,6 +622,9 @@
+#if LIBXML_VERSION >= 20703
 	/* Error levels */

Does this work for you when passing it to SimpleXML's $options parameter?

(Patch made against PHP_5_3, but is just a 3 line c/p to other branches)
 [2009-09-28 11:31 UTC] sta at netimage dot dk
Hi Kalle.

Thanks for replying so soon.

Your patch does fix the issue when loading directly with 
simplexml_load_file('file.xml', 'SimpleXMLElement', LIBXML_PARSEHUGE);
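
A minimal sketch of the same call with a guard, assuming the constant may be missing on builds linked against libxml older than 2.7.3 or without the patch. The wrapper name load_large_xml() is hypothetical.

```php
<?php
// Hypothetical wrapper: load an XML file and lift libxml's text node limit
// when the running PHP build exposes LIBXML_PARSEHUGE.
function load_large_xml($path)
{
    // Fall back to no extra options when the constant is not defined.
    $options = defined('LIBXML_PARSEHUGE') ? LIBXML_PARSEHUGE : 0;

    // Returns a SimpleXMLElement on success, false on failure.
    return simplexml_load_file($path, 'SimpleXMLElement', $options);
}

// Usage with the file written by the reproduce code:
// $sxe = load_large_xml('file.xml');
```

Without the constant the call behaves exactly like the original reproduce code, so the 10MB limit still applies on those builds.');
$sxe = load_large_xml($p);
assert($sxe instanceof SimpleXMLElement);
assert(strlen((string) $sxe) === 12000000);
unlink($p);
</test>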

However, the error was not originally encountered using SimpleXML, but using SoapClient, and there does not seem to be a way to pass libxml options to SoapClient.

So as far as I can tell, the problem remains when using SoapClient.

 [2009-12-01 02:05 UTC]
Automatic comment from SVN on behalf of felipe
Log: - Fixed bug #49660 (libxml 2.7.3+ limits text nodes to 10MB). (Felipe)
- Added LIBXML_PARSEHUGE constant to overrides the maximum text size of a
  single text node when using libxml2.7.3+. (Kalle)
 [2009-12-01 02:06 UTC]
This bug has been fixed in SVN.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
Thank you for the report, and for helping us make PHP better.

 [2009-12-01 11:15 UTC]
"Added LIBXML_PARSEHUGE constant to overrides the maximum text size of a single text node when using libxml2.7.3+."
 [2011-12-02 12:30 UTC]
-Status: Open
+Status: Closed
-Assigned To:
+Assigned To: gooh
