Bug #60416 SimpleXML parse fails because of an URL, no error messages
Submitted: 2011-11-30 12:31 UTC Modified: 2015-06-26 17:11 UTC
Votes:7
Avg. Score:4.9 ± 0.3
Reproduced:5 of 5 (100.0%)
Same Version:3 (60.0%)
Same OS:1 (20.0%)
From: stigh at funcom dot com Assigned: cmb (profile)
Status: Not a bug Package: SimpleXML related
PHP Version: 5.3.8 OS: CentOS 5
Private report: No CVE-ID: None
 [2011-11-30 12:31 UTC] stigh at funcom dot com
Description:
------------
 - SimpleXML - Revision: 314376
 - libxml2 - version: 2.6.26

The XML document was generated with Word 2007 by saving as a regular Word file (docx) and then extracting the "word/document.xml" file from the compressed docx file (opened with 7-zip, WinZIP, WinRAR, etc.).

I've been developing a docx parser in PHP and have encountered a strange bug. SimpleXML fails to parse the XML due to a URL in the XML document. I get no errors, even with libxml_use_internal_errors(true) and libxml_get_errors(). The same document parses perfectly fine with DOMDocument and also passes the W3C validator.

I played around with the XML, trying to add/remove elements causing the parse error, and it turned out to be a URL in the <w:document> node.

It is the URL in this line in the w:document tag: xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"

Removing the URL while keeping xmlns:w="" makes the parse succeed.
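
One way to tell a real parse failure from a result that only looks empty is to compare the return value strictly against false and then inspect the collected libxml errors. The following is a minimal sketch, not the reporter's script: the one-line XML string is a hypothetical stand-in for document.xml, and the variable names are made up for illustration.

```php
<?php
// Hypothetical, trimmed-down stand-in for word/document.xml (assumption).
$xml = '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
     . '<w:body/></w:document>';

// Collect libxml errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);
libxml_clear_errors();

$sxe    = simplexml_load_string($xml);
$errors = libxml_get_errors();

// A strict comparison separates "parse failed" from "object with no visible children".
$parsed     = ($sxe !== false);
$rootName   = $parsed ? $sxe->getName() : null;
$namespaces = $parsed ? $sxe->getNamespaces(true) : array();

var_dump($parsed);         // whether parsing succeeded
echo count($errors), "\n"; // number of libxml errors collected
echo $rootName, "\n";      // local name of the root element
print_r($namespaces);      // prefix => URI map used in the document
```

If $parsed is true and the error list is empty, the document was parsed, and the namespace map shows which URI has to be used to reach the prefixed elements.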

Test script:
---------------
Script for reproduction:
http://nerdvar.com/stigma/php_src/simplexml_bug_example.phps

The entire document.xml file:
http://nerdvar.com/stigma/php_src/document.xml

Expected result:
----------------
SimpleXMLElement Object ( [body] => SimpleXMLElement Object ( [0] => ) ) 

Actual result:
--------------
Error



 [2015-06-26 17:11 UTC] cmb@php.net
-Status: Open
+Status: Not a bug
-Assigned To:
+Assigned To: cmb
 [2015-06-26 17:11 UTC] cmb@php.net
Sorry, but your problem does not imply a bug in PHP itself.  For a
list of more appropriate places to ask for help using PHP, please
visit http://www.php.net/support.php as this bug system is not the
appropriate forum for asking support questions.  Due to the volume
of reports we can not explain in detail here why your report is not
a bug.  The support channels will be able to provide an explanation
for you.

Thank you for your interest in PHP.

See <http://php.net/manual/en/simplexmlelement.children.php#example-5935>.
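
The linked manual page documents SimpleXMLElement::children(), which accepts a namespace argument. As a hedged illustration of what that reference seems to point at: elements bound to the w: prefix are not reachable through plain property access, but they are reachable when the namespace URI is passed to children(). The XML below is a hypothetical, shortened stand-in for document.xml, not the reporter's file.

```php
<?php
// Hypothetical, shortened stand-in for word/document.xml (assumption).
$xml = <<<XML
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p><w:r><w:t>Hello</w:t></w:r></w:p>
  </w:body>
</w:document>
XML;

libxml_use_internal_errors(true);
$doc = simplexml_load_string($xml);
if ($doc === false) {
    // Genuine parse failure: print whatever libxml collected.
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message), "\n";
    }
    exit(1);
}

// Namespace URI taken from the xmlns:w declaration quoted in the report.
$ns = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

// Ask for children in the w: namespace explicitly at every level.
$body = $doc->children($ns)->body;
$text = (string) $body->children($ns)->p->children($ns)->r->children($ns)->t;

echo $text, "\n";
```

Passing the URI rather than the prefix keeps the lookup working even if another document binds the same namespace to a different prefix.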
 