[2017-07-25 21:09 UTC] paul at sparrowhawkcomputing dot com
Description:
------------
Given the following XML document in test.xml:
<?xml version="1.0"?>
<root xml:space='foo'/>
The script in the "Test Script" field below reports that the instance is loaded successfully while simultaneously reporting well-formedness errors.
How can this instance be successfully loaded while there are well-formedness errors reported?
Test script:
---------------
libxml_use_internal_errors( true );
$dom = new DOMDocument();
libxml_clear_errors();
$success = $dom->load( __DIR__ . '/test.xml' );
$xml = $dom->saveXML();
$errs = libxml_get_errors();
var_dump( $success );
var_dump( $errs );
var_dump( $xml );
Expected result:
----------------
Either:
$success == false && ! empty( $errs ) && $xml === '<?xml version="1.0"?>'
or
$success == true && empty( $errs ) && $xml === '<?xml version="1.0"?>
<root/>
'
That is, if libxml_get_errors() is going to return errors then DOMDocument::load() should return false. If DOMDocument::load() is going to succeed, then @xml:space should be ignored and libxml_clear_errors() should be called internally before DOMDocument::load() returns.
Either alternative conforms to the XML spec, which says [1]:
This specification does not give meaning to any value of xml:space other
than "default" and "preserve". It is an error for other values to be
specified; the XML processor may report the error or may recover by ignoring
the attribute specification or by reporting the (erroneous) value to the
application. Applications may ignore or reject erroneous values.
The status quo does not conform to the XML spec because it both reports the error and fails to ignore the @xml:space attribute.
I VERY MUCH prefer the first alternative, as it is consistent with XMLReader which correctly reports the well-formedness error and refuses to parse test.xml.
[1] https://www.w3.org/TR/REC-xml/#sec-white-space
Actual result:
--------------
bool(true)
array(1) {
  [0]=>
  object(LibXMLError)#260 (6) {
    ["level"]=>
    int(1)
    ["code"]=>
    int(102)
    ["column"]=>
    int(16)
    ["message"]=>
    string(69) "Invalid value "foo" for xml:space : "default" or "preserve" expected
"
    ["file"]=>
    string(87) "file://test.xml"
    ["line"]=>
    int(2)
  }
}
string(80) "<?xml version="1.0"?>
<root xml:space="foo"/>
"
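Until the behavior is settled, a caller who wants the first alternative can enforce it in userland. The following is only a minimal sketch: loadXmlStrict() is a hypothetical helper name, not part of PHP or libxml, and it assumes that anything libxml reports, including warnings such as the one above, should count as a failed load.

```php
<?php
// Hypothetical helper (not part of PHP): load an XML string and treat any
// libxml-reported problem, warnings included, as a failure.
function loadXmlStrict(string $xml): ?DOMDocument
{
    // Collect errors internally and remember the caller's previous setting.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $dom = new DOMDocument();
    $loaded = $dom->loadXML($xml);
    $errors = libxml_get_errors();

    // Leave the global libxml state as we found it.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Assumption: success requires both a true return value and no errors.
    if (!$loaded || count($errors) > 0) {
        return null;
    }
    return $dom;
}

// Usage: the instance from this report versus a clean one.
var_dump(loadXmlStrict("<?xml version=\"1.0\"?>\n<root xml:space='foo'/>") === null);
var_dump(loadXmlStrict('<root/>') instanceof DOMDocument);
```

Rejecting on any reported error is deliberately stricter than what DOMDocument::load() does today; a caller who only cares about errors above a given level could filter on the "level" field of each LibXMLError instead.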
Wow! I had no idea the PHP sources were up on GitHub. Thanks. That's very helpful in figuring out whether something is a PHP bug or a libxml bug. Thanks for the pointer.
I just found another case that muddies the waters with regard to what libxml considers a well-formedness error.
libxml_use_internal_errors( true );
libxml_clear_errors();
$dom = new DOMDocument();
$dom->loadXML( '<root xmlns:xml="urn:foo"/>' );
$errs = libxml_get_errors();
var_dump( $errs );
produces:
array(1) {
  [0]=>
  object(LibXMLError)#260 (6) {
    ["level"]=>
    int(2)
    ["code"]=>
    int(200)
    ["column"]=>
    int(16)
    ["message"]=>
    string(41) "xml namespace prefix mapped to wrong URI
"
    ["file"]=>
    string(0) ""
    ["line"]=>
    int(1)
  }
}
So libxml doesn't consider that a "fatal" error even though that instance is unambiguously not well-formed [1].
Please leave this ticket open while I dig more into exactly what libxml does and does not consider to be a well-formedness error.
[1] https://www.w3.org/TR/REC-xml-names/#xmlReserved