[2008-03-08 05:09 UTC] daniel dot oconnor at gmail dot com
Description:
------------
The W3C clarified a few xml:base issues when publishing the GRDDL spec. You can see the tests at http://www.w3.org/TR/grddl-tests/#ambiguous-infoset.

Basically:
* DOMDocument::loadXML does not detect xml:base attributes
* simplexml_load_file does not detect xml:base attributes (or they are lost during the importNode phase)
* simplexml_load_string does not detect xml:base attributes (or they are lost during the importNode phase)
* DOMDocument does not deal with nested xml:base
* DOMDocument does not deal with redirected xml:base locations

To clarify the redirected xml:base case: if I request http://foo.com/example.xml, that redirects me to http://bar.com/example.xml, and bar.com/example.xml declares xml:base = http://foo.com/example.xml, then http://bar.com/example.xml's baseURI should be http://bar.com/example.xml

Reproduce code:
---------------
<?php
$url = 'http://www.w3.org/2001/sw/grddl-wg/td/base/xmlWithBase.xml';
$xml = file_get_contents($url);

// Load a URL
$doc = new DOMDocument();
$doc->load($url);
var_dump($doc->baseURI); // Expected http://www.w3.org/2001/sw/grddl-wg/td/base/xmlWithBase.xml

// Load an XML document with xml:base
$doc = new DOMDocument();
$doc->loadXML($xml);
var_dump($doc->baseURI); // Expected http://www.w3.org/2001/sw/grddl-wg/td/base/xmlWithBase.xml

// Does it work with importNode?
$sxe = simplexml_load_file($url);
$dom_sxe = dom_import_simplexml($sxe);
$dom = new DOMDocument('1.0');
$dom_sxe = $dom->importNode($dom_sxe, true);
$dom_sxe = $dom->appendChild($dom_sxe);
var_dump($dom->baseURI); // Expected (maybe) http://www.w3.org/2001/sw/grddl-wg/td/base/xmlWithBase.xml

// Alternative?
$sxe = simplexml_load_string($xml);
$dom_sxe = dom_import_simplexml($sxe);
$dom = new DOMDocument('1.0');
$dom_sxe = $dom->importNode($dom_sxe, true);
$dom_sxe = $dom->appendChild($dom_sxe);
var_dump($dom->baseURI); // Expected (maybe) http://www.w3.org/2001/sw/grddl-wg/td/base/xmlWithBase.xml

// What about documents with an invalid xml:base (not on the top level element)?
$doc = new DOMDocument();
$doc->load('http://www.w3.org/2001/sw/grddl-wg/td/inline-rdf6.xml');
var_dump($doc->baseURI); // Expected http://wwww.example.org/

// What about documents with a *redirected xml:base*?
// Note: this test case is a little broken because of a W3C server change - it *should* redirect to
// 'http://www.w3.org/2001/sw/grddl-wg/td/base/xmlWithBase.xml' and thus have a funky new xml:base value
$doc = new DOMDocument();
$doc->load('http://www.w3.org/2001/sw/grddl-wg/td/xmlWithBase.xml');
var_dump($doc->baseURI); // Expected http://www.w3.org/2001/sw/grddl-wg/td/base/xmlWithBase.xml

Expected result:
----------------
See reproduce code

Actual result:
--------------
See reproduce code
A test case which illustrates that baseURI parsing now works correctly (at least in PHP 5.3.15):

<?php
$doc = new DOMDocument();
$doc->load('http://www.w3.org/2001/sw/grddl-wg/td/inline-rdf6.xml');
var_dump($doc->baseURI); // "http://www.w3.org/2001/sw/grddl-wg/td/inline-rdf6.xml"
var_dump($doc->documentElement->baseURI); // "http://wwww.example.org/"

As http://www.w3.org/TR/xmlbase/ describes, the base URI of a document entity is the URI used to retrieve the document entity. The base URI of an element (including the document element) is determined by a series of rules, starting with the xml:base attribute on the element.
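
The element-level rule can also be checked without network access. The following is only a minimal sketch: the inline document, its element names, and the example.org URIs are made up for illustration and are not part of the W3C test suite. It assumes a PHP build whose libxml2 resolves a relative xml:base against the nearest ancestor's base, which is what XML Base describes.

```php
<?php
// Hypothetical inline document; element names and URIs are placeholders.
$xml = <<<'XML'
<?xml version="1.0"?>
<root xml:base="http://example.org/base/">
  <child xml:base="nested/">
    <leaf/>
  </child>
  <sibling/>
</root>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$root    = $doc->documentElement;
$child   = $root->getElementsByTagName('child')->item(0);
$leaf    = $root->getElementsByTagName('leaf')->item(0);
$sibling = $root->getElementsByTagName('sibling')->item(0);

// An absolute xml:base on the element is used as-is.
echo $root->baseURI, PHP_EOL;    // http://example.org/base/

// A relative xml:base is resolved against the parent's base URI.
echo $child->baseURI, PHP_EOL;   // http://example.org/base/nested/

// Elements without xml:base inherit from the nearest ancestor that has one.
echo $leaf->baseURI, PHP_EOL;    // http://example.org/base/nested/
echo $sibling->baseURI, PHP_EOL; // http://example.org/base/
```

The document's own baseURI is deliberately left out here: for a document parsed from a string there is no retrieval URI, so its value depends on the environment rather than on any xml:base attribute.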