Bug #26926 DOMXML LoadHTML throws exception on URL with "&"
Submitted: 2004-01-15 15:57 UTC
Modified: 2004-01-16 15:07 UTC
Votes: 1
Avg. Score: 4.0 ± 0.0
Reproduced: 0 of 0 (0.0%)
From: vania at pandorasdream dot com
Assigned:
Status: Not a bug
Package: DOM XML related
PHP Version: 5.0.0b3 (beta3)
OS: Windows XP
Private report: No
CVE-ID: None
 [2004-01-15 15:57 UTC] vania at pandorasdream dot com
Description:
------------
The example for LoadHTML, run against the www.php.net site, fails because the page contains a URL with an unescaped ampersand (line 118: <a href="http://cvs.php.net/diff.php/php-src/NEWS?login=2&r1=1.1247.2.452&r2=1.1247.2.522">NEWS</a> file.).  The error reports a missing ";", apparently because the parser treats the "&" as the start of an entity.

When I remove the try/catch, no errors occur, but nothing gets printed.  Using a different URL to a file without the & works correctly.
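
For illustration, a minimal sketch of why the parser expects a ";": inside an HTML attribute a bare "&" reads as the start of an entity reference, so a URL written into markup is normally escaped to "&amp;" first. The variable names below are hypothetical, and the sketch only shows the escaping, not how the php.net page itself is generated.

```php
<?php
// The URL from line 118 of the page, with raw ampersands in the query string.
$url = 'http://cvs.php.net/diff.php/php-src/NEWS?login=2&r1=1.1247.2.452&r2=1.1247.2.522';

// htmlspecialchars() turns each "&" into "&amp;", which an HTML parser
// reads as a literal ampersand instead of an unterminated entity.
$escaped = htmlspecialchars($url);

print '<a href="' . $escaped . '">NEWS</a>';
```

A parser that decodes the escaped attribute value recovers the original URL unchanged.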

Reproduce code:
---------------
<?php
try
{
    $dom = new domdocument;
    @$dom->loadHTMLFile("http://www.php.net/");
    $xp = new domxpath($dom);
    $result = $xp->query("/html/head/title");
    print $result[0]->firstChild->data;
}
catch (exception $exc)
{
    print $exc->getMessage()." at line ".$exc->getLine();
}

?>

Expected result:
----------------
PHP: Hypertext Preprocessor

Actual result:
--------------
<?php
try
{
    $dom = new domdocument;
    @$dom->loadHTMLFile("http://www.php.net/");
    $xp = new domxpath($dom);
    $result = $xp->query("/html/head/title");
    print $result[0]->firstChild->data;
}
catch (exception $exc)
{
    print $exc->getMessage()." at line ".$exc->getLine();
}

?>

 [2004-01-15 16:02 UTC] vania at pandorasdream dot com
Sorry...  Wrong copy/paste for the actual result...

ERROR [2]domdocument::loadHTMLFile() [<a href='function.loadHTMLFile'>function.loadHTMLFile</a>]: htmlParseEntityRef: expecting ';' in http://www.php.net/, line: 118. PHP 5.0.0RC1-dev (WINNT).  at line 5
 [2004-01-15 17:20 UTC] rrichards@php.net
Please try using this CVS snapshot:

  http://snaps.php.net/php5-latest.tar.gz
 
For Windows:
 
  http://snaps.php.net/win32/php5-win32-latest.zip

If you encounter the error with a current snap, can you provide the results of:
<?php
  var_dump(stream_get_wrappers());
?>
 [2004-01-16 08:15 UTC] vania at pandorasdream dot com
Reran using PHP5 version available as of January 16, 2004, 8:05:07 AM East Coast Time, USA.  Results with the addition of the var_dump(stream_get_wrappers()); in the catch block:

<b>Exception encountered</b><br>ERROR [2]domdocument::loadHTMLFile() [<a href='function.loadHTMLFile'>function.loadHTMLFile</a>]: htmlParseEntityRef: expecting ';' in http://www.php.net/, line: 118. PHP 5.0.0RC1-dev (WINNT).  at line 5array(5) {
  [0]=>
  string(3) "php"
  [1]=>
  string(4) "file"
  [2]=>
  string(4) "http"
  [3]=>
  string(3) "ftp"
  [4]=>
  string(13) "compress.zlib"
}

Hope this helps...  Let me know if I can provide more info....
 [2004-01-16 12:31 UTC] rrichards@php.net
Can you remove the try/catch block and also stop suppressing the errors? Could you email me the errors? I believe you will have quite a few.
 [2004-01-16 13:39 UTC] vania at pandorasdream dot com
<?php
    $dom = new domdocument;
    $dom->loadHTMLFile("http://www.php.net/");
    $xp = new domxpath($dom);
    $result = $xp->query("/html/head/title");
    print $result[0]->firstChild->data;
?>
//test requested without try/catch or suppression.
//Emailed to rrichards@php.net per request, but since 
//it's not too much info, pasted here as well.  Above is 
//code block, and below is the result.

Warning: domdocument::loadHTMLFile() [function.loadHTMLFile]: htmlParseEntityRef: expecting ';' in http://www.php.net/, line: 134 in e:\PHP\projectcodewiki\www\test\loadhtmlsimple.php on line 3

Warning: domdocument::loadHTMLFile() [function.loadHTMLFile]: htmlParseEntityRef: expecting ';' in http://www.php.net/, line: 134 in e:\PHP\projectcodewiki\www\test\loadhtmlsimple.php on line 3

Warning: domdocument::loadHTMLFile() [function.loadHTMLFile]: htmlParseEntityRef: no name in http://www.php.net/, line: 228 in e:\PHP\projectcodewiki\www\test\loadhtmlsimple.php on line 3

Warning: domdocument::loadHTMLFile() [function.loadHTMLFile]: Unexpected end tag : p in http://www.php.net/, line: 349 in e:\PHP\projectcodewiki\www\test\loadhtmlsimple.php on line 3

Fatal error: Cannot use object of type domnodelist as array in e:\PHP\projectcodewiki\www\test\loadhtmlsimple.php on line 6
 [2004-01-16 15:07 UTC] vania at pandorasdream dot com
XPath queries have changed from returning arrays to returning node-lists....

@ suppression of libxml is still required, which is correct behavior for PHP5.

Vania
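
A minimal sketch of the resolution described above, under some assumptions: the inline HTML string is made up and stands in for http://www.php.net/ so the example runs without network access, and libxml_use_internal_errors() is only available from PHP 5.1 onward (on the 5.0 builds discussed in this report, @ suppression remains the only option). The node list returned by the XPath query is read with item() instead of array syntax.

```php
<?php
// Made-up markup standing in for the real page; note the unescaped "&".
$html = '<html><head><title>PHP: Hypertext Preprocessor</title></head>'
      . '<body><a href="diff.php?login=2&r1=1.1&r2=1.2">NEWS</a></body></html>';

// Collect libxml parse errors internally instead of emitting warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);

// Keep the collected errors for inspection, then restore the old setting.
$parseErrors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xp = new DOMXPath($dom);
$result = $xp->query('/html/head/title');

// query() returns a DOMNodeList, so use item() rather than $result[0].
$title = $result->item(0)->firstChild->data;
print $title;
```

Whether $parseErrors contains any entries depends on the libxml2 version in use, so the sketch does not rely on its contents.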
 