Bug #26926 DOMXML LoadHTML throws exception on URL with "&"
Submitted: 2004-01-15 15:57 UTC Modified: 2004-01-16 15:07 UTC
Votes:1
Avg. Score:4.0 ± 0.0
Reproduced:0 of 0 (0.0%)
From: vania at pandorasdream dot com Assigned:
Status: Not a bug Package: DOM XML related
PHP Version: 5.0.0b3 (beta3) OS: Windows XP
Private report: No CVE-ID: None
 [2004-01-15 15:57 UTC] vania at pandorasdream dot com
Description:
------------
When I run the example for LoadHTML against www.php.net, parsing fails on a URL that contains an ampersand (line 118: <a href="http://cvs.php.net/diff.php/php-src/NEWS?login=2&r1=1.1247.2.452&r2=1.1247.2.522">NEWS</a> file.).  The error says a ";" is expected, apparently because the parser treats the "&" as the start of an entity.

When I remove the try/catch, no errors occur, but nothing gets printed.  Using a different URL to a file without the & works correctly.

Reproduce code:
---------------
<?php
try
{
    $dom = new domdocument;
    @$dom->loadHTMLFile("http://www.php.net/");
    $xp = new domxpath($dom);
    $result = $xp->query("/html/head/title");
    print $result[0]->firstChild->data;
}
catch (exception $exc)
{
    print $exc->getMessage()." at line ".$exc->getLine();
}

?>

Expected result:
----------------
PHP: Hypertext Preprocessor

Actual result:
--------------
<?php
try
{
    $dom = new domdocument;
    @$dom->loadHTMLFile("http://www.php.net/");
    $xp = new domxpath($dom);
    $result = $xp->query("/html/head/title");
    print $result[0]->firstChild->data;
}
catch (exception $exc)
{
    print $exc->getMessage()." at line ".$exc->getLine();
}

?>

 [2004-01-15 16:02 UTC] vania at pandorasdream dot com
Sorry...  Wrong copy/paste for the actual result...

ERROR [2]domdocument::loadHTMLFile() [<a href='function.loadHTMLFile'>function.loadHTMLFile</a>]: htmlParseEntityRef: expecting ';' in http://www.php.net/, line: 118. PHP 5.0.0RC1-dev (WINNT).  at line 5
 [2004-01-15 17:20 UTC] rrichards@php.net
Please try using this CVS snapshot:

  http://snaps.php.net/php5-latest.tar.gz
 
For Windows:
 
  http://snaps.php.net/win32/php5-win32-latest.zip

If you encounter the error with a current snap, can you provide the results of:
<?php
  var_dump(stream_get_wrappers());
?>
 [2004-01-16 08:15 UTC] vania at pandorasdream dot com
Reran using PHP5 version available as of January 16, 2004, 8:05:07 AM East Coast Time, USA.  Results with the addition of the var_dump(stream_get_wrappers()); in the catch block:

<b>Exception encountered</b><br>ERROR [2]domdocument::loadHTMLFile() [<a href='function.loadHTMLFile'>function.loadHTMLFile</a>]: htmlParseEntityRef: expecting ';' in http://www.php.net/, line: 118. PHP 5.0.0RC1-dev (WINNT).  at line 5array(5) {
  [0]=>
  string(3) "php"
  [1]=>
  string(4) "file"
  [2]=>
  string(4) "http"
  [3]=>
  string(3) "ftp"
  [4]=>
  string(13) "compress.zlib"
}

Hope this helps...  Let me know if I can provide more info....
 [2004-01-16 12:31 UTC] rrichards@php.net
Can you remove the try/catch block and stop suppressing the errors? Please email me the errors, as I believe you will have quite a few.
 [2004-01-16 13:39 UTC] vania at pandorasdream dot com
<?php
    $dom = new domdocument;
    $dom->loadHTMLFile("http://www.php.net/");
    $xp = new domxpath($dom);
    $result = $xp->query("/html/head/title");
    print $result[0]->firstChild->data;
?>
//test requested without try/catch or suppression.
//Emailed to rrichards@php.net per request, but since 
//it's not too much info, pasted here as well.  Above is 
//code block, and below is the result.

Warning: domdocument::loadHTMLFile() [function.loadHTMLFile]: htmlParseEntityRef: expecting ';' in http://www.php.net/, line: 134 in e:\PHP\projectcodewiki\www\test\loadhtmlsimple.php on line 3

Warning: domdocument::loadHTMLFile() [function.loadHTMLFile]: htmlParseEntityRef: expecting ';' in http://www.php.net/, line: 134 in e:\PHP\projectcodewiki\www\test\loadhtmlsimple.php on line 3

Warning: domdocument::loadHTMLFile() [function.loadHTMLFile]: htmlParseEntityRef: no name in http://www.php.net/, line: 228 in e:\PHP\projectcodewiki\www\test\loadhtmlsimple.php on line 3

Warning: domdocument::loadHTMLFile() [function.loadHTMLFile]: Unexpected end tag : p in http://www.php.net/, line: 349 in e:\PHP\projectcodewiki\www\test\loadhtmlsimple.php on line 3

Fatal error: Cannot use object of type domnodelist as array in e:\PHP\projectcodewiki\www\test\loadhtmlsimple.php on line 6
 [2004-01-16 15:07 UTC] vania at pandorasdream dot com
XPath queries have changed from returning arrays to returning node lists, so $result[0] no longer works. That explains the fatal error above.

@ suppression of the libxml warnings is still required, which is correct behavior for PHP 5.

Vania
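
For later readers, here is a minimal sketch of the reproduce code adjusted for the node-list return value. It is not taken from the thread: the inline $html string and the example.com URL are hypothetical stand-ins for the fetched page, and libxml_use_internal_errors() is used in place of @, which assumes PHP 5.1 or later.

```php
<?php
// Hypothetical inline markup standing in for http://www.php.net/; the href
// contains the same kind of unescaped "&" that produced the warnings.
$html = '<html><head><title>PHP: Hypertext Preprocessor</title></head>'
      . '<body><a href="http://example.com/diff.php?login=2&r1=1.1&r2=1.2">NEWS</a></body></html>';

// Collect libxml parse errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

// Keep the collected errors for inspection, then restore the old setting.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xp = new DOMXPath($dom);
$result = $xp->query('/html/head/title');

// query() returns a DOMNodeList, so use item() rather than array syntax.
$title = null;
if ($result !== false && $result->length > 0) {
    $title = $result->item(0)->textContent;
}

print $title;
```

Whether $errors contains any entries depends on the libxml2 version in use, so the sketch only collects them and does not rely on their count.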