Doc Bug #55374 DOMDocument::LoadHTMLFile fails with %xx sequences in filename.
Submitted: 2011-08-06 06:37 UTC Modified: 2013-12-02 16:34 UTC
Votes:5
Avg. Score:4.4 ± 0.8
Reproduced:4 of 4 (100.0%)
Same Version:1 (25.0%)
Same OS:2 (50.0%)
From: keithm at aoeex dot com Assigned:
Status: Open Package: DOM XML related
PHP Version: 5.4.0alpha3 OS: Linux
Private report: No CVE-ID: None

 [2011-08-06 06:37 UTC] keithm at aoeex dot com
Description:
------------
DOMDocument::LoadHTMLFile appears to urldecode its argument, which causes 
problems when attempting to load a file whose name contains a %xx sequence.

This issue was brought up in ##php on freenode when someone was attempting to load 
a file named 'Linux_Files%2Fetc%2Fbash.bashrc.html'.  The suggested workaround was to 
use LoadHTML + file_get_contents instead.
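
A minimal sketch of that workaround, assuming the file is readable from the 
local filesystem. The helper name loadHtmlFromLiteralPath() is hypothetical 
and not part of PHP; it reads the file with file_get_contents(), which uses 
the name literally, and hands the markup to loadHTML() so the filename never 
reaches libxml.

```php
<?php

// Hypothetical helper, not part of PHP: load an HTML file whose name may
// contain %xx sequences without passing the name to loadHTMLFile().
function loadHtmlFromLiteralPath($path)
{
    // file_get_contents() opens the path as given, '%' characters included.
    $html = @file_get_contents($path);
    if ($html === false) {
        throw new RuntimeException("Unable to read file: $path");
    }

    // Parse the markup from the string instead of from a filename.
    $doc = new DOMDocument();
    if ($doc->loadHTML($html) === false) {
        throw new RuntimeException("Unable to parse HTML from: $path");
    }

    return $doc;
}
```

With the file from the test script below, 
loadHtmlFromLiteralPath('Linux_Files%2Fetc%2Fbash.bashrc.html') should return 
a document containing one body element.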

There was a small debate over whether this is a bug, or just a documentation 
problem (perhaps LoadHTMLFile expects a URL).

DOMDocument::Load() is also affected.

Test script:
---------------
Contents of 'Linux_Files%2Fetc%2Fbash.bashrc.html'

---------------------------------------8<---------------------------------------
<html>
 <head>
  <title></title>
 </head>
 <body>
 </body>
</html>
---------------------------------------8<---------------------------------------


Contents of 'test.php'
---------------------------------------8<---------------------------------------
<?php

$file = 'Linux_Files%2Fetc%2Fbash.bashrc.html';

$doc = new DOMDocument();
$doc->loadHTMLFile($file);
var_dump($doc->getElementsByTagName('body')->length);

echo str_repeat('-', 80), "\r\n";

$doc2 = new DOMDocument();
$doc2->loadHTMLFile(urlencode($file));
var_dump($doc2->getElementsByTagName('body')->length);
---------------------------------------8<---------------------------------------


Expected result:
----------------
Expect the ->loadHTMLFile($file) call to succeed and the 
->loadHTMLFile(urlencode($file)) call to fail with a file-not-found type error.

Actual result:
--------------
->loadHTMLFile($file) fails with errors:

PHP Warning:  DOMDocument::loadHTMLFile(): I/O warning : failed to load external 
entity "Linux_Files%2Fetc%2Fbash.bashrc.html" in /home/kicken/test.php on line 6

Warning: DOMDocument::loadHTMLFile(): I/O warning : failed to load external entity 
"Linux_Files%2Fetc%2Fbash.bashrc.html" in /home/kicken/test.php on line 6


->loadHTMLFile(urlencode($file)) succeeds.


History

 [2013-12-02 16:34 UTC] mike@php.net
-Type: Bug +Type: Documentation Problem
 