Bug #54759 parse_url function turns a tab character into an underscore
Submitted: 2011-05-16 22:07 UTC
Modified: 2011-05-16 23:13 UTC
From: alexander_leroux at mcafee dot com
Assigned:
Status: Not a bug
Package: HTTP related
PHP Version: 5.3.6
OS: Windows Server 2008 R2
Private report: No
CVE-ID: None
 [2011-05-16 22:07 UTC] alexander_leroux at mcafee dot com
Description:
------------
The parse_url function turns a tab character into an underscore, which makes an invalid URL look like a valid one. If you validate user input with parse_url, the input passes validation even though a tab is an invalid character in a host name or FQDN. The host in the example below contains a tab character between "test" and "url".

Test script:
---------------
<?php
$host = "http://test\turl"; // "\t" is the tab character
var_dump(parse_url($host));
?>

Expected result:
----------------
array(2) { ["scheme"]=> string(4) "http" ["host"]=> string(8) "test	url" } 
or 
array(2) { ["scheme"]=> string(4) "http" ["host"]=> string(8) "test\turl" } 

Actual result:
--------------
array(2) { ["scheme"]=> string(4) "http" ["host"]=> string(8) "test_url" } 

 [2011-05-16 22:24 UTC] dtajchreber@php.net
-Status: Open
+Status: Bogus
 [2011-05-16 22:25 UTC] dtajchreber@php.net
php.net/parse_url explains all of this:

"This function is not meant to validate the given URL, it only breaks it up into 
the above listed parts. Partial URLs are also accepted, 
parse_url() tries its best to parse them correctly."

"The URL to parse. Invalid characters are replaced by _."

filter_var($var, FILTER_VALIDATE_URL) should do what you're looking for. 

[1] php.net/filter_var
[2] http://codepad.viper-7.com/M2LQor
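
A minimal sketch of the difference described above, reusing the reporter's test URL. The variable names are illustrative, and the filter_var() result assumes a stock PHP build with the filter extension enabled:

```php
<?php
$url = "http://test\turl"; // "\t" is a tab character

// parse_url() only splits the string into parts; it does not validate.
// The tab in the host comes back replaced by "_".
$parts = parse_url($url);
var_dump($parts['host']); // string(8) "test_url"

// filter_var() validates the input instead of repairing it,
// so the same string is rejected.
$valid = filter_var($url, FILTER_VALIDATE_URL);
var_dump($valid); // bool(false)
```

Checking the parse_url() output alone therefore cannot tell "test_url" apart from a host that originally contained a tab.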
 [2011-05-16 23:13 UTC] alexander_leroux at mcafee dot com
filter_var($var, FILTER_VALIDATE_URL) doesn't handle IPv6 addresses, so I can't use
it. I've worked around the problem. I just didn't realize that invalid characters
were being replaced with _.
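
The reporter does not say how the problem was worked around. One possible approach, shown only as a sketch, is to reject control characters before calling parse_url() and to check bracketed IPv6 hosts separately. parse_url_strict() and is_ipv6_host() are hypothetical helpers, not PHP built-ins:

```php
<?php
// Hypothetical helper: refuse input containing ASCII control characters
// (tab, newline, DEL, ...) instead of letting parse_url() turn them into "_".
function parse_url_strict($url)
{
    if (preg_match('/[\x00-\x1F\x7F]/', $url)) {
        return false;
    }
    return parse_url($url);
}

// Hypothetical helper: true when $host is a bracketed IPv6 literal such as "[::1]".
function is_ipv6_host($host)
{
    if (strlen($host) < 2 || $host[0] !== '[' || substr($host, -1) !== ']') {
        return false;
    }
    $address = substr($host, 1, -1); // drop the surrounding brackets
    return filter_var($address, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6) !== false;
}

var_dump(parse_url_strict("http://test\turl")); // bool(false)

$parts = parse_url_strict("http://[::1]/index.php");
var_dump(is_ipv6_host($parts['host']));         // bool(true)
```

Rejecting the raw string first keeps the check independent of how parse_url() rewrites invalid characters.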
 