Bug #66813 Detect some valid url as invalid in parse_url
Submitted: 2014-03-03 09:46 UTC Modified: 2015-05-01 17:42 UTC
Votes:2
Avg. Score:4.0 ± 1.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:0 (0.0%)
From: fzerorubigd at gmail dot com Assigned: cmb
Status: Closed Package: *General Issues
PHP Version: 5.5.9 OS: Linux x64
Private report: No CVE-ID: None
 [2014-03-03 09:46 UTC] fzerorubigd at gmail dot com
Description:
------------
Parsing a scheme-relative URL whose path contains a colon followed by a letter gives the expected result:

var_dump(parse_url("//example.com/web:o"));

==>
array(2) {
  'host' =>
  string(11) "example.com"
  'path' =>
  string(6) "/web:o"
}

However, if I replace the :o with :1, the result is false:

var_dump(parse_url("//example.com/web:1"));
==> 
bool(false)

The result is also correct if I use a large integer value:

 var_dump(parse_url("//example.com/web:1232323232"));

==>
array(2) {
  'host' =>
  string(11) "example.com"
  'path' =>
  string(15) "/web:1232323232"
}



Test script:
---------------
<?php
//Correct:
var_dump(parse_url("//example.com/web:o"));
//Incorrect: 
var_dump(parse_url("//example.com/web:1"));
//Correct:
var_dump(parse_url("//example.com/web:1232323232"));


Expected result:
----------------
array(2) {
  'host' =>
  string(11) "example.com"
  'path' =>
  string(6) "/web:o"
}
array(2) {
  'host' =>
  string(11) "example.com"
  'path' =>
  string(6) "/web:1"
}
array(2) {
  'host' =>
  string(11) "example.com"
  'path' =>
  string(15) "/web:1232323232"
}



Actual result:
--------------
array(2) {
  'host' =>
  string(11) "example.com"
  'path' =>
  string(6) "/web:o"
}
bool(false)
array(2) {
  'host' =>
  string(11) "example.com"
  'path' =>
  string(15) "/web:1232323232"
}


 [2014-03-06 13:19 UTC] cmbecker69 at gmx dot de
According to the documentation of parse_url()[1] this is not a
bug:

| Partial URLs are also accepted, parse_url() tries its best to
| parse them correctly.

FWIW, the current behavior is caused by php_url_parse_ex()[2],
which takes the colon to mark the beginning of the port component
and rejects as invalid every such URL that has up to 5 digits (and
nothing else) after the colon, because the path component seems
to be missing.

Simple workaround for userland: just prepend any (valid) scheme
component and ignore the scheme element in the result:

  var_dump(parse_url("http://example.com/web:1"));

[1] <http://www.php.net/manual/en/function.parse-url.php>
[2] <http://lxr.php.net/xref/PHP_5_5/ext/standard/url.c#php_url_parse_ex>
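
The workaround above could be wrapped in a small helper. This is only a minimal sketch: the function name parse_scheme_relative_url() is hypothetical (it is not part of PHP), and it assumes the input starts with "//" whenever the scheme is missing.

```php
<?php
// Hypothetical helper (not part of PHP): parse a scheme-relative URL by
// temporarily prepending a placeholder scheme, then dropping it again.
function parse_scheme_relative_url($url)
{
    // Only touch URLs that start with "//", i.e. have no scheme of their own.
    if (strncmp($url, '//', 2) !== 0) {
        return parse_url($url);
    }

    // "http:" is an arbitrary valid scheme; any other valid scheme would do.
    $parts = parse_url('http:' . $url);
    if ($parts === false) {
        return false;
    }

    // The scheme was only a placeholder, so remove it from the result.
    unset($parts['scheme']);
    return $parts;
}

var_dump(parse_scheme_relative_url('//example.com/web:1'));
```

URLs that already carry a scheme are passed to parse_url() unchanged, so the helper should be safe to use for both kinds of input.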
 [2014-07-06 02:39 UTC] yohgaki@php.net
To parse it correctly, you need to URL-encode the colon.

[yohgaki@dev PHP-5.5]$ php -r 'var_dump(rawurlencode(":"));'
string(3) "%3A"

It could be coded to handle ':' in the path, but the chances of that are slim unless someone writes a patch and sends a pull request. Please note that there is a good chance that such a patch would be rejected.
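
A rough illustration of the URL-encoding suggestion above, assuming the caller builds the URL from a known path. The helper name encode_path_segments() is hypothetical, and the sketch assumes the path contains no query string or fragment.

```php
<?php
// Hypothetical helper: percent-encode each path segment so that a literal
// ":" in the path cannot be mistaken for the start of a port component.
function encode_path_segments($path)
{
    // Encode the segments individually so the "/" separators are kept.
    return implode('/', array_map('rawurlencode', explode('/', $path)));
}

$url = '//example.com' . encode_path_segments('/web:1');
// $url is now "//example.com/web%3A1"

$parts = parse_url($url);

// parse_url() does not decode the components, so decode the path if the
// original characters are needed.
$path = rawurldecode($parts['path']);
```

Note that the encoded form has to be produced before parse_url() is called; decoding afterwards only restores the original characters in the extracted path.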
 [2015-05-01 17:42 UTC] cmb@php.net
-Status: Open
+Status: Closed
-Assigned To:
+Assigned To: cmb
 [2015-05-01 17:42 UTC] cmb@php.net
Thank you for your bug report. This issue has already been fixed
in the latest released version of PHP, which you can download at 
http://www.php.net/downloads.php

Fixed in PHP 5.5.24 and 5.6.8.