Bug #66813 Detect some valid url as invalid in parse_url
Submitted: 2014-03-03 09:46 UTC Modified: 2015-05-01 17:42 UTC
Votes:2
Avg. Score:4.0 ± 1.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:0 (0.0%)
From: fzerorubigd at gmail dot com Assigned: cmb (profile)
Status: Closed Package: *General Issues
PHP Version: 5.5.9 OS: Linux x64
Private report: No CVE-ID: None
 [2014-03-03 09:46 UTC] fzerorubigd at gmail dot com
Description:
------------
This is the result of parsing a URL like this:

var_dump(parse_url("//example.com/web:o"));

==>
array(2) {
  'host' =>
  string(11) "example.com"
  'path' =>
  string(6) "/web:o"
}

But if I replace the :o with :1, the result is false:

var_dump(parse_url("//example.com/web:1"));
==> 
bool(false)

It is also parsed correctly if I use a large integer value:

 var_dump(parse_url("//example.com/web:1232323232"));

==>
array(2) {
  'host' =>
  string(11) "example.com"
  'path' =>
  string(15) "/web:1232323232"
}



Test script:
---------------
<?php
//Correct:
var_dump(parse_url("//example.com/web:o"));
//Incorrect: 
var_dump(parse_url("//example.com/web:1"));
//Correct:
var_dump(parse_url("//example.com/web:1232323232"));


Expected result:
----------------
array(2) {
  'host' =>
  string(11) "example.com"
  'path' =>
  string(6) "/web:o"
}
array(2) {
  'host' =>
  string(11) "example.com"
  'path' =>
  string(6) "/web:1"
}
array(2) {
  'host' =>
  string(11) "example.com"
  'path' =>
  string(15) "/web:1232323232"
}



Actual result:
--------------
array(2) {
  'host' =>
  string(11) "example.com"
  'path' =>
  string(6) "/web:o"
}
bool(false)
array(2) {
  'host' =>
  string(11) "example.com"
  'path' =>
  string(15) "/web:1232323232"
}


History
 [2014-03-06 13:19 UTC] cmbecker69 at gmx dot de
According to the documentation of parse_url()[1] this is not a
bug:

| Partial URLs are also accepted, parse_url() tries its best to
| parse them correctly.

FWIW, the current behavior is caused by php_url_parse_ex()[2],
which deems the colon to mark the beginning of the port component
and rejects all such URLs with up to 5 digits (and nothing
else) after the colon as invalid, because the path component seems
to be missing.
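
The rule described above can be approximated in userland to predict which
inputs were affected. This is only a rough sketch of the description in this
comment, not a port of the C code in php_url_parse_ex(); the helper name
looks_like_port_suffix() is hypothetical, and it assumes "up to 5 digits"
means one to five digits at the very end of the string:

```php
<?php
// Hypothetical helper (not part of PHP): returns true when the string ends
// in a colon followed by 1 to 5 digits and nothing else, which is the shape
// this comment says was mistaken for a port with a missing path.
function looks_like_port_suffix($url)
{
    return preg_match('/:\d{1,5}\z/', $url) === 1;
}

var_dump(looks_like_port_suffix('//example.com/web:o'));          // letter after the colon
var_dump(looks_like_port_suffix('//example.com/web:1'));          // one digit after the colon
var_dump(looks_like_port_suffix('//example.com/web:1232323232')); // ten digits after the colon
```

Under that assumption, only the second URL from the report matches, which
agrees with the results posted by the submitter.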

Simple workaround for userland: just prepend any (valid) scheme
component and ignore the scheme element in the result:

  var_dump(parse_url("http://example.com/web:1"));

[1] <http://www.php.net/manual/en/function.parse-url.php>
[2] <http://lxr.php.net/xref/PHP_5_5/ext/standard/url.c#php_url_parse_ex>
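
The workaround above could be wrapped in a small helper along these lines.
This is a minimal sketch: parse_scheme_relative_url() is a hypothetical name,
and "http:" is an arbitrary placeholder scheme that is dropped again from the
result:

```php
<?php
// Hypothetical helper (not part of PHP): parse a scheme-relative URL by
// temporarily prepending a placeholder scheme and removing it afterwards.
function parse_scheme_relative_url($url)
{
    // Only URLs starting with "//" get the placeholder scheme.
    if (strncmp($url, '//', 2) !== 0) {
        return parse_url($url);
    }

    $parts = parse_url('http:' . $url);
    if ($parts === false) {
        return false;
    }

    // The scheme was never part of the input, so drop it from the result.
    unset($parts['scheme']);
    return $parts;
}

var_dump(parse_scheme_relative_url('//example.com/web:1'));
```

URLs that already carry a scheme, or that do not start with "//", are passed
to parse_url() unchanged.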
 [2014-07-06 02:39 UTC] yohgaki@php.net
To parse it correctly, you need to URL-encode the colon:

[yohgaki@dev PHP-5.5]$ php -r 'var_dump(rawurlencode(":"));'
string(3) "%3A"

parse_url() could be changed to handle ':' in the path, but that is unlikely to happen unless someone writes a patch and sends a pull request. Please note that there is a good chance such a patch would be rejected.
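
As an illustration of the encoding approach, the path could be percent-encoded
segment by segment before the URL is assembled, and decoded again after
parsing. This is a sketch only; encode_path_segments() is a hypothetical
helper, and it assumes the caller builds the URL from a known host and path:

```php
<?php
// Hypothetical helper (not part of PHP): percent-encode each path segment so
// a literal ':' in the path cannot be mistaken for a port separator.
function encode_path_segments($path)
{
    return implode('/', array_map('rawurlencode', explode('/', $path)));
}

$url   = '//example.com' . encode_path_segments('/web:1');
$parts = parse_url($url);

// parse_url() does not decode components, so decode the path afterwards.
$path = rawurldecode($parts['path']);

var_dump($url, $parts, $path);
```

The "/" separators are kept as they are because only the individual segments
are passed through rawurlencode().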
 [2015-05-01 17:42 UTC] cmb@php.net
-Status: Open
+Status: Closed
-Assigned To:
+Assigned To: cmb
 [2015-05-01 17:42 UTC] cmb@php.net
Thank you for your bug report. This issue has already been fixed
in the latest released version of PHP, which you can download at 
http://www.php.net/downloads.php

fixed in 5.5.24 and 5.6.8
 
Last updated: Wed Dec 04 08:01:29 2024 UTC