Bug #25778 parse_url is not RFC 2396 compliant
Submitted: 2003-10-07 10:02 UTC Modified: 2003-10-08 08:01 UTC
Votes:1
Avg. Score:3.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:1 (100.0%)
From: gmirchev at usa dot net
Assigned:
Status: Wont fix
Package: URL related
PHP Version: 4.3.3
OS: Linux
Private report: No
CVE-ID: None
 [2003-10-07 10:02 UTC] gmirchev at usa dot net
Description:
------------
parse_url does not correctly handle these relative URLs:

a.cgi?keywords=6:54+
a.cgi?keywords=6:54

The reproduce code below contains a correct PHP implementation.


Reproduce code:
---------------
	function url_parse($url)
	{
		# Every component defaults to an empty string
		$parts = array(
			'scheme' => '',
			'host' => '',
			'port' => '',
			'user' => '',
			'pass' => '',
			'path' => '',
			'query' => '',
			'fragment' => ''
		);
		$authority = '';

		# Regular Expression from RFC 2396 (appendix B)
		preg_match('"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"', $url, $matches);

		# preg_match() omits trailing groups that did not match, hence the key checks
		if (array_key_exists(2, $matches)) $parts['scheme'] = $matches[2];
		if (array_key_exists(4, $matches)) $authority = $matches[4];
		if (array_key_exists(5, $matches)) $parts['path'] = $matches[5];
		if (array_key_exists(7, $matches)) $parts['query'] = $matches[7];
		if (array_key_exists(9, $matches)) $parts['fragment'] = $matches[9];

		# Extract username, password, host and port from authority
		preg_match('"(([^:@]*)(:([^:@]*))?@)?([^:]*)(:(.*))?"', $authority, $matches);

		if (array_key_exists(2, $matches)) $parts['user'] = $matches[2];
		if (array_key_exists(4, $matches)) $parts['pass'] = $matches[4];
		if (array_key_exists(5, $matches)) $parts['host'] = $matches[5];
		if (array_key_exists(7, $matches)) $parts['port'] = $matches[7];

		return $parts;
	}
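
To make the expected result concrete, here is a minimal sketch that applies the RFC 2396 appendix B expression to the two relative URLs from the description. The helper name split_uri_reference() is hypothetical and not part of PHP or of the code above; it only illustrates that the ":" inside the query string should not be taken for a scheme delimiter.

```php
<?php
// Hypothetical helper (an assumption for illustration): split a URI reference
// with the RFC 2396 appendix B expression and return its components.
function split_uri_reference($url)
{
    preg_match('~^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?~', $url, $m);
    // preg_match() drops trailing groups that did not participate,
    // so pad the result to all ten entries (indexes 0 to 9).
    $m = array_pad($m, 10, '');
    return array(
        'scheme'    => $m[2],
        'authority' => $m[4],
        'path'      => $m[5],
        'query'     => $m[7],
        'fragment'  => $m[9],
    );
}

foreach (array('a.cgi?keywords=6:54+', 'a.cgi?keywords=6:54') as $url) {
    $parts = split_uri_reference($url);
    echo $url, ' => path=', $parts['path'], ' query=', $parts['query'], "\n";
}
```

With this expression both inputs yield an empty scheme, the path "a.cgi", and the full "keywords=..." string as the query.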



 [2003-10-08 07:32 UTC] gmirchev at usa dot net
No! Those are URLs. Please read the RFC.


RFC 2396 section 4:

URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
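
As a rough illustration of that rule, the sketch below classifies a reference as absolute or relative by testing for a leading scheme, using the scheme part of the appendix B expression. The function name classify_uri_reference() is hypothetical, and the scheme test is looser than the RFC's formal scheme grammar, so treat it as an approximation rather than a validator.

```php
<?php
// Hypothetical helper (an assumption for illustration): decide which branch
// of "URI-reference = [ absoluteURI | relativeURI ]" a reference falls under.
function classify_uri_reference($ref)
{
    // A scheme is taken to be a run of characters free of ":", "/", "?" and "#"
    // that is immediately followed by ":" (first group of appendix B).
    if (preg_match('~^[^:/?#]+:~', $ref)) {
        return 'absoluteURI';
    }
    // No scheme: treated as a relative reference here. This simplification
    // does not single out empty or fragment-only references.
    return 'relativeURI';
}

echo classify_uri_reference('http://example.com/a.cgi?keywords=6:54'), "\n";
echo classify_uri_reference('a.cgi?keywords=6:54'), "\n";
```

The second call reports relativeURI because the "?" precedes the ":" in "6:54", so the colon can only belong to the query.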

also RFC 2396:

1.2. URI, URL, and URN

   A URI can be further classified as a locator, a name, or both.  The
   term "Uniform Resource Locator" (URL) refers to the subset of URI
   that identify resources via a representation of their primary access
   mechanism (e.g., their network "location"), rather than identifying
   the resource by name or by some other attribute(s) of that resource.
   The term "Uniform Resource Name" (URN) refers to the subset of URI
   that are required to remain globally unique and persistent even when
   the resource ceases to exist or becomes unavailable.
 [2003-10-08 08:01 UTC] wez@php.net
parse_url() is only intended to be used with fully qualified URLs.