Request #47070 php_stream_locate_url_wrapper fails without authority section
Submitted: 2009-01-12 03:01 UTC
Modified: 2021-06-28 16:51 UTC
Votes:5
Avg. Score:4.8 ± 0.4
Reproduced:4 of 4 (100.0%)
Same Version:2 (50.0%)
Same OS:1 (25.0%)
From: darrel dot opry at gmail dot com
Assigned: cmb
Status: Wont fix
Package: Streams related
PHP Version: 5.2.8
OS: Ubuntu 8.10
Private report: No
CVE-ID: None
 [2009-01-12 03:01 UTC] darrel dot opry at gmail dot com
Description:
------------
php_stream_locate_url_wrapper fails without authority section.

If a URL is not in the Common Internet Scheme Syntax (<scheme>://<user>:<password>@<host>:<port>/<url-path>), the scheme is not located properly. As a result, URLs of the form scheme:relative/path and scheme:/absolute/path are not handled by the registered user-defined stream wrapper.

see http://www.ietf.org/rfc/rfc1738.txt for URL specific syntax.

see section 3 of http://labs.apache.org/webarch/uri/rfc/rfc3986.html#intro for URI syntax.

Reproduce code:
---------------
<?php

class wrapper {
  function stream_open() {
    print_r(func_get_args());
    return TRUE;
  }
}

stream_wrapper_register('public', 'wrapper');

fopen('public:path/file.txt', 'r+');
fopen('public:/path/file.txt', 'r+');
fopen('public://path/file.txt', 'r+');


Expected result:
----------------
I expect wrapper::stream_open to be called for each of the three fopen calls and to print_r its arguments.

Array
(
    [0] => public:path/file.txt
    [1] => r+
    [2] => 4
    [3] => 
)
Array
(
    [0] => public:/path/file.txt
    [1] => r+
    [2] => 4
    [3] => 
)
Array
(
    [0] => public://path/file.txt
    [1] => r+
    [2] => 4
    [3] => 
)

Actual result:
--------------
Warning: fopen(public:path/file.txt): failed to open stream: No such file or directory in /home/dopry/public_html/drupal-media/sites/default/modules/media/resource/test.php on line 13

Warning: fopen(public:/path/file.txt): failed to open stream: No such file or directory in /home/dopry/public_html/drupal-media/sites/default/modules/media/resource/test.php on line 14
Array
(
    [0] => public://path/file.txt
    [1] => r+
    [2] => 4
    [3] => 
)


History

 [2009-01-15 15:26 UTC] jani@php.net
RTFM: http://www.php.net/fopen (look for the explanation of what PHP expects the filename parameter to look like). You can put as many RFCs here as you like, but it's never said anywhere that fopen() expects some RFC-style scheme. :)
 [2009-01-15 16:14 UTC] darrel dot opry at gmail dot com
I did RTFM. As a developer and user of PHP I disagree with your 
assertion. The way streams parse the URL scheme is not the same as 
the way parse_url does, and parse_url is what is recommended for 
parsing incoming paths.

The documentation specifically uses the wording URL in many places, 
right to the point of having a setting for allow_url_fopen. 

The issue with the current approach is that it does not recognize 
legally delimited schemes properly and pass them to the underlying 
stream wrapper. It seems to make the invalid assumption that :// or 
// is the scheme delimiter when it is in fact :. It's a minor parsing 
issue that could probably be corrected by an experienced C developer 
in under 20 minutes.

Why is this an issue, you may ask?

parse_url properly parses the scheme, user, password, host, port, and 
path. 

If I want to use parse_url within my stream wrapper I have to 
concatenate the host and path elements if I use a URL in the form of 
file://path/to/file.txt or I have to use a url in the form of 
file://localhost/path/to/file.txt if I want to avoid this 
concatenation.
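
To illustrate the point about the authority section, here is a minimal C sketch of a generic scheme/authority/path split. It is not PHP's parse_url implementation; the names `url_parts` and `split_url` are hypothetical, and user info, port, query, and fragment are deliberately left unsplit. It shows why the first path segment of file://path/to/file.txt ends up in the host position, while file:path/to/file.txt has no authority at all.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: pointers into the original string plus lengths. */
typedef struct {
    const char *scheme;
    size_t scheme_len;
    const char *authority;   /* NULL when no "//" follows the ':' */
    size_t authority_len;
    const char *path;        /* NUL-terminated remainder of the URL */
    size_t path_len;
} url_parts;

/* Splits url at the first ':' and, if present, the "//" authority marker.
 * Returns 0 on success, -1 when there is no scheme delimiter. */
int split_url(const char *url, url_parts *out)
{
    const char *colon = strchr(url, ':');
    const char *rest;

    if (colon == NULL || colon == url) {
        return -1;
    }
    out->scheme = url;
    out->scheme_len = (size_t)(colon - url);

    rest = colon + 1;
    out->authority = NULL;
    out->authority_len = 0;
    if (strncmp(rest, "//", 2) == 0) {
        /* The authority runs up to the next '/', '?', '#', or the end. */
        rest += 2;
        out->authority = rest;
        out->authority_len = strcspn(rest, "/?#");
        rest += out->authority_len;
    }

    out->path = rest;
    out->path_len = strlen(rest);
    return 0;
}
```

With this split, "file://path/to/file.txt" yields the authority "path" and the path "/to/file.txt", which is the concatenation problem described above; "file:path/to/file.txt" yields no authority and the path "path/to/file.txt".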

A secondary impact: if I'm storing URLs to resources in the database, 
I now have to store the authority section as well. This is suboptimal 
for me as an application developer as the number of resources I'm 
storing references to increases. 

So maybe you should go RYOFM before dismissing something out of hand 
just because it references an RFC, without thinking of the 
implications.

Basically I think most developers really want their stream wrapper 
called if scheme: is properly designated, so they can write standards 
compliant code.
 [2009-01-15 16:25 UTC] darrel dot opry at gmail dot com
A note regarding backwards compatibility: this change shouldn't 
interfere with existing user-defined stream wrappers, since they 
are currently not called at all for URLs without the authority 
section.
 [2009-07-27 08:39 UTC] jani@php.net
Reclassified.
 [2011-04-08 18:11 UTC] jani@php.net
-Package: Feature/Change Request +Package: Streams related
 [2011-11-12 05:50 UTC] dopry at rynassociates dot com
I believe the code that needs to be modified is in main/streams/streams.c, 
in the function php_stream_locate_url_wrapper:

// Advance p over path while the current character is alphanumeric,
// '+', '-', or '.'; n counts the scheme characters seen so far.
for (p = path; isalnum((int)*p) || *p == '+' || *p == '-' || *p == '.'; p++) {
	n++;
}

// A scheme is accepted only if p points at ':', the scheme is longer than
// one character, and either "//" follows the ':' or the path starts with
// "data:".
if ((*p == ':') && (n > 1) && (!strncmp("//", p+1, 2) || (n == 4 && !memcmp("data:", path, 5)))) {
	protocol = path;
} else if (n == 5 && strncasecmp(path, "zlib:", 5) == 0) {
	/* BC with older php scripts and zlib wrapper */
	protocol = "compress.zlib";
	n = 13;
	php_error_docref(NULL TSRMLS_CC, E_WARNING, "Use of \"zlib:\" wrapper is deprecated; please use \"compress.zlib://\" instead");
}

I believe the solution is to remove the !strncmp("//", p+1, 2) check from the if 
statement above, so that // is no longer a required part of the validation. 
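
As a rough illustration of that change, the following standalone C sketch contrasts the quoted check with the relaxed one. It is not the streams.c code itself: the function names are hypothetical, and the zlib fallback and the wrapper lookup are left out.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Length of the leading run of scheme characters
 * (alphanumeric, '+', '-', '.'). */
static size_t scheme_run_length(const char *path)
{
    size_t n = 0;

    while (isalnum((unsigned char)path[n]) || path[n] == '+' ||
           path[n] == '-' || path[n] == '.') {
        n++;
    }
    return n;
}

/* Mirrors the quoted condition: the scheme only counts when "//" follows
 * the ':' or the path starts with "data:". Returns the scheme length,
 * or 0 when no scheme is recognized. */
size_t scheme_length_strict(const char *path)
{
    size_t n = scheme_run_length(path);

    if (path[n] == ':' && n > 1 &&
        (strncmp(path + n + 1, "//", 2) == 0 ||
         (n == 4 && memcmp(path, "data:", 5) == 0))) {
        return n;
    }
    return 0;
}

/* Proposed variant: the ':' alone delimits the scheme. The n > 1 test is
 * kept, so a single letter followed by ':' (such as a Windows drive
 * letter) is still not treated as a scheme. */
size_t scheme_length_relaxed(const char *path)
{
    size_t n = scheme_run_length(path);

    return (path[n] == ':' && n > 1) ? n : 0;
}
```

Under the strict check only public://path/file.txt from the reproduce code is recognized; under the relaxed check all three forms yield the scheme "public". Because the relaxed check accepts any multi-character scheme, the "data:" special case is no longer needed.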

The following logic for returning the stream wrapper looks like it can be 
further optimized by re-ordering/grouping some of the conditionals.
 [2011-11-12 07:29 UTC] darrel dot opry at gmail dot com
Here is my non-C-developer's take on how this function could be re-implemented to 
be a little easier to follow. My general take is that the 'file' scheme 
handling in this function (authority and path parsing) is out of place and 
should be moved to the actual plain files stream wrapper.

In general, I believe stream wrappers and the code using them could be made a 
little more efficient by fully parsing the URL in this function, possibly with 
parse_url, using the resulting scheme to select a stream wrapper, and passing 
the results of that same parse into the stream methods instead of the URL 
itself. At the very least, the pre-parsed authority and path sections could be 
passed into stream_open. This would save developers a duplicate call to 
parse_url from interpreted code.

/* {{{ php_stream_locate_url_wrapper */
PHPAPI php_stream_wrapper *php_stream_locate_url_wrapper(const char *path, char **path_for_open, int options TSRMLS_DC)
{
	HashTable *wrapper_hash = (FG(stream_wrappers) ? FG(stream_wrappers) : &url_stream_wrappers_hash);
	php_stream_wrapper **wrapper = NULL;
	const char *ptr_scheme_delimiter = NULL;
	const char *scheme = NULL;
	int scheme_delimiter_position = 0;
	int localhost = 0;

	if (path_for_open) {
		*path_for_open = (char*)path;
	}

	if (options & IGNORE_URL) {
		return (options & STREAM_LOCATE_WRAPPERS_ONLY) ? NULL : &php_plain_files_wrapper;
	}

	// Loop over path as long as we have valid scheme characters
	// (alphanumeric, '+', '-', '.') until we reach the scheme delimiter.
	for (ptr_scheme_delimiter = path; isalnum((int)*ptr_scheme_delimiter) || *ptr_scheme_delimiter == '+' || *ptr_scheme_delimiter == '-' || *ptr_scheme_delimiter == '.'; ptr_scheme_delimiter++) {
		scheme_delimiter_position++;
	}

	// A scheme is more than one character followed by ':'. With "//" no
	// longer required, the special case for "data:" (which never has "//")
	// is not needed either.
	if ((*ptr_scheme_delimiter == ':') && (scheme_delimiter_position > 1)) {
		scheme = path;
	}

	// Convert the zlib stream wrapper name. This should be removed, or
	// compress.zlib should be registered under both schemes until it is
	// officially removed.
	if (scheme && scheme_delimiter_position == 4 && !strncasecmp(scheme, "zlib", 4)) {
		scheme = "compress.zlib";
		scheme_delimiter_position = 13;
		php_error_docref(NULL TSRMLS_CC, E_WARNING, "Use of \"zlib:\" wrapper is deprecated; please use \"compress.zlib://\" instead");
	}

	// If we matched a scheme...
	if (scheme) {
		// ...attempt to look up the wrapper in the wrapper hash, using a
		// NUL-terminated copy of the scheme as the key.
		char *scheme_key = estrndup(scheme, scheme_delimiter_position);
		if (FAILURE == zend_hash_find(wrapper_hash, scheme_key, scheme_delimiter_position + 1, (void**)&wrapper)) {
			char wrapper_name[32];
			int name_length = scheme_delimiter_position;
			if (name_length >= (int)sizeof(wrapper_name)) {
				name_length = sizeof(wrapper_name) - 1;
			}
			PHP_STRLCPY(wrapper_name, scheme, sizeof(wrapper_name), name_length);
			php_error_docref(NULL TSRMLS_CC, E_WARNING, "Unable to find the wrapper \"%s\" - did you forget to enable it when you configured PHP?", wrapper_name);
			wrapper = NULL;
			scheme = NULL;
		}
		efree(scheme_key);
	}

	// If we didn't find a wrapper yet, we should try defaulting to the file scheme.
	if (!wrapper) {
		// If the caller asked for wrappers only, exit now before going any further.
		if (options & STREAM_LOCATE_WRAPPERS_ONLY) {
			if (options & REPORT_ERRORS) {
				php_error_docref(NULL TSRMLS_CC, E_WARNING, "Unable to find a suitable stream wrapper for %s.", path);
			}
			return NULL;
		}

		// Proceed on the assumption that we're working with files and
		// try to locate the wrapper.
		if (FG(stream_wrappers)) {
			/* Check again, the original check might have not known the protocol name */
			if (zend_hash_find(wrapper_hash, "file", sizeof("file"), (void**)&wrapper) == FAILURE) {
				if (options & REPORT_ERRORS) {
					php_error_docref(NULL TSRMLS_CC, E_WARNING, "file: wrapper is disabled in the server configuration");
				}
				return NULL;
			}
		}
	}

	// If we have a wrapper and matched a scheme and it wasn't file, handle
	// the URL restrictions.
	if (wrapper && scheme && (scheme_delimiter_position != 4 || strncasecmp(scheme, "file", 4) != 0)) {
		if ((*wrapper)->is_url && (options & STREAM_DISABLE_URL_PROTECTION) == 0
			&& (!PG(allow_url_fopen)
				|| (((options & STREAM_OPEN_FOR_INCLUDE) || PG(in_user_include)) && !PG(allow_url_include))
			)
		) {
			if (options & REPORT_ERRORS) {
				/* scheme[scheme_delimiter_position] probably isn't '\0' */
				char *scheme_dup = estrndup(scheme, scheme_delimiter_position);
				if (!PG(allow_url_fopen)) {
					php_error_docref(NULL TSRMLS_CC, E_WARNING, "%s: wrapper is disabled in the server configuration by allow_url_fopen=0", scheme_dup);
				} else {
					php_error_docref(NULL TSRMLS_CC, E_WARNING, "%s: wrapper is disabled in the server configuration by allow_url_include=0", scheme_dup);
				}
				efree(scheme_dup);
			}
			return NULL;
		}
		return *wrapper;
	}

	// If we've gotten here we're using file, whether it was defaulted or not.
	// The following code could be simplified if internalized to the php plain
	// files wrapper. In general the work of handling the authority and path
	// sections of the URI should be handed off to the stream wrappers. It
	// would be real nice if the stream wrapper methods received a parsed url
	// in their stream_open methods, saving developers the trouble of parsing
	// out authority and path in interpreted code.

	if (scheme && !strncmp(ptr_scheme_delimiter + 1, "//", 2)) {
		// file:// form: the authority section must be empty or localhost.
		if (!strncasecmp(path, "file://localhost/", 17)) {
			localhost = 1;
		}

		// Validate that we're only using localhost
#ifdef PHP_WIN32
		if (localhost == 0 && path[scheme_delimiter_position+3] != '\0' && path[scheme_delimiter_position+3] != '/' && path[scheme_delimiter_position+4] != ':') {
#else
		if (localhost == 0 && path[scheme_delimiter_position+3] != '\0' && path[scheme_delimiter_position+3] != '/') {
#endif
			if (options & REPORT_ERRORS) {
				php_error_docref(NULL TSRMLS_CC, E_WARNING, "remote host file access not supported, %s", path);
			}
			return NULL;
		}

		// so fix up the paths for files...
		if (path_for_open) {
			/* skip past protocol and :/, but handle windows correctly */
			*path_for_open = (char*)path + scheme_delimiter_position + 1;
			if (localhost == 1) {
				(*path_for_open) += 11;
			}
			while (*(++*path_for_open)=='/');
#ifdef PHP_WIN32
			if (*(*path_for_open + 1) != ':')
#endif
				(*path_for_open)--;
		}
	} else if (scheme && path_for_open) {
		// file:path or file:/path: there is no authority section, so
		// everything after the ':' is the path.
		*path_for_open = (char*)path + scheme_delimiter_position + 1;
	}

	return wrapper ? *wrapper : &php_plain_files_wrapper;
}
/* }}} */
 [2021-06-28 16:51 UTC] cmb@php.net
-Status: Open +Status: Wont fix -Assigned To: +Assigned To: cmb
 [2021-06-28 16:51 UTC] cmb@php.net
Given that this feature request has been open for more than ten years,
and that there hasn't been much traction, I'm closing as WONTFIX.

If anybody is still interested in this, please pursue
the RFC process[1].

[1] <https://wiki.php.net/rfc/howto>
 