Doc Bug #77889 URL in Location header not used
Submitted: 2019-04-14 04:06 UTC
Modified: 2021-02-16 18:09 UTC
From: ASchmidt at Anamera dot net
Assigned: cmb
Status: Closed
Package: Streams related
PHP Version: 7.2.17
OS: Windows x64
Private report: No
CVE-ID: None
 [2019-04-14 04:06 UTC] ASchmidt at Anamera dot net
Description:
------------
For $host = 'www.finecars.cc', the HTML content is correctly received.

For $host = 'finecars.cc', the initial response is:
HTTP/1.1 301 Moved Permanently
Location: http://www.finecars.cc/

However, after that, PHP does NOT actually retrieve "www.finecars.cc" as defined in the Location header, but continues to retry the original URL "finecars.cc" until 'max_redirects' is exhausted.

Test script:
---------------
<?php
declare(strict_types=1);

$host = 'finecars.cc';

$headers = array(
    'Host'              =>  $host,
    'User-Agent'        =>  'Anamera/2.0',
    'Accept-Charset'    =>  'UTF-8',
    'Referer'           =>  ( '0' == $_SERVER['SERVER_PORT_SECURE'] ? 'http' : 'https' )."://{$_SERVER['SERVER_NAME']}{$_SERVER['REQUEST_URI']}",
    'Connection'        =>  'close',
    'Origin'            =>  ( '0' == $_SERVER['SERVER_PORT_SECURE'] ? 'http' : 'https' )."://{$_SERVER['SERVER_NAME']}",
);

$header_lines = [];
foreach ( $headers as $name => $entry )
    $header_lines[] = "{$name}: {$entry}";

$http_options = array(
    'http' => array(
        'protocol_version'  =>  1.1,
        'timeout'           =>  30,         // float: seconds.
        'follow_location'   =>  1,
        'max_redirects'     =>  5,
        'ignore_errors'     =>  true,       // fetch content even on failure status codes.

        'method'            =>  'GET',
        'header'            =>  $header_lines,
//      'user_agent'        =>  '',         // use: 'header'['User-Agent'].
//      'content'           =>  '',         // for POST and PUT.
    ),
);

$http_context = stream_context_create( $http_options );
$file = file_get_contents( "http://{$host}", false, $http_context );

var_dump( $http_options, $http_response_header );
die( 'Current PHP version: ' . phpversion() );
?>


Expected result:
----------------
  0 => string 'HTTP/1.1 301 Moved Permanently' (length=30)
  1 => string 'Content-Type: text/html; charset=UTF-8' (length=38)
  2 => string 'Location: http://www.finecars.cc/' (length=33)
  3 => string 'Connection: close' (length=17)
  4 => string 'Content-Length: 146' (length=19)
  5 => string 'HTTP/1.1 200 OK' (length=15)
  6 => string 'Content-Type: text/html;charset=iso-8859-1' (length=42)
  7 => string 'Connection: close' (length=17)
  8 => string 'Content-Length: 87608' (length=21)



Actual result:
--------------
  0 => string 'HTTP/1.1 301 Moved Permanently' (length=30)
  1 => string 'Content-Type: text/html; charset=UTF-8' (length=38)
  2 => string 'Location: http://www.finecars.cc/' (length=33)
  3 => string 'Connection: close' (length=17)
  4 => string 'Content-Length: 146' (length=19)
  5 => string 'HTTP/1.1 301 Moved Permanently' (length=30)
  6 => string 'Content-Type: text/html; charset=UTF-8' (length=38)
  7 => string 'Location: http://www.finecars.cc/' (length=33)
  8 => string 'Connection: close' (length=17)
  9 => string 'Content-Length: 146' (length=19)  
  10 => string 'HTTP/1.1 301 Moved Permanently' (length=30)
  11 => string 'Content-Type: text/html; charset=UTF-8' (length=38)
  12 => string 'Location: http://www.finecars.cc/' (length=33)
  13 => string 'Connection: close' (length=17)
  14 => string 'Content-Length: 146' (length=19)
etc.
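
The repeated blocks in the actual result can also be checked programmatically. The following is a minimal sketch, assuming a flat header array shaped like `$http_response_header` above; `split_responses` and `redirect_targets` are hypothetical helper names, not PHP built-ins, and the sample data is shortened from the output shown here.

```php
<?php
declare(strict_types=1);

/**
 * Split a flat header array (shaped like $http_response_header) into one
 * sub-array per response. Each "HTTP/" status line starts a new response.
 */
function split_responses(array $lines): array
{
    $responses = [];
    foreach ($lines as $line) {
        if (strncmp($line, 'HTTP/', 5) === 0) {
            $responses[] = [];      // status line opens a new response
        }
        if ($responses === []) {
            continue;               // ignore anything before the first status line
        }
        $responses[count($responses) - 1][] = $line;
    }
    return $responses;
}

/**
 * Collect the Location value of every 3xx response, in order.
 */
function redirect_targets(array $responses): array
{
    $targets = [];
    foreach ($responses as $response) {
        // The first line looks like "HTTP/1.1 301 Moved Permanently".
        if (!preg_match('#^HTTP/\S+\s+3\d\d\b#', $response[0])) {
            continue;
        }
        foreach ($response as $line) {
            if (stripos($line, 'Location:') === 0) {
                $targets[] = trim(substr($line, strlen('Location:')));
            }
        }
    }
    return $targets;
}

// Shortened sample of the "Actual result" above.
$sample = [
    'HTTP/1.1 301 Moved Permanently',
    'Location: http://www.finecars.cc/',
    'Connection: close',
    'HTTP/1.1 301 Moved Permanently',
    'Location: http://www.finecars.cc/',
    'Connection: close',
];

$targets = redirect_targets(split_responses($sample));

// The same target appearing more than once indicates a redirect loop.
$looping = count($targets) !== count(array_unique($targets));
var_dump($looping);     // bool(true)
```
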



 [2021-02-16 14:59 UTC] cmb@php.net
-Status: Open
+Status: Not a bug
-Assigned To:
+Assigned To: cmb
 [2021-02-16 14:59 UTC] cmb@php.net
Since you're explicitly setting the Host, finecars.cc is used for
all requests, resulting in the undesired behavior.  Just remove
that entry from the headers array.
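
The suggested change can be sketched as follows. This is only an illustration based on the test script: `build_header_lines` is a hypothetical helper, and the Referer and Origin entries are left out for brevity. With no Host line in the 'header' option, the http wrapper sets Host from the URL of each request, including the redirected one.

```php
<?php
declare(strict_types=1);

/**
 * Turn a name => value map into "Name: value" lines, dropping any Host
 * entry so the http wrapper can set Host from the URL of each request.
 */
function build_header_lines(array $headers): array
{
    $lines = [];
    foreach ($headers as $name => $value) {
        if (strcasecmp((string) $name, 'Host') === 0) {
            continue;   // let PHP generate Host, also after a redirect
        }
        $lines[] = "{$name}: {$value}";
    }
    return $lines;
}

$host = 'finecars.cc';

$header_lines = build_header_lines([
    'Host'           => $host,          // dropped by the helper
    'User-Agent'     => 'Anamera/2.0',
    'Accept-Charset' => 'UTF-8',
    'Connection'     => 'close',
]);

$http_context = stream_context_create([
    'http' => [
        'protocol_version' => 1.1,
        'timeout'          => 30,
        'follow_location'  => 1,
        'max_redirects'    => 5,
        'ignore_errors'    => true,
        'method'           => 'GET',
        'header'           => $header_lines,
    ],
]);

// The request itself is commented out so that the sketch runs offline:
// $file = file_get_contents("http://{$host}", false, $http_context);
```
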
 [2021-02-16 16:23 UTC] ASchmidt at Anamera dot net
Since this is not intuitive (and the default setting of follow_location = 1 effectively bars the use of the "Host:" header), I have submitted a comment to the manual page https://www.php.net/manual/en/context.http.php.

However, as far as the PHP behavior NOT being a bug, HTTP 1.1 made "Host" headers mandatory, and more than ONE Host header is explicitly disallowed (https://tools.ietf.org/html/rfc7230#section-5.4). Standards further explicitly require that "Host" headers are to be replaced, e.g., the original "Host" must NOT be forwarded (proxy servers are cited as the example).

While I understand/appreciate the "explanation" (work-around), I believe the old PHP behavior is both unexpected, and not consistent with the RFC:

- Since every HTTP 1.1 request must have exactly ONE, and VALID, "Host" header that matches the intended target host, and
- since PHP creates the secondary HTTP request to "follow-location",
- it would mean that PHP is responsible for replacing any previous, obsolete "Host" header, with the correct, valid Host header to match the "Location" response.

Otherwise the resulting "follow-location" request is not compliant with HTTP 1.1 standards.
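
The replacement argued for above could look roughly like this in userland. It is a sketch only and does not describe how PHP's http wrapper is implemented; `host_for_url` and `replace_host_header` are hypothetical names, and a relative Location value is simply left alone.

```php
<?php
declare(strict_types=1);

/**
 * Build the Host header value for an absolute URL: the host, plus the port
 * when the URL names one explicitly. Returns null if the URL has no host.
 */
function host_for_url(string $url): ?string
{
    $parts = parse_url($url);
    if (!is_array($parts) || !isset($parts['host'])) {
        return null;    // relative or malformed URL: no host to derive
    }
    return isset($parts['port'])
        ? "{$parts['host']}:{$parts['port']}"
        : $parts['host'];
}

/**
 * Return the header lines with every existing Host line removed and exactly
 * one Host line added that matches the redirect target.
 */
function replace_host_header(array $headerLines, string $location): array
{
    $host = host_for_url($location);
    if ($host === null) {
        return $headerLines;    // nothing to rewrite for a relative Location
    }
    $kept = array_filter(
        $headerLines,
        static function (string $line): bool {
            return stripos($line, 'Host:') !== 0;
        }
    );
    return array_merge(["Host: {$host}"], array_values($kept));
}

$lines = replace_host_header(
    ['Host: finecars.cc', 'User-Agent: Anamera/2.0'],
    'http://www.finecars.cc/'
);
print_r($lines);
```

For the redirect in this report, the resulting list holds a single `Host: www.finecars.cc` line followed by the unchanged User-Agent line.
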
 [2021-02-16 16:40 UTC] rtrtrtrtrt at dfdfdfdf dot dfd35
> it would mean that PHP is responsible for replacing 
> any previous, obsolete "Host" header, with the correct, 
> valid Host header to match the "Location" response

no, it's a programming language
garbage in, garbage out

there is no point to set a Host-Header and a Location-Header with different values and a programming language is expected to do what you say - even if it's wrong
 [2021-02-16 16:44 UTC] cmb@php.net
The documentation states[1]:

| Values in this option will override other values (such as
| User-agent:, Host:, and Authentication:).

In other words, as soon as you set any custom headers, it is your
responsibility that they are suitable.

[1] <https://www.php.net/manual/en/context.http.php#refsect1-context.http-options>
 [2021-02-16 17:00 UTC] ASchmidt at Anamera dot net
-Type: Bug
+Type: Documentation Problem
 [2021-02-16 17:00 UTC] ASchmidt at Anamera dot net
Dear "rtrtrtrtrt@dfdfdfdf.dfd35", thank you for your comment.

>> there is no point to set a Host-Header and a Location-Header with different values <<

There appears to be confusion on how HTTP works. The "Host" header is a REQUEST header, set by the application (the PHP script). The "Location" header is a RESPONSE header set by the contacted host.

The whole POINT of a server's "Location" response is to advise the client of a DIFFERENT URL (possibly involving a different "Host"!) that should be contacted. A Location response that differs from the originally requested URL is not "garbage in/out", but intended behavior!

A PHP script does not know in advance that it will receive a "Location" response. The net effect of this issue is that the "Host" header and the "follow_location" option are mutually exclusive, which should be explicitly stated.
 [2021-02-16 17:55 UTC] cmb@php.net
-Status: Not a bug
+Status: Re-Opened
 [2021-02-16 17:55 UTC] cmb@php.net
> Dear "rtrtrtrtrt@dfdfdfdf.dfd35", […]

Just ignore this troll, please.

> The net effect of this issue is that the "Host" header and the
> "follow_location" option are mutually exclusive, […]

Not necessarily.  Consider a redirect from http://example.com/old
to http://example.com/new.

Anyway, adding some info regarding this issue to the docs
certainly won't hurt.
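
To illustrate the distinction, here is a small sketch that tells a same-host redirect from a cross-host one. `redirect_changes_host` is a hypothetical helper; it assumes absolute URLs or path-only Location values, compares host names case-insensitively, and ignores ports.

```php
<?php
declare(strict_types=1);

/**
 * Return true when following $location would contact a different host
 * than $requestUrl, i.e. when a fixed custom Host header would be wrong.
 */
function redirect_changes_host(string $requestUrl, string $location): bool
{
    $from = parse_url($requestUrl, PHP_URL_HOST);
    $to   = parse_url($location, PHP_URL_HOST);
    if (!is_string($to)) {
        return false;   // path-only Location stays on the same host
    }
    return !is_string($from) || strcasecmp($from, $to) !== 0;
}

var_dump(redirect_changes_host('http://example.com/old', 'http://example.com/new'));  // bool(false)
var_dump(redirect_changes_host('http://finecars.cc', 'http://www.finecars.cc/'));     // bool(true)
```

Only in the first case does a Host header set by the script remain valid for the follow-up request.
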
 [2021-02-16 18:09 UTC] ASchmidt at Anamera dot net
>> Consider a redirect from http://example.com/old to http://example.com/new <<

Uh, please keep in mind that the PHP script would NEVER know in advance of any redirect, or, whether any such redirect would happen to only be to another path at the same host.

Consequently, the script author has to ASSUME it at least possible (if not likely) that the Location header might very well supply a different host -- even if only to add/remove the "www." portion, which today is standard practice! 

Ergo, since a change in host name (or subdomain) does have to be allowed for under every circumstance, the "Host" header and the "follow_location" option effectively ARE mutually exclusive, if "follow_location" doesn't supersede any originally supplied host header with the host from "Location" header.
 [2021-02-17 13:28 UTC] cmb@php.net
Automatic comment on behalf of cmbecker69@gmx.de
Revision: http://git.php.net/?p=doc/en.git;a=commit;h=4dfe5cc41eabcfb5b7eb7afa194ce078ad4e1bf4
Log: Fix #77889: URL in Location header not used
 [2021-02-17 13:28 UTC] cmb@php.net
-Status: Re-Opened
+Status: Closed
 [2021-02-18 08:59 UTC] mumumu@php.net
Automatic comment on behalf of mumumu@mumumu.org
Revision: http://git.php.net/?p=doc/ja.git;a=commit;h=0104234b746d694d6fd02646c85947f7df6cec84
Log: Fix #77889: URL in Location header not used
 
Last updated: Fri Dec 06 11:01:28 2024 UTC