Bug #64187 CGI/FastCGI truncates input to modulo 4GB
Submitted: 2013-02-11 01:57 UTC Modified: 2014-12-30 10:41 UTC
Votes: 22
Avg. Score: 4.5 ± 0.8
Reproduced: 20 of 21 (95.2%)
Same Version: 15 (75.0%)
Same OS: 16 (80.0%)
From: nachms+php at gmail dot com Assigned: mike (profile)
Status: No Feedback Package: Streams related
PHP Version: 5.6 OS: Linux
Private report: No CVE-ID: None
 [2013-02-11 01:57 UTC] nachms+php at gmail dot com
Description:
------------
I've tested sending huge amounts of data via PUT to PHP 5.3.x through 5.4.x using CGI, FastCGI, and mod_php for Apache, all on AMD64.

In all my tests, mod_php with Apache seems to be fine. However, via CGI or FastCGI, PHP can only see the amount of data modulo 4294967296, which seems to indicate that somewhere an int instead of a long is used in the CGI processing code, but so far I have been unable to find where exactly. All my builds are 64-bit, so that's not the issue.

To elaborate: via mod_php, if I send 4296015872 bytes via HTTP PUT, PHP will see all of them. However, via CGI or FastCGI, PHP will only see 1048576 bytes.

Test script:
---------------
<?php
print_r($_SERVER); //Print server variables so we can see Content-Length
 
$amount = 0;
$ifp = @fopen('php://input', 'rb');
if ($ifp)
{
  while (!feof($ifp))
  {
    set_time_limit(0);
    $buf = fread($ifp, 8192);
    if ($buf !== false) { $amount += strlen($buf); }
  }
  fclose($tfp);
  set_time_limit(0);
 
  echo 'Amount Read: ', $amount, "\n";
}

//Test via HTTP (Apache): http://paste.nachsoftware.com/Nach/TXmPR8289646bbb54bf40ce295115111acde1eYP
//Test via CGI (with FastCGI sharing results) http://paste.nachsoftware.com/Nach/MXNpV4ec2e499fc0941773aff51184e6e618d2lN

Expected result:
----------------
When tested with either of my C test programs, I expect to see 4296015872 listed as the amount read.

Actual result:
--------------
With mod_php in Apache, I see 4296015872, which is correct. But with CGI/FastCGI, I see 1048576 as the amount read, which is 4296015872 % 4294967296. This indicates that somewhere a long is being converted to an int within the CGI/FastCGI code.
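
A minimal, self-contained C sketch (not taken from the PHP sources) reproducing the arithmetic of that truncation:

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
	/* The amount of data actually sent in the test. */
	uint64_t content_length = 4296015872ULL;

	/* Narrowing to a 32-bit unsigned int keeps only the low 32 bits,
	 * i.e. the value modulo 4294967296. */
	uint32_t truncated = (uint32_t)content_length;

	printf("%" PRIu32 "\n", truncated); /* prints 1048576 */
	return 0;
}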

Patches

truncate_fix.diff (last revision 2013-02-11 10:04 UTC by nachms+php at gmail dot com)

History

 [2013-02-11 02:37 UTC] nachms+php at gmail dot com
I think I found the bug in cgi_main.c:

static int sapi_cgi_read_post(char *buffer, uint count_bytes TSRMLS_DC)
{
	uint read_bytes = 0;
	int tmp_read_bytes;

	count_bytes = MIN(count_bytes, (uint) SG(request_info).content_length - SG(read_post_bytes));
	while (read_bytes < count_bytes) {
		tmp_read_bytes = read(STDIN_FILENO, buffer + read_bytes, count_bytes - read_bytes);
		if (tmp_read_bytes <= 0) {
			break;
		}
		read_bytes += tmp_read_bytes;
	}
	return read_bytes;
}

It looks like content_length, a long, is being truncated to a uint. I'll look into fixing this based on what mod_php5 does.
 [2013-02-11 10:11 UTC] nachms+php at gmail dot com
In cgi_main.c, in sapi_cgi_read_post() and sapi_fcgi_read_post(), I found that if I comment out the line
count_bytes = MIN(count_bytes, (uint) SG(request_info).content_length - SG(read_post_bytes));
then the bug is fixed.

I'm not sure why the number of bytes to be read is bounded by the content length minus read_post_bytes. read() will only read as much as is available; it doesn't matter if you ask for too much, it will return what it can.

For FastCGI, I'm not familiar enough with fcgi_read() to know whether the lack of bounding causes a problem or not. However, mod_php5, as an example, doesn't bound it. And if bounding is needed, this code needs to make use of long or unsigned long instead of int and uint.
 [2013-02-11 19:27 UTC] nachms+php at gmail dot com
Due to the lack of comments, as mentioned above, I'm unsure what the problematic bound is needed for. However, thinking more about it, perhaps it's needed for pipelining or multiplexing, so that a read doesn't consume bytes belonging to the next request?

In that case, changing read_post_bytes in SAPI.h from int to long, and removing the uint cast from SG(request_info).content_length, may be the correct solution.
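
For illustration, here is a hedged sketch of what that might look like in sapi_cgi_read_post(). This is not the attached truncate_fix.diff; keeping the bound but doing the remaining-bytes arithmetic in 64 bits is just one way to get the same effect, and the exact types are assumptions:

static int sapi_cgi_read_post(char *buffer, uint count_bytes TSRMLS_DC)
{
	/* Sketch only: keep count_bytes as the capacity of `buffer`, but do
	 * the "how much input remains" arithmetic in 64 bits so that a
	 * Content-Length above 4GB is not truncated by a 32-bit cast.
	 * Assumes a 64-bit integer type (int64_t) is available. */
	uint read_bytes = 0;
	int tmp_read_bytes;
	int64_t remaining = (int64_t)SG(request_info).content_length
	                  - (int64_t)SG(read_post_bytes);

	if (remaining < 0) {
		remaining = 0;
	}
	if ((uint64_t)remaining < (uint64_t)count_bytes) {
		count_bytes = (uint)remaining;
	}
	while (read_bytes < count_bytes) {
		tmp_read_bytes = read(STDIN_FILENO, buffer + read_bytes, count_bytes - read_bytes);
		if (tmp_read_bytes <= 0) {
			break;
		}
		read_bytes += tmp_read_bytes;
	}
	return read_bytes;
}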
 [2013-02-19 02:27 UTC] payden at paydensutherland dot com
I may be way off here, but from what I can see in SAPI.c (for 5.4.11, line 266 is 
where the callback is invoked), count_bytes is the number of bytes that the 
sapi_module_struct->read_post callback can safely stuff in the buffer without 
overflowing its bounds.  I think ignoring count_bytes in the callback is probably 
a bad idea.  Just my two cents.  I'll be looking more into it and I'll post here 
if I come up with a solution.
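
To illustrate that contract (a purely hypothetical caller loop, not the actual SAPI.c code):

/* Hypothetical sketch of the caller's side of the contract: count_bytes
 * is the capacity of `buf`, so the callback must never write past it,
 * but it is free to return fewer bytes than requested (0 means no more
 * input). Removing the content-length clamp is different from ignoring
 * this capacity bound. */
char buf[8192];
int n;

do {
	n = sapi_module.read_post(buf, sizeof(buf) TSRMLS_CC);
	/* ...append the first n bytes of buf to the request body stream... */
} while (n > 0);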
 [2013-02-19 02:31 UTC] payden at paydensutherland dot com
Oh, I'm sorry.  I must have misread it before.  I see you're not ignoring 
count_bytes.  You're just taking out the MIN() on count_bytes, and remaining data 
to be read.  Let me keep my mouth shut until I come up with something intelligent 
to say.  :)
 [2013-02-19 04:32 UTC] nachms+php at gmail dot com
The problems in PHP are also a bit larger than I described here, although they should perhaps be filed as a separate bug.

32-bit OSs generally have "large file support" and can handle data much larger than 4GB. On most UNIXes, getconf can indicate the appropriate flags to enable such support. On Windows, large file support is always available.

Ideally, PHP should ensure such support is available and properly used. For starters, the Content-Length header is stored in a long. It should be stored in a type guaranteed to be 64 bits, and not depend on whether the system itself is 32- or 64-bit.

It is okay if the amount of data that can be read at once is limited to 32 bits, even on a 64-bit platform. But the overall size of files or input streams from pipes and sockets should not be.
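
As a standalone illustration of that last point (a sketch, not the existing PHP parsing code), the CGI CONTENT_LENGTH value could be parsed into a fixed-width 64-bit type regardless of the platform's word size:

#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	/* CONTENT_LENGTH is how CGI passes the Content-Length header. */
	const char *cl = getenv("CONTENT_LENGTH");
	int64_t content_length = 0;

	if (cl != NULL) {
		/* strtoll parses into a 64-bit value even on a 32-bit build,
		 * unlike atol/strtol, whose range depends on sizeof(long). */
		content_length = (int64_t)strtoll(cl, NULL, 10);
		if (content_length < 0) {
			content_length = 0; /* treat garbage/negative values as empty */
		}
	}
	printf("content length: %" PRId64 "\n", content_length);
	return 0;
}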
 [2013-02-19 05:20 UTC] payden at paydensutherland dot com
Hey,

I did a little testing and have some findings to share.  I believe your fix 
works perfectly fine with php-fpm and it does not in fact need to be bounded for 
fcgi_read to work correctly.  I wanted to duplicate your initial test over fpm 
and see what happened.  With the bounding in place, I got some weird results 
with fpm.  The PHP script stopped reading and finished executing at exactly 
2147483647 bytes.  (signed 32-bit int max)  When I commented the MIN() out and 
rebuilt, the script read the entirety of the 4296015872 bytes I sent it and 
reported reading that amount.  I used the same PHP code you used for the test 
and a hacked together C FCGI client.  I am using a 32-bit build of PHP.  I don't 
know if any of this information is useful for you, but I was bored and would 
kind of like to start watching bugs and getting involved a little bit.  Let me 
know if I'm going about it the wrong way!
 [2013-02-19 05:34 UTC] nachms+php at gmail dot com
payden, thanks for the info. It's nice to know that the fix works properly with FPM builds as well, and even on 32-bit!

I wouldn't mind testing that out myself. Can you post your C FCGI client? Thanks.

Allowing others to easily test and report if they can reproduce the problem or not in other cases may help the PHP developers too (or not, no idea how important the votes and statistics are).
 [2013-02-20 03:31 UTC] payden at paydensutherland dot com
Hey nachms,

I played around a bit more and it seems that commenting out the count_bytes line does in 
fact break things in FPM.  I don't know why I didn't notice it the other day.  
It seems that FPM doesn't clean things up properly or terminate the connection 
even though it does recv all the bytes sent across the wire.  It behaves fine 
during the send and will output FCGI_STDOUT records while it's still reading 
FCGI_STDIN and I had the PHP script report the amount of bytes read after each 
call to fread().  It does in fact read all the bytes, but as I said, FPM does 
not close the connection after it sends FCGI_END_REQUEST.  I also notice that it 
left some junk in the reserved bytes of the FCGI_END_REQUEST body which is 
definitely broken behavior.  Anywho, I might see if I can come up with a fix 
sometime.  I'm sure it's not high priority for the PHP folks.  It's not often 
one sends 4GB+ of data over FastCGI to PHP.  Now, no judging me on this because 
it is definitely hacky and thrown together and buggy, but I'll throw the C I've 
been using to test this up on my server: http://paydensutherland.com/php-64187.c
Maybe you can do something useful with it.  The FCGI_PARAMS are hard 
coded, so you'll need to point the DOCUMENT_ROOT and SCRIPT_FILENAME and such to 
actual paths on your system.  Cheers.
 [2013-02-20 03:44 UTC] payden at paydensutherland dot com
Also, I didn't mention, my program hangs waiting for FPM to close the connection 
with the count_bytes line commented out.  With the count_bytes line, this is my 
output:

FCGI_STDOUT:
======
X-Powered-By: PHP/5.4.11
Content-type: text/html

Array
(
    [USER] => www-data
    [HOME] => /var/www
    [FCGI_ROLE] => RESPONDER
    [QUERY_STRING] => test
    [REQUEST_METHOD] => PUT
    [CONTENT_TYPE] => 
    [CONTENT_LENGTH] => 4296015872
    [SCRIPT_NAME] => /index.php
    [REQUEST_URI] => /
    [DOCUMENT_URI] => /index.php
    [DOCUMENT_ROOT] => /home/payden
    [SERVER_PROTOCOL] => HTTP/1.1
    [GATEWAY_INTERFACE] => CGI/1.1
    [SCRIPT_FILENAME] => /home/payden/index.php
    [SERVER_NAME] => test.localhost.net
    [HTTP_HOST] => test.localhost.net
    [REMOTE_ADDR] => 127.0.0.1
    [REMOTE_PORT] => 12312
    [SERVER_ADDR] => 127.0.0.1
    [SERVER_PORT] => 80
    [PHP_SELF] => /index.php
    [REQUEST_TIME_FLOAT] => 1361331663.522
    [REQUEST_TIME] => 1361331663
    [argv] => Array
        (
            [0] => test
        )

    [argc] => 1
)
Amount Read: 2147483647

======
FCGI_END_REQUEST
Connection closed.
Actually sent: 2155288000

So, it finishes cleanly with stock 5.4.11, but the number of bytes actually sent 
before FPM closes the connection varies, and the PHP side of things always 
reports the 32-bit signed int max as the amount read.
 [2013-02-20 03:47 UTC] nachms+php at gmail dot com
Hi payden,

Thanks for the info and your test script. I'll try to run some tests myself when I get a chance. No worries about hardcoded data, you'll see the C files I provided hardcode data as well.

Regarding the breakage when that line is commented out, did you test to ensure things are not broken with the original, un-commented PHP code?

Also, the PHP I provided actually contains a typo, fclose($tfp); should be fclose($ifp);
Perhaps this is why things are being left open?

As for the 4GB limitation, indeed 99% of users would not care. However, I created a file-sharing website using PHP and ran into this ridiculous limitation. For the time being, I'm using a slight variation of my patch here on my Apache+FastCGI+PHP5-CGI site. I have yet to play with the PHP5-FPM SAPI, but hope to, thanks to your script.
 [2014-07-13 01:08 UTC] yohgaki@php.net
-Status: Open +Status: Feedback
 [2014-07-13 01:08 UTC] yohgaki@php.net
PHP 5.5 and up should be able to handle large input. Do you still have this issue?
 [2014-07-13 04:55 UTC] nachms+php at gmail dot com
-Status: Feedback +Status: Open
 [2014-07-13 04:55 UTC] nachms+php at gmail dot com
I just tested with 5.5.12, getting the same results as I did with 5.3.x and 5.4.x, so yes, the bug has not been fixed.

I noticed my pastes with test code expired. So I quickly whipped up some more:
HTTP: http://paste.nachsoftware.com/Nach/LnpHS6e57f952e3f56df4fcc753c210e8401a7vM
CGI: http://paste.nachsoftware.com/Nach/vCDhzd84835a1b4439c96ad0353c2111051a09zj
 [2014-07-13 09:53 UTC] yohgaki@php.net
-Status: Open +Status: Feedback
 [2014-07-13 09:53 UTC] yohgaki@php.net
Sorry for the typo. Could you try 5.6?
 [2014-07-13 11:02 UTC] nachms+php at gmail dot com
-Status: Feedback +Status: Open
 [2014-07-13 11:02 UTC] nachms+php at gmail dot com
I don't have any version of 5.6 handy. It seems official downloads are supposed to be at http://qa.php.net/, but for some reason, that site isn't loading for me. Is there an alternate location to download 5.6 from?
 [2014-07-13 11:09 UTC] nachms+php at gmail dot com
Okay, I found a binary at: http://ftp.us.debian.org/debian/pool/main/p/php5/php5-cgi_5.6.0~rc2+dfsg-3_amd64.deb

I'm now getting different results than 5.3, 5.4, and 5.5, but it still fails.

Amount Read: 1374904320 (Should be Amount Read: 4296015872)
Was only able to send 1374978048 bytes. (Should not get this error)
 [2014-07-15 22:01 UTC] yohgaki@php.net
-Status: Open +Status: Feedback
 [2014-07-15 22:01 UTC] yohgaki@php.net
Starting from 5.6, PHP handles large 'php://input' and makes 'php://input' reusable. This change will not be backported to older versions.

If this is fixed in 5.6, this bug should be closed.
Do you still have the problem in 5.6 or later?
 [2014-07-15 22:06 UTC] nachms+php at gmail dot com
-Status: Feedback +Status: Open
 [2014-07-15 22:06 UTC] nachms+php at gmail dot com
Yes, as I said in my previous post, it fails in php5-cgi_5.6.0~rc2+dfsg-3_amd64.deb, which as far as I know is "5.6 or later".

Amount Read: 1374904320 (Should be Amount Read: 4296015872)
Was only able to send 1374978048 bytes. (Should not get this error)
 [2014-07-16 07:23 UTC] mike@php.net
-PHP Version: 5.4.11 +PHP Version: 5.6 -Assigned To: +Assigned To: mike
 [2014-07-16 19:57 UTC] mike@php.net
-Status: Assigned +Status: Feedback
 [2014-07-16 19:57 UTC] mike@php.net
Please try with a source build.
I've no problems POSTing or PUTting files of size 10G.

█ mike@smugmug:~/tmp$ l -hl 10G
-rw-r--r-- 1 mike users 9.8G  1. Jul 21:44 10G
█ mike@smugmug:~/tmp$ cat /srv/http/put.php
<?php

$tmp = tempnam("/var/tmp", "put");
var_dump(file_put_contents($tmp, fopen("php://input","r")), filesize($tmp), unlink($tmp));

█ mike@smugmug:~/build/php-5.6-dbg$ ./sapi/cgi/php-cgi -b 0:9999 -d post_max_size=99G -d upload_max_filesize=99G

█ mike@smugmug:~/tmp$ curl -v --upload-file ~/tmp/10G http://localhost:88/put.php
* Hostname was NOT found in DNS cache
*   Trying ::1...
* connect to ::1 port 88 failed: Connection refused
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 88 (#0)
> PUT /put.php HTTP/1.1
> User-Agent: curl/7.37.0
> Host: localhost:88
> Accept: */*
> Content-Length: 10485760000
> Expect: 100-continue
> 
< HTTP/1.1 100 Continue
* We are completely uploaded and fine


< HTTP/1.1 200 OK
* Server nginx/1.6.0 is not blacklisted
< Server: nginx/1.6.0
< Date: Wed, 16 Jul 2014 19:47:33 GMT
< Content-Type: text/html; charset=UTF-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< X-Powered-By: PHP/5.6.0-dev
< 
int(10485760000)
int(10485760000)
bool(true)
* Connection #0 to host localhost left intact
 [2014-07-16 21:20 UTC] nachms+php at gmail dot com
-Status: Feedback +Status: Assigned
 [2014-07-16 21:20 UTC] nachms+php at gmail dot com
I played more with 5.6, finding it odd that we're being limited to ~1.7GB (which seems to have no mathematical significance), whereas in PHP 5.3-5.5, we were seeing the upload size capped to Content-Length % 4GB.

Then I noticed that's how much free space is available in /tmp on our test server.

In PHP < 5.6, uploads via PUT were streamed, so the PHP script could save the data wherever it wanted as it was being uploaded, or stream it into a database or to a remote server, or just run some formulas on it without storing any of the raw PUT data.

In PHP 5.6, it seems the truncation bug has been corrected, but in the process the data is no longer streamable, making it difficult to deal effectively with huge files and introducing issues into past cases that worked (because they were under 4GB).

Is there a way to tell PHP 5.6 not to buffer/save uploads via PUT? PUT, unlike POST, doesn't need PHP to parse the data into the various superglobals.
 [2014-07-17 06:26 UTC] mike@php.net
Thank you for testing!

Yes, the php://input stream is creating a temporary file in INI(upload_tmp_dir).
This cannot be disabled ATM. 

If you set INI(enable_post_data_reading) to "Off", it will defer creation until you actually read it, i.e. it will write to the temp file as you read from the stream.

If INI(enable_post_data_reading) is "Off" and php://input is not used, the input data will be discarded at the end of the request.
 [2014-07-17 11:19 UTC] nachms+php at gmail dot com
> Yes, the php://input stream is creating a temporary file in INI(upload_tmp_dir).
> This cannot be disabled ATM.

Based on some superficial testing, it appears that only POST obeys that setting, not PUT, but we'll do more testing to be certain.

It may also make sense to have two separate variables to control PUT and POST.

> If you set INI(enable_post_data_reading) to "Off", it will defer creation until you actually read it, i.e. it will write to the temp file as you read from the stream.

We need to do more testing here too, but after talking to our security team about this change to PHP, they seem to think there's a good chance PHP 5.6 can be DoS'd by PUTting a huge amount of data to any script, at least when automatic file creation is enabled.

Hopefully, we'll have more testing done later today.
 [2014-07-17 13:18 UTC] mike@php.net
Sorry for the confusion. INI(enable_post_data_reading) only affects POSTs, not PUTs. Anything non-POST is *not* automatically read into the temp stream.
 [2014-07-28 18:15 UTC] mike@php.net
-Status: Assigned +Status: Feedback
 [2014-07-28 18:15 UTC] mike@php.net
Anything to add?
 [2014-12-30 10:41 UTC] php-bugs at lists dot php dot net
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Re-Opened". Thank you.
 