Bug #73262 8192 CHUNK_SIZE and PHP_SOCK_CHUNK_SIZE makes copy() slow
Submitted: 2016-10-07 01:05 UTC Modified: 2019-10-30 12:12 UTC
Votes:1
Avg. Score:3.0 ± 0.0
Reproduced:0 of 0 (0.0%)
From: jeffreydwalter at gmail dot com Assigned: nikic (profile)
Status: Closed Package: Performance problem
PHP Version: Irrelevant OS: Irrelevant
Private report: No CVE-ID: None
 
 [2016-10-07 01:05 UTC] jeffreydwalter at gmail dot com
Description:
------------
***I have not tested this outside of FreeBSD 10.3, but it should be an issue on most if not all platforms***

copy() is several orders of magnitude slower than exec('cp ...') on the same partition. It is even worse copying across partitions.

After digging through the source code, I discovered that it's because of the sheer number of buffered write() calls that are made on the underlying socket in _php_stream_write_buffer().

The buffer's size is hardcoded to 8192 in main/streams/php_streams_int.h (https://github.com/php/php-src/blob/PHP-5.6.20/main/streams/php_streams_int.h#L49) and in main/php_network.h (https://github.com/php/php-src/blob/PHP-5.6.20/main/php_network.h#L210).

If I change CHUNK_SIZE and PHP_SOCK_CHUNK_SIZE to 8192*24, the php copy() function will perform at a comparable speed to exec('cp ...').
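
For a rough sense of scale, the number of chunk-sized writes can be estimated as below. This is only a sketch: writes_needed() is a hypothetical helper, and it assumes one write call per chunk, which may not match what the stream layer does on every platform.

```php
<?php
// Hypothetical helper (not part of PHP): how many chunk-sized writes are
// needed to move $fileBytes, assuming exactly one write call per chunk.
function writes_needed(int $fileBytes, int $chunkBytes): int
{
    return (int) ceil($fileBytes / $chunkBytes);
}

// For the 250MB test file used in this report:
//   writes_needed(250 * 1024 * 1024, 8192)      gives 32000
//   writes_needed(250 * 1024 * 1024, 8192 * 24) gives 1334
```

Under that assumption, the larger chunk size cuts the number of writes by a factor of about 24.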

I don't fully understand the ramifications of changing that buffer outside of greater memory usage, and I haven't tested it on network-related stream operations. So if there's a better way I'd love to know.

I do realize that I could create a stream context in php, set the buffer size, and pass that context to copy(), but that doesn't seem like a workable solution.
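
A userland workaround along these lines can be sketched with stream_set_chunk_size(), which applies to an already opened stream rather than to a context. copy_with_chunk_size() is a hypothetical helper, not part of PHP; the sketch assumes PHP 7 or later, and whether it actually speeds anything up likely depends on the PHP version and platform.

```php
<?php
// Hypothetical helper (not part of PHP): copy a file through two explicitly
// opened streams after raising their chunk size. Returns the bytes copied.
function copy_with_chunk_size(string $src, string $dst, int $chunkSize = 8192 * 24): int
{
    $in = fopen($src, 'rb');
    if ($in === false) {
        throw new RuntimeException("Cannot open $src for reading");
    }
    $out = fopen($dst, 'wb');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot open $dst for writing");
    }

    // Raise the per-stream chunk size on both ends of the copy.
    stream_set_chunk_size($in, $chunkSize);
    stream_set_chunk_size($out, $chunkSize);

    $copied = stream_copy_to_stream($in, $out);

    fclose($in);
    fclose($out);

    if ($copied === false) {
        throw new RuntimeException("Copy from $src to $dst failed");
    }
    return $copied;
}
```

It would be called as copy_with_chunk_size('dummy.txt', 'copied_with_php_copy.txt') in place of the copy() call in the test script.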

Perhaps a better solution would be to make those CHUNK_SIZE defines tunable at compile time so they can be optimized for one's specific platform?

Test script:
---------------
<?php

# Change this if you want to test with a different file size.
define('FILE_SIZE_IN_MB', 250);

printf('Generating %sMB dummy file.'.PHP_EOL, FILE_SIZE_IN_MB);
echo shell_exec('dd if=/dev/urandom of=dummy.txt bs=1048576 count='.FILE_SIZE_IN_MB.' 2>&1 | grep -v records');

$copy_start = time();
copy('dummy.txt', 'copied_with_php_copy.txt');
$copy_done = time();

$cp_start = time();
exec('cp dummy.txt copied_with_cp.txt');
$cp_done = time();

echo 'copy() took: '.($copy_done-$copy_start).'s'.PHP_EOL;
echo 'cp took: '.($cp_done-$cp_start).'s'.PHP_EOL;

echo shell_exec('ls -al dummy.txt copied_with_php_copy.txt copied_with_cp.txt').PHP_EOL;

echo 'Removing test files.'.PHP_EOL;
exec('rm -rf dummy.txt copied_with_php_copy.txt copied_with_cp.txt');

Expected result:
----------------
php_stream_write took 0 seconds 796 milliseconds (I added this line in stream.c)
php_stream_write() with mmap used (I added this line in stream.c)

php output:
Generating 250MB dummy file.
copy() took: 1s
cp took: 1s
-rw-r--r--  1   262144000 Oct  6 20:00 copied_with_cp.txt
-rw-r--r--  1   262144000 Oct  6 20:00 copied_with_php_copy.txt
-rw-r--r--  1   262144000 Oct  6 19:59 dummy.txt

Removing test files.


Actual result:
--------------
php_stream_write took 5 seconds 562 milliseconds (I added this line in stream.c)
php_stream_write() with mmap used (I added this line in stream.c)

php output:
php blah.php 
Generating 250MB dummy file.
copy() took: 13s
cp took: 0s
-rw-r--r--  1   262144000 Oct  6 20:00 copied_with_cp.txt
-rw-r--r--  1   262144000 Oct  6 20:00 copied_with_php_copy.txt
-rw-r--r--  1   262144000 Oct  6 19:59 dummy.txt

Removing test files.

Patches

streams.c.5.6.20.unfinished.patch (last revision 2016-10-12 23:26 UTC by jeffreydwalter at gmail dot com)

History

 [2016-10-07 01:13 UTC] jeffreydwalter at gmail dot com
After further testing, it seems that setting PHP_SOCK_CHUNK_SIZE = 8192*24 is enough to speed up copy(), which makes sense I suppose, since it's using a socket to do the writes.
 [2016-10-07 01:15 UTC] jeffreydwalter at gmail dot com
I also forgot to mention that I tested this in PHP 5.6.20 on FreeBSD 10.3, but it appears to be an issue across all versions of PHP, including 7.x.
 [2016-10-09 19:43 UTC] ab@php.net
Thanks for the report. You seem to be on a machine with quite slow I/O. While it is true that increasing the buffer will bring a speedup, in some cases it will come at the cost of wasted memory. Also, there are likely situations where PHP is even faster. Have you checked various file sizes? Of course, it is very platform and hardware dependent.

It could indeed be an idea to set the buffer size through the context at runtime. Another option could be to extend the buffer automatically. Hardcoding the buffer size at compilation time is IMHO not that sensible. Different streams will possibly need different handling. For example, for a stream that only handles a few bytes of data, a 192k buffer like yours is certainly overkill. So a balanced solution could be challenging.

Thanks.
 [2016-10-12 22:09 UTC] jeffreydwalter at gmail dot com
Thanks for the reply. I'm running FreeBSD 10.3 on a VM that's lacking resources, so that's probably why I/O is slow. I have run my tests with many different file sizes, both with the default buffer size and with the enlarged one. With very small files, both copy() and exec('cp ...') were too fast to tell anything useful. But copy() is definitely several orders of magnitude slower with larger files. As file size grows, the disparity does too.
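
Since time() only has one-second resolution, small files all show up as 0s. A sub-second measurement could be sketched as below; time_call() is a hypothetical helper, not part of PHP, and the sketch assumes PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of PHP): run a callable and return the
// elapsed wall-clock time in seconds as a float.
function time_call(callable $fn): float
{
    $start = microtime(true);
    $fn();
    return microtime(true) - $start;
}

// Possible use in the test script:
//   $elapsed = time_call(function () {
//       copy('dummy.txt', 'copied_with_php_copy.txt');
//   });
//   printf('copy() took: %.3fs' . PHP_EOL, $elapsed);
```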

If we could come up with a sane way of dynamically scaling the buffer, that would be cool. One simplistic way to do that would be to step up the buffer in size as the number of writes passes certain thresholds. That is to say, as we write x amount of data. This is a naive, but simple approach that would keep the buffer small for smaller streams of data and large for larger streams of data.

I did a little testing with this approach, and write speeds seem relatively comparable for files below 10MB (on this system). Once I get above 10MB, I start to see an increasing disparity between the times for copy() and exec('cp ...').

I'm attaching a patch that raises stream->chunk_size to 8192*24 once 10MB has been read. This gives relatively comparable performance between copy() and exec('cp ...'). This patch does NOT contain the whole solution, as I believe several other places would still need to be modified.
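
The stepping idea can be illustrated in userland PHP as below. This is not the attached C patch, only a sketch of the same threshold logic: the constant names and both functions are hypothetical, the 10MB threshold and 8192*24 size are taken from the description above, and PHP 7 or later is assumed.

```php
<?php
// All names below are hypothetical, chosen for illustration only.
const SMALL_CHUNK = 8192;                 // default chunk size from the report
const LARGE_CHUNK = 8192 * 24;            // enlarged chunk size from the report
const SCALE_THRESHOLD = 10 * 1024 * 1024; // switch after 10MB has been copied

// Pick the chunk size based on how much data has been copied so far.
function chunk_size_for(int $bytesSoFar): int
{
    return $bytesSoFar >= SCALE_THRESHOLD ? LARGE_CHUNK : SMALL_CHUNK;
}

// Copy $src to $dst, stepping up the read size once the threshold is passed.
function copy_with_scaling_chunks(string $src, string $dst): int
{
    $in = fopen($src, 'rb');
    if ($in === false) {
        throw new RuntimeException("Cannot open $src for reading");
    }
    $out = fopen($dst, 'wb');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot open $dst for writing");
    }

    $total = 0;
    while (true) {
        $data = fread($in, chunk_size_for($total));
        if ($data === false || $data === '') {
            break; // end of file or read error
        }
        // fwrite() may write fewer bytes than requested, so loop until done.
        $length = strlen($data);
        $written = 0;
        while ($written < $length) {
            $n = fwrite($out, substr($data, $written));
            if ($n === false || $n === 0) {
                fclose($in);
                fclose($out);
                throw new RuntimeException("Write to $dst failed");
            }
            $written += $n;
        }
        $total += $length;
    }

    fclose($in);
    fclose($out);
    return $total;
}
```

Small streams never leave the 8192-byte size, so the extra memory is only spent on transfers that have already proven to be large.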


I've updated the test program to run multiple tests on a range of file sizes. Here's a sample of the test program output:

php blah.php (UNMODIFIED streams.c):

Generating 1MB dummy file.
1048576 bytes transferred in 0.017906 secs (58560197 bytes/sec)
copy() took: 0s
cp took: 0s
-rw-r--r--  1   1048576 Oct 12 16:58 copied_with_cp.txt
-rw-r--r--  1   1048576 Oct 12 16:58 copied_with_php_copy.txt
-rw-r--r--  1   1048576 Oct 12 16:58 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 2MB dummy file.
2097152 bytes transferred in 0.034607 secs (60599186 bytes/sec)
copy() took: 0s
cp took: 0s
-rw-r--r--  1   2097152 Oct 12 16:58 copied_with_cp.txt
-rw-r--r--  1   2097152 Oct 12 16:58 copied_with_php_copy.txt
-rw-r--r--  1   2097152 Oct 12 16:58 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 5MB dummy file.
5242880 bytes transferred in 0.084276 secs (62210860 bytes/sec)
copy() took: 0s
cp took: 0s
-rw-r--r--  1   5242880 Oct 12 16:58 copied_with_cp.txt
-rw-r--r--  1   5242880 Oct 12 16:58 copied_with_php_copy.txt
-rw-r--r--  1   5242880 Oct 12 16:58 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 10MB dummy file.
10485760 bytes transferred in 0.163251 secs (64230938 bytes/sec)
copy() took: 0s
cp took: 0s
-rw-r--r--  1   10485760 Oct 12 16:58 copied_with_cp.txt
-rw-r--r--  1   10485760 Oct 12 16:58 copied_with_php_copy.txt
-rw-r--r--  1   10485760 Oct 12 16:58 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 15MB dummy file.
15728640 bytes transferred in 0.247797 secs (63473889 bytes/sec)
copy() took: 1s
cp took: 0s
-rw-r--r--  1   15728640 Oct 12 16:58 copied_with_cp.txt
-rw-r--r--  1   15728640 Oct 12 16:58 copied_with_php_copy.txt
-rw-r--r--  1   15728640 Oct 12 16:58 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 19MB dummy file.
19922944 bytes transferred in 0.308225 secs (64637632 bytes/sec)
copy() took: 0s
cp took: 0s
-rw-r--r--  1   19922944 Oct 12 16:58 copied_with_cp.txt
-rw-r--r--  1   19922944 Oct 12 16:58 copied_with_php_copy.txt
-rw-r--r--  1   19922944 Oct 12 16:58 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 20MB dummy file.
20971520 bytes transferred in 0.324176 secs (64691758 bytes/sec)
copy() took: 0s
cp took: 0s
-rw-r--r--  1   20971520 Oct 12 16:58 copied_with_cp.txt
-rw-r--r--  1   20971520 Oct 12 16:58 copied_with_php_copy.txt
-rw-r--r--  1   20971520 Oct 12 16:58 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 50MB dummy file.
52428800 bytes transferred in 0.832661 secs (62965367 bytes/sec)
copy() took: 1s
cp took: 0s
-rw-r--r--  1   52428800 Oct 12 16:58 copied_with_cp.txt
-rw-r--r--  1   52428800 Oct 12 16:58 copied_with_php_copy.txt
-rw-r--r--  1   52428800 Oct 12 16:58 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 100MB dummy file.
104857600 bytes transferred in 1.651604 secs (63488345 bytes/sec)
copy() took: 2s
cp took: 0s
-rw-r--r--  1   104857600 Oct 12 16:58 copied_with_cp.txt
-rw-r--r--  1   104857600 Oct 12 16:58 copied_with_php_copy.txt
-rw-r--r--  1   104857600 Oct 12 16:58 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 250MB dummy file.
262144000 bytes transferred in 4.271219 secs (61374516 bytes/sec)
copy() took: 7s
cp took: 1s
-rw-r--r--  1   262144000 Oct 12 16:59 copied_with_cp.txt
-rw-r--r--  1   262144000 Oct 12 16:58 copied_with_php_copy.txt
-rw-r--r--  1   262144000 Oct 12 16:58 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 350MB dummy file.
367001600 bytes transferred in 5.770454 secs (63600127 bytes/sec)
copy() took: 10s
cp took: 0s
-rw-r--r--  1   367001600 Oct 12 16:59 copied_with_cp.txt
-rw-r--r--  1   367001600 Oct 12 16:59 copied_with_php_copy.txt
-rw-r--r--  1   367001600 Oct 12 16:59 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 400MB dummy file.
419430400 bytes transferred in 7.365670 secs (56943956 bytes/sec)
copy() took: 11s
cp took: 1s
-rw-r--r--  1   419430400 Oct 12 16:59 copied_with_cp.txt
-rw-r--r--  1   419430400 Oct 12 16:59 copied_with_php_copy.txt
-rw-r--r--  1   419430400 Oct 12 16:59 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 500MB dummy file.
524288000 bytes transferred in 8.256745 secs (63498145 bytes/sec)
copy() took: 14s
cp took: 1s
-rw-r--r--  1   524288000 Oct 12 16:59 copied_with_cp.txt
-rw-r--r--  1   524288000 Oct 12 16:59 copied_with_php_copy.txt
-rw-r--r--  1   524288000 Oct 12 16:59 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 1000MB dummy file.
1048576000 bytes transferred in 25.389519 secs (41299561 bytes/sec)
copy() took: 18s
cp took: 2s
-rw-r--r--  1   1048576000 Oct 12 17:00 copied_with_cp.txt
-rw-r--r--  1   1048576000 Oct 12 17:00 copied_with_php_copy.txt
-rw-r--r--  1   1048576000 Oct 12 17:00 dummy.txt

Removing test files.
----------------------------------------------------------------------



sapi/cli/php blah.php (MODIFIED streams.c): 

Generating 1MB dummy file.
1048576 bytes transferred in 0.018799 secs (55778089 bytes/sec)
copy() took: 0s
cp took: 0s
-rw-r--r--  1   1048576 Oct 12 16:57 copied_with_cp.txt
-rw-r--r--  1   1048576 Oct 12 16:57 copied_with_php_copy.txt
-rw-r--r--  1   1048576 Oct 12 16:57 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 2MB dummy file.
2097152 bytes transferred in 0.039626 secs (52923792 bytes/sec)
copy() took: 0s
cp took: 0s
-rw-r--r--  1   2097152 Oct 12 16:57 copied_with_cp.txt
-rw-r--r--  1   2097152 Oct 12 16:57 copied_with_php_copy.txt
-rw-r--r--  1   2097152 Oct 12 16:57 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 5MB dummy file.
5242880 bytes transferred in 0.088615 secs (59164582 bytes/sec)
copy() took: 0s
cp took: 0s
-rw-r--r--  1   5242880 Oct 12 16:57 copied_with_cp.txt
-rw-r--r--  1   5242880 Oct 12 16:57 copied_with_php_copy.txt
-rw-r--r--  1   5242880 Oct 12 16:57 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 10MB dummy file.
10485760 bytes transferred in 0.165203 secs (63471935 bytes/sec)
copy() took: 0s
cp took: 0s
-rw-r--r--  1   10485760 Oct 12 16:57 copied_with_cp.txt
-rw-r--r--  1   10485760 Oct 12 16:57 copied_with_php_copy.txt
-rw-r--r--  1   10485760 Oct 12 16:57 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 15MB dummy file.
15728640 bytes transferred in 0.253879 secs (61953334 bytes/sec)
copy() took: 1s
cp took: 0s
-rw-r--r--  1   15728640 Oct 12 16:57 copied_with_cp.txt
-rw-r--r--  1   15728640 Oct 12 16:57 copied_with_php_copy.txt
-rw-r--r--  1   15728640 Oct 12 16:57 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 19MB dummy file.
19922944 bytes transferred in 0.316122 secs (63022949 bytes/sec)
copy() took: 0s
cp took: 0s
-rw-r--r--  1   19922944 Oct 12 16:57 copied_with_cp.txt
-rw-r--r--  1   19922944 Oct 12 16:57 copied_with_php_copy.txt
-rw-r--r--  1   19922944 Oct 12 16:57 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 20MB dummy file.
20971520 bytes transferred in 0.326075 secs (64315004 bytes/sec)
copy() took: 1s
cp took: 0s
-rw-r--r--  1   20971520 Oct 12 16:57 copied_with_cp.txt
-rw-r--r--  1   20971520 Oct 12 16:57 copied_with_php_copy.txt
-rw-r--r--  1   20971520 Oct 12 16:57 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 50MB dummy file.
52428800 bytes transferred in 0.813434 secs (64453671 bytes/sec)
copy() took: 1s
cp took: 0s
-rw-r--r--  1   52428800 Oct 12 16:57 copied_with_cp.txt
-rw-r--r--  1   52428800 Oct 12 16:57 copied_with_php_copy.txt
-rw-r--r--  1   52428800 Oct 12 16:57 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 100MB dummy file.
104857600 bytes transferred in 1.612340 secs (65034423 bytes/sec)
copy() took: 1s
cp took: 0s
-rw-r--r--  1   104857600 Oct 12 16:57 copied_with_cp.txt
-rw-r--r--  1   104857600 Oct 12 16:57 copied_with_php_copy.txt
-rw-r--r--  1   104857600 Oct 12 16:57 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 250MB dummy file.
262144000 bytes transferred in 4.026581 secs (65103371 bytes/sec)
copy() took: 1s
cp took: 0s
-rw-r--r--  1   262144000 Oct 12 16:57 copied_with_cp.txt
-rw-r--r--  1   262144000 Oct 12 16:57 copied_with_php_copy.txt
-rw-r--r--  1   262144000 Oct 12 16:57 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 350MB dummy file.
367001600 bytes transferred in 5.656869 secs (64877162 bytes/sec)
copy() took: 1s
cp took: 1s
-rw-r--r--  1   367001600 Oct 12 16:57 copied_with_cp.txt
-rw-r--r--  1   367001600 Oct 12 16:57 copied_with_php_copy.txt
-rw-r--r--  1   367001600 Oct 12 16:57 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 400MB dummy file.
419430400 bytes transferred in 6.443685 secs (65091700 bytes/sec)
copy() took: 2s
cp took: 0s
-rw-r--r--  1   419430400 Oct 12 16:57 copied_with_cp.txt
-rw-r--r--  1   419430400 Oct 12 16:57 copied_with_php_copy.txt
-rw-r--r--  1   419430400 Oct 12 16:57 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 500MB dummy file.
524288000 bytes transferred in 8.896159 secs (58934199 bytes/sec)
copy() took: 2s
cp took: 1s
-rw-r--r--  1   524288000 Oct 12 16:58 copied_with_cp.txt
-rw-r--r--  1   524288000 Oct 12 16:58 copied_with_php_copy.txt
-rw-r--r--  1   524288000 Oct 12 16:58 dummy.txt

Removing test files.
----------------------------------------------------------------------
Generating 1000MB dummy file.
1048576000 bytes transferred in 16.196600 secs (64740501 bytes/sec)
copy() took: 6s
cp took: 3s
-rw-r--r--  1   1048576000 Oct 12 16:58 copied_with_cp.txt
-rw-r--r--  1   1048576000 Oct 12 16:58 copied_with_php_copy.txt
-rw-r--r--  1   1048576000 Oct 12 16:58 dummy.txt

Removing test files.
----------------------------------------------------------------------
 [2016-10-12 23:30 UTC] jeffreydwalter at gmail dot com
Here's a slightly better version of my original test script:

<?php

define('FILE_SIZE_IN_MB', 100);
foreach([1,2,5,10,15, 19, 20,50,100,250,350,400,500,1000] as $file_size) {

    printf('Generating %sMB dummy file.'.PHP_EOL, $file_size);
    if(!file_exists("dummy.$file_size.txt")) {
        echo shell_exec('dd if=/dev/urandom of=dummy.'.$file_size.'.txt bs=1048576 count='.$file_size.' 2>&1 | grep -v records');
    }

    $copy_start = time();
    copy("dummy.$file_size.txt", "copied_with_php_copy.$file_size.txt");
    $copy_done = time();

    $cp_start = time();
    exec("cp dummy.$file_size.txt copied_with_cp.$file_size.txt");
    $cp_done = time();

    echo 'copy() took: '.($copy_done-$copy_start).'s'.PHP_EOL;
    echo 'cp took: '.($cp_done-$cp_start).'s'.PHP_EOL;

    echo 'Removing test files.'.PHP_EOL;
    if(isset($argv[1]) and $argv[1] == 'clean') {
        exec('rm -rf dummy.*.txt ');
    }
    exec('rm -rf copied_with_php_copy.*.txt copied_with_cp.*.txt');
    echo '----------------------------------------------------------------------'.PHP_EOL;
}

echo shell_exec('ls -al dummy.*.txt copied_with_php_copy.*.txt copied_with_cp.*.txt').PHP_EOL;
 [2019-10-30 12:12 UTC] nikic@php.net
-Status: Open
+Status: Closed
-Assigned To:
+Assigned To: nikic
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Fri Apr 19 21:01:30 2024 UTC