Bug #78094 File Search Problem Excessive Time
Submitted: 2019-05-31 17:09 UTC Modified: 2019-06-03 16:36 UTC
From: dpfender44 at gmail dot com Assigned: ab (profile)
Status: Closed Package: *Directory/Filesystem functions
PHP Version: 7.4Git-2019-06-02 (snap) OS: Windows 10
Private report: No CVE-ID: None
 [2019-05-31 17:09 UTC] dpfender44 at gmail dot com
Description:
------------
My function scan_dir() uses opendir(), chdir(), readdir(), is_dir(), closedir() to scan for files on the server.
PHP version 7.4.0-dev does the same scan as version 7.3.6 but takes about 100 times longer.
Both scans were run on the same server on 5/31/19; the only change was the PHP version.

The extra time seems to be spent doing readdir().

Test script:
---------------
http://djpsrc.djpnet.dyndns.org/test_scan_dir.php.txt

Expected result:
----------------
PHP version 7.3.6

Operating System Microsoft Windows [Version 10.0.17763.503]

File search types: .mp4|.webm|.mp3|.wav|.mid|.txt

Units of seconds (Unix timestamp from microtime()):
Start file search time (1559247589.134)
End file search time (1559247589.1699)
Elapsed time (0.035879135131836)

Resulting file match count (777)


Actual result:
--------------
PHP version 7.4.0-dev

Operating System Microsoft Windows [Version 10.0.17763.503]

File search types: .mp4|.webm|.mp3|.wav|.mid|.txt

Units of seconds (Unix timestamp from microtime()):
Start file search time (1559320102.0205)
End file search time (1559320105.3702)
Elapsed time (3.3497362136841)

Resulting file match count (777)


 [2019-06-01 11:14 UTC] ab@php.net
-Status: Open +Status: Feedback
 [2019-06-01 11:14 UTC] ab@php.net
Thanks for the report. Could you please reduce the repro snippet to the lines that are strictly required to reproduce this issue on Windows?

Thanks.
 [2019-06-01 15:00 UTC] dpfender44 at gmail dot com
-Status: Feedback +Status: Open
 [2019-06-01 15:00 UTC] dpfender44 at gmail dot com
Nearly all of the filesystem functions take longer in PHP 7.4.0-dev than in PHP 7.3.6, and is_dir() is by far the worst.

Elapsed times are in seconds, for a recursive file search over several directories.
Each total is the sum of the microtime() differences taken before and after every call to that function.
Total count of files returned is 777.

PHP 7.3.6

getcwd total	0.00004625
is_dir total	0.02332211
opendir total	0.00398731
readdir total	0.00258327
chdir total	0.00327039
closedir total	0.00072503
Total Elapsed	0.03843498

PHP 7.4.0-dev

getcwd total	0.00007987
is_dir total	3.32146049
opendir total	0.00476623
readdir total	0.00765395
chdir total	0.00488377
closedir total	0.00070310
Total Elapsed	3.35602283
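
For readers who want to collect per-function totals like the ones above, here is a minimal sketch of one way to do it. It is not the reporter's script: the helper names timed() and timed_scan() are hypothetical, and the scan uses full paths instead of chdir(), so the numbers will not match the tables exactly.

```php
<?php
// Hypothetical helper: call $fn with $args, add the elapsed wall-clock
// time to $totals[$label], and return the function's result.
function timed(array &$totals, string $label, callable $fn, ...$args)
{
    $start = microtime(true);
    $result = $fn(...$args);
    $totals[$label] = ($totals[$label] ?? 0.0) + (microtime(true) - $start);
    return $result;
}

// Recursively count non-directory entries below $dir, timing every
// filesystem call. Note: is_dir() follows symlinks, so a tree with
// symlink cycles would need extra handling.
function timed_scan(string $dir, array &$totals): int
{
    $count = 0;
    $handle = timed($totals, 'opendir', 'opendir', $dir);
    if ($handle === false) {
        return 0;
    }
    while (($entry = timed($totals, 'readdir', 'readdir', $handle)) !== false) {
        if ($entry === '.' || $entry === '..') {
            continue;
        }
        $path = $dir . DIRECTORY_SEPARATOR . $entry;
        if (timed($totals, 'is_dir', 'is_dir', $path)) {
            $count += timed_scan($path, $totals);
        } else {
            $count++;
        }
    }
    timed($totals, 'closedir', 'closedir', $handle);
    return $count;
}

// Scan the directory given on the command line, or this script's directory.
$root = $argv[1] ?? __DIR__;
$totals = [];
$count = timed_scan($root, $totals);
foreach ($totals as $label => $seconds) {
    printf("%s total\t%.8f\n", $label, $seconds);
}
printf("Files found\t%d\n", $count);
```

Running the same script under both PHP versions against the same directory should show which call dominates the total.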
 [2019-06-02 11:51 UTC] cmb@php.net
-Status: Open +Status: Verified -PHP Version: Next Major Version +PHP Version: 7.4Git-2019-06-02 (snap)
 [2019-06-02 11:51 UTC] cmb@php.net
Thanks for reporting this issue.  I can confirm a severe
performance regression with the following script:

    <?php
    $start = hrtime(true);
    $count = scandirrec('D:/git/php/php-src/ext');
    $end = hrtime(true);
    printf("%d files found in %f sec\n", $count, ($end-$start)/1e9);

    function scandirrec($dir)
    {
        $count = 0;
        $owd = getcwd();
        chdir($dir);
        $files = scandir('.');
        foreach ($files as $file) {
            if ($file[0] === '.') continue;
            if (is_dir($file)) {
                $count += scandirrec("$dir/$file");
            } else {
                $count++;
            }
        }
        chdir($owd);
        return $count;
    }
    ?>

Running with php-7.3.6-nts-Win32-VC15-x64, I get something like:

    14105 files found in 0.245927 sec

but with php-7.4-nts-windows-vs16-x64-r7a64150 and
php-7.4-nts-windows-vc15-x64-re2f8d90 (latest VC 15 snap),
something like:

    14105 files found in 1.028066 sec

(all three PHP versions running without php.ini)
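
Since a single measurement can be skewed by file-system caching, it may help to repeat the run a few times and compare the first run with the best one. The following is only a sketch: time_runs() is a hypothetical helper, and the workload (is_dir() on every entry of the current directory) is a stand-in for the script above.

```php
<?php
// Hypothetical helper: run $fn $runs times and return each run's
// duration in seconds, measured with the monotonic hrtime() clock.
function time_runs(callable $fn, int $runs = 5): array
{
    $durations = [];
    for ($i = 0; $i < $runs; $i++) {
        $start = hrtime(true);
        $fn();
        $durations[] = (hrtime(true) - $start) / 1e9;
    }
    return $durations;
}

// Example workload: call is_dir() on every entry of the current directory.
$workload = function (): void {
    clearstatcache(); // drop PHP's own stat cache so every run asks the OS
    foreach (scandir('.') ?: [] as $entry) {
        is_dir($entry);
    }
};

$durations = time_runs($workload, 5);
printf("first run: %f sec\n", $durations[0]);
printf("best run:  %f sec\n", min($durations));
```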
 [2019-06-03 16:36 UTC] cmb@php.net
-Assigned To: +Assigned To: ab
 [2019-06-03 16:36 UTC] cmb@php.net
Further findings:

* the timings above have been measured after multiple preparation
runs; the first run with PHP 7.3 usually took a few seconds; the
first run with PHP 7.4 often took about three minutes

* the performance regression has been introduced with commit
e42e8b1[1]

* this commit introduced a GetFileInformationByHandle() call,
which requires retrieving a file handle with CreateFile(); this
is likely to take longer than the former GetFileAttributesEx()
call, but delivers more interesting information (such as ino and
nlink)

* this commit also introduced a call to GetBinaryType(), and it
seems that this call slows down the whole stat'ing considerably;
after removing the call the performance appears to always be
roughly the same as with PHP 7.3 (well, still a bit slower, but
not by orders of magnitude)

Anatol, what do you think about replacing the GetBinaryType() call
with the old file extension check?

[1] <http://git.php.net/?p=php-src.git;a=commit;h=e42e8b1051a8abeaa8e6053653a4ff43438766e2>
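
To illustrate the idea of a file extension check, here is a userland sketch. The real check is part of PHP's C code for stat'ing on Windows and is not reproduced here; the function name has_executable_extension() is hypothetical, and the extension list is an assumption.

```php
<?php
// Hypothetical userland illustration; the real check is implemented in C
// inside PHP's Windows stat code. The extension list is an assumption.
function has_executable_extension(string $path): bool
{
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return in_array($ext, ['exe', 'com', 'bat', 'cmd'], true);
}

var_dump(has_executable_extension('C:/tools/php.exe'));   // bool(true)
var_dump(has_executable_extension('C:/docs/readme.txt')); // bool(false)
```

A check like this only inspects the path string, so it needs no file handle and no extra system call; the trade-off is that it can misclassify files whose extension does not match their content.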
 [2019-06-06 13:57 UTC] cmb@php.net
Automatic comment on behalf of cmbecker69@gmx.de
Revision: http://git.php.net/?p=php-src.git;a=commit;h=f5b44c7e8a55978b5e3c3511b310ab8f09beaa97
Log: Fix bug #78094: File Search Problem Excessive Time
 [2019-06-06 13:57 UTC] cmb@php.net
-Status: Verified +Status: Closed
 
PHP Copyright © 2001-2021 The PHP Group
All rights reserved.
Last updated: Thu Jan 21 16:01:23 2021 UTC