Bug #36365 scandir duplicates file name at every 65535th file
Submitted: 2006-02-11 03:17 UTC Modified: 2020-04-23 09:38 UTC
Votes:4
Avg. Score:4.2 ± 0.8
Reproduced:2 of 2 (100.0%)
Same Version:0 (0.0%)
Same OS:1 (50.0%)
From: pilhoon at gmail dot com Assigned: cmb (profile)
Status: Closed Package: Directory function related
PHP Version: * OS: Windows only
Private report: No CVE-ID: None
 [2006-02-11 03:17 UTC] pilhoon at gmail dot com
Description:
------------
scandir duplicates file name at every 65535th file

Reproduce code:
---------------
I made over 260000 files in a folder.
Their names are 'f100001' ... 'f264001'.

$file_names = scandir('files/');

$base_names= array();
for($i=100001; $i<=263064; $i++)
{
	$base_names["f".$i] = 0;
}

foreach($file_names as $a_name)
{
	if(isset($base_names[$a_name]) && 1 == $base_names[$a_name])
		echo $a_name."\n";
	else
		$base_names[$a_name] = 1;
}

Expected result:
----------------
Each file name should be listed only once.

Actual result:
--------------
scandir(THAT_FOLDER) returns a large array, but
f165534 and f231069 each appear twice.



 [2006-02-11 13:21 UTC] sniper@php.net
Please try using this CVS snapshot:

  http://snaps.php.net/php5.1-latest.tar.gz
 
For Windows:
 
  http://snaps.php.net/win32/php5.1-win32-latest.zip


 [2006-02-19 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
 [2010-01-07 18:02 UTC] enzo at smshome dot net
Same problem here using PHP Version 5.2.5 (x64) with Windows XP 64bit Professional Edition.

File names are duplicated every 65535 files read using scandir(). I think you should put a warning on the manual page:
http://php.net/manual/en/function.scandir.php
 [2010-01-07 18:26 UTC] pajoye@php.net
Please try using this snapshot:

  http://snaps.php.net/php5.3-latest.tar.gz
 
For Windows:

  http://windows.php.net/snapshots/


 [2010-01-15 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
 [2012-09-08 03:24 UTC] gobie at centrum dot cz
Same problem reproduced
PHP Version 5.4.4 Windows 7 x64
PHP Version 5.3.13 Windows 7 x32

However, all PHP functions that list files are affected: scandir, glob, and DirectoryIterator.

Reproduce code:
---------------
// Settings
$dir = './test/';
$totalFiles = 1e5;

// Create empty files
!is_dir($dir) && mkdir($dir);
for ($i = 0; $i < $totalFiles; ++$i) {
    $filename = $dir . str_pad($i, 6, '0', STR_PAD_LEFT);
    touch($filename);
}

// Glob
$files = glob($dir . '*');
echo 'glob: ' . count($files) . '/' . $totalFiles . PHP_EOL;

// Scandir
$files = scandir($dir);
echo 'scandir: ' . (count($files) - 2) . '/' . $totalFiles . PHP_EOL; // . and ..

// DirectoryIterator
$it = new DirectoryIterator($dir);
echo 'DirectoryIterator: ' . (iterator_count($it) - 2) . '/' . $totalFiles . PHP_EOL; // . and ..
unset($it);

Expected result:
----------------
glob: 100000/100000
scandir: 100000/100000
DirectoryIterator: 100000/100000

Actual result:
--------------
glob: 100001/100000
scandir: 100001/100000
DirectoryIterator: 100001/100000
 [2012-09-08 09:24 UTC] pajoye@php.net
-Status: No Feedback +Status: Assigned -PHP Version: 5.1.2 +PHP Version: * -Assigned To: +Assigned To: pajoye
 [2015-06-07 23:51 UTC] cmb@php.net
The problem is in readdir_r()[1]. If dp->offset != 0,
FindNextFile() is called; then dp->offset is incremented. Finally,
the directory entry is composed and "returned". However,
dp->offset is a short int[2], so it eventually wraps around to 0,
and once for every 65535 entries FindNextFile() is not called, but
a directory entry is added nonetheless. Furthermore, dp->dent.d_off
is not correct for high values.

[1] <https://github.com/php/php-src/blob/php-5.6.9/win32/readdir.c#L92-L118>
[2] <https://github.com/php/php-src/blob/php-5.6.9/win32/readdir.h#L35>
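
The wrap-around described above can be modelled in isolation. The following is a minimal sketch, not the php-src code: simulate_readdir() and its parameters are hypothetical names, the advance step only stands in for FindNextFile(), and a 16-bit unsigned counter is assumed so that the wrap to 0 is well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified model of the read loop described above. ASSUMPTION: this is not
 * the php-src code; simulate_readdir() and its parameters are hypothetical.
 *
 * real_files  number of entries the OS would report (including "." and "..")
 * wide        0 = 16-bit offset counter (buggy), 1 = 32-bit counter
 * first_dup   if non-NULL, receives the index of the first entry returned
 *             twice, or (size_t)-1 if no entry was duplicated
 *
 * Returns the number of entries handed back to the caller.
 */
size_t simulate_readdir(size_t real_files, int wide, size_t *first_dup)
{
    uint16_t narrow_offset = 0;
    uint32_t wide_offset = 0;
    size_t cursor = 0;    /* entry the find handle currently points at */
    size_t returned = 0;

    assert(real_files < UINT32_MAX); /* keep the 32-bit counter from wrapping */
    if (first_dup != NULL)
        *first_dup = (size_t)-1;
    if (real_files == 0)
        return 0;

    for (;;) {
        int at_start = wide ? (wide_offset == 0) : (narrow_offset == 0);

        if (!at_start) {
            /* Stands in for FindNextFile(): advance, or stop at the end. */
            if (cursor + 1 >= real_files)
                break;
            cursor++;
        } else if (returned != 0 && first_dup != NULL
                   && *first_dup == (size_t)-1) {
            /* Counter wrapped to 0: the current entry is returned again. */
            *first_dup = cursor;
        }

        narrow_offset++;  /* wraps to 0 after 65536 increments */
        wide_offset++;
        returned++;
    }
    return returned;
}
```

Called with 100002 entries (100000 files plus "." and ".."), the 16-bit model returns 100003 entries and reports index 65535 as the first duplicate, while the 32-bit model returns 100002. If "." and ".." come first and the files are enumerated in name order (an assumption), index 65535 corresponds to f165534 in the original report.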
 [2015-07-23 05:59 UTC] cbader92 at gmail dot com
Can reproduce bug with PHP 5.5.12 (cli) (built: Apr 30 2014 11:20:58) in Windows 7 Home Premium x64, build 7601.

I thought I was going crazy. My 100,000 files generated an array with 100,001 indexes. Took forever to find out why my script was trying to copy more files than there were in the directory.
 [2015-07-23 12:59 UTC] cmb@php.net
-Status: Assigned +Status: Analyzed
 [2015-07-23 12:59 UTC] cmb@php.net
I've submitted a corresponding PR.
 [2015-07-23 22:22 UTC] cmb@php.net
-Operating System: windows +Operating System: Windows only
 [2015-07-27 23:07 UTC] cmb@php.net
Automatic comment on behalf of cmb
Revision: http://git.php.net/?p=php-src.git;a=commit;h=4e8f01cb6e75e09ad7ffa5201c42d1bea2a61264
Log: Fix #36365: scandir duplicates file name at every 65535th file
 [2015-07-27 23:07 UTC] cmb@php.net
-Status: Analyzed +Status: Closed
 [2015-08-04 20:54 UTC] ab@php.net
Automatic comment on behalf of cmb
Revision: http://git.php.net/?p=php-src.git;a=commit;h=4e8f01cb6e75e09ad7ffa5201c42d1bea2a61264
Log: Fix #36365: scandir duplicates file name at every 65535th file
 [2016-07-20 11:37 UTC] davey@php.net
Automatic comment on behalf of cmb
Revision: http://git.php.net/?p=php-src.git;a=commit;h=4e8f01cb6e75e09ad7ffa5201c42d1bea2a61264
Log: Fix #36365: scandir duplicates file name at every 65535th file
 [2019-10-15 07:54 UTC] cmb@php.net
-Status: Closed +Status: Re-Opened -Assigned To: pajoye +Assigned To: cmb
 [2019-10-15 07:54 UTC] cmb@php.net
Re-opening, since this issue has been re-introduced by commit
758af77[1] affecting PHP 7.2 and up.

[1] <http://git.php.net/?p=php-src.git;a=commit;h=758af77e9d1c3c6e5aea365bc0d35c385278ad5a>
 [2020-04-23 08:51 UTC] michael dot vorisek at email dot cz
Any progress on this and can this be added to tests when run on Windows?
 [2020-04-23 09:38 UTC] cmb@php.net
Unfortunately, this cannot be fixed in a PHP revision, because it
requires changing the definition of struct DIR_W32[1], which would
constitute an ABI break.

I'll check whether a test would be viable (it'll likely be very
slow).

[1] <https://github.com/php/php-src/blob/php-7.4.5/win32/readdir.h#L27-L34>
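
To illustrate why changing a struct definition can break the ABI, here is a rough sketch using two hypothetical stand-in structs. These are not the real struct DIR_W32; the member names, types, and the idea that the change would widen the counter are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: hypothetical stand-ins, not the real struct DIR_W32. */
struct dir_narrow {
    void    *handle;
    uint16_t offset;     /* 16-bit counter */
    uint8_t  finished;
    char     name[260];
};

struct dir_wide {
    void    *handle;
    uint32_t offset;     /* widened counter */
    uint8_t  finished;
    char     name[260];
};

/* How far the member after `offset` moves when the counter is widened. */
size_t finished_member_shift(void)
{
    size_t before = offsetof(struct dir_narrow, finished);
    size_t after  = offsetof(struct dir_wide, finished);

    assert(after >= before);
    return after - before;
}
```

On common platforms finished_member_shift() yields 2, meaning every member after the counter moves. Binaries compiled against the old header would then read those members from the wrong location, which is the kind of incompatibility an ABI break refers to.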
 [2020-04-23 15:08 UTC] cmb@php.net
The following pull request has been associated:

Patch Name: Fix #36365: scandir duplicates file name at every 65535th file
On GitHub:  https://github.com/php/php-src/pull/5439
Patch:      https://github.com/php/php-src/pull/5439.patch
 [2020-04-24 07:48 UTC] cmb@php.net
Automatic comment on behalf of cmbecker69@gmx.de
Revision: http://git.php.net/?p=php-src.git;a=commit;h=767a77ac19af1192aa8b674d62f75b08abb199d6
Log: Fix #36365: scandir duplicates file name at every 65535th file
 [2020-04-24 07:48 UTC] cmb@php.net
-Status: Re-Opened +Status: Closed
 