Bug #79633 readdir skips entries
Submitted: 2020-05-26 02:03 UTC
Modified: 2020-06-07 04:22 UTC
From: keith at ksmith dot com
Assigned: cmb
Status: No Feedback
Package: Directory function related
PHP Version: Irrelevant
OS: Ubuntu 18.04, Others
Private report: No
CVE-ID: None

 [2020-05-26 02:03 UTC] keith at ksmith dot com
Description:
------------
root@nas-2:/volume1/admin/bin# mkdir /volume1/media/testfolder/0
root@nas-2:/volume1/admin/bin# mkdir /volume1/media/testfolder/1
root@nas-2:/volume1/admin/bin# mkdir /volume1/media/testfolder/2
root@nas-2:/volume1/admin/bin# mkdir /volume1/media/testfolder/3
root@nas-2:/volume1/admin/bin# mkdir /volume1/media/testfolder/4
root@nas-2:/volume1/admin/bin# mkdir /volume1/media/testfolder/5
root@nas-2:/volume1/admin/bin# ls /volume1/media/testfolder
0  1  2  3  4  5
root@nas-2:/volume1/admin/bin# php PHP-test.php 
Array
(
    [0] => ..
    [1] => 1
    [2] => 3
    [3] => 4
    [4] => 5
)
See Code below

Test script:
---------------
#!/usr/bin/php
<?php
$cur_folder = "/volume1/media/testfolder";
$dirp = opendir($cur_folder);
while($entry = readdir($dirp)) {
    $entry_list[] = $entry;
}
sort($entry_list);
print_r($entry_list);
?>


Expected result:
----------------
I would expect the output to also include the entries "." and "2".


History

 [2020-05-26 02:11 UTC] keith at ksmith dot com
This issue occurs both on local filesystems and over NFS.  Tested with the various versions of PHP 5 and 7 available with Debian, and on a Synology NAS.  Extending the folder names to the character range [0-9,A-Z] leaves out a number of other single-letter folders as well.

The source indicates that the function calls php_stream_readdir, which in turn calls php_stream_read, rather than the C library readdir function.  I will likely hack up some code to use the C library directly; I'm not sure why that call is so indirect/obfuscated.
 [2020-05-26 05:54 UTC] keith at ksmith dot com
This has something to do with using the "stream" wrapper around the call.  Digging in the code, I find that "opendir(3)" is eventually (about 5 layers down) called, or at least appears to be, via a #define VCWD_OPENDIR(path) opendir(path).  I'm not familiar with all the indirection in the calls in between, or with the requisite #if's for various OSes, versions and such.  I'm guessing there is a small size or length typo, or some oddity around buffering.

A test sequence ...

-- Create a folder with 5 numbered directories
-- it's on a small XFS filesystem, plenty of free space
[root@vbox-2] /home/keith<319>mkdir -p /foo/bar
[root@vbox-2] /home/keith<320>df -h /foo/bar
Filesystem      Size  Used Avail Use% Mounted on
/dev/md0         56G   34G   23G  60% /
[root@vbox-2] /home/keith<321>cat /proc/mounts | grep md0
/dev/md0 / xfs rw,relatime,attr2,inode64,noquota 0 0
[root@vbox-2] /home/keith<322>mkdir /foo/bar/0 /foo/bar/1 /foo/bar/2 /foo/bar/3 /foo/bar/4 /foo/bar/5
[root@vbox-2] /home/keith<323>ls /foo/bar
0  1  2  3  4  5

-- PHP script/readdir finds no folders (but they are there)
[root@vbox-2] /home/keith<324>/nas-3/admin/bin/PHP-DIR-test.php /foo/bar
Array
(
    [0] => .
    [1] => ..
)
[root@vbox-2] /home/keith<325>ls -l /foo/bar
total 0
drwxr-xr-x 2 root root 6 May 25 22:40 0
drwxr-xr-x 2 root root 6 May 25 22:40 1
drwxr-xr-x 2 root root 6 May 25 22:40 2
drwxr-xr-x 2 root root 6 May 25 22:40 3
drwxr-xr-x 2 root root 6 May 25 22:40 4
drwxr-xr-x 2 root root 6 May 25 22:40 5
[root@vbox-2] /home/keith<326>/nas-3/admin/bin/PHP-DIR-test.php /foo/bar
Array
(
    [0] => .
    [1] => ..
)

-- If we rename the VERY FIRST directory in the folder, magic happens:
[root@vbox-2] /home/keith<327>mv /foo/bar/0 /foo/bar/00
[root@vbox-2] /home/keith<328>/nas-3/admin/bin/PHP-DIR-test.php /foo/bar
Array
(
    [0] => .
    [1] => ..
    [2] => 00
    [3] => 1
    [4] => 2
    [5] => 3
    [6] => 4
    [7] => 5
)

-- And if we put it back it gets weird again
[root@vbox-2] /home/keith<329>mv /foo/bar/00 /foo/bar/0
[root@vbox-2] /home/keith<330>/nas-3/admin/bin/PHP-DIR-test.php /foo/bar
Array
(
    [0] => .
    [1] => ..
    [2] => 1
    [3] => 2
    [4] => 3
    [5] => 4
    [6] => 5
)

-- The script and PHP version.  
[root@vbox-2] /home/keith<331>cat /nas-3/admin/bin/PHP-DIR-test.php 
#!/usr/bin/php
<?php
if ( $argc > 1 ) {
	$cur_folder = $argv[1];
} else {
	$cur_folder = "/volume1/media/testfolder";
}
$dirp = opendir($cur_folder);
while($entry = readdir($dirp)) {
	clearstatcache();
	$path = sprintf("%s/%s",$cur_folder,$entry);
	if(is_dir($path)) {
		$entry_list[] = $entry;
	}
}
sort($entry_list);
print_r($entry_list);
?>
[root@vbox-2] /home/keith<332>php --version
PHP 7.3.14-1~deb10u1 (cli) (built: Feb 16 2020 15:07:23) ( NTS )
Copyright (c) 1997-2018 The PHP Group
Zend Engine v3.3.14, Copyright (c) 1998-2018 Zend Technologies
    with Zend OPcache v7.3.14-1~deb10u1, Copyright (c) 1999-2018, by Zend Technologies
 [2020-05-26 06:45 UTC] cmb@php.net
-Status: Open
+Status: Feedback
-Assigned To:
+Assigned To: cmb
 [2020-05-26 06:45 UTC] cmb@php.net
When readdir() returns "0", the loop is terminated, because the string "0" evaluates to FALSE in a boolean context.  Use a strict comparison (!== FALSE) to check that the function does not return FALSE.  Would that solve the issue?
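
A minimal sketch of that strict check, for reference. The helper name list_entries and the temporary directory are made up for illustration and are not part of the original test script; only the loop condition differs from it.

```php
<?php
// Hypothetical helper (not from the report): collect every entry name and
// stop only when readdir() returns boolean false.
function list_entries(string $dir): array
{
    $entries = [];
    $handle = opendir($dir);
    if ($handle === false) {
        return $entries;
    }
    // A directory named "0" is a falsy string, so compare strictly to false.
    while (($entry = readdir($handle)) !== false) {
        $entries[] = $entry;
    }
    closedir($handle);
    sort($entries, SORT_STRING);
    return $entries;
}

// Demo on a throwaway directory (the path is an assumption for this sketch).
$base = sys_get_temp_dir() . '/' . uniqid('readdir_demo_');
mkdir($base);
foreach (['0', '1', '2'] as $name) {
    mkdir($base . '/' . $name);
}

$found = list_entries($base);
print_r($found);

// Remove the demo directories again.
foreach (['0', '1', '2'] as $name) {
    rmdir($base . '/' . $name);
}
rmdir($base);
```

With the strict comparison, the entry "0" no longer ends the loop, so the entries that the filesystem happens to return after it are collected as well.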
 [2020-05-28 03:09 UTC] keith at ksmith dot com
I think you're on to something, but 0/ZERO never shows up.  Since the entries are missing this makes sense, but it was worth a check.

<snip>
for($first_entry = 0;;) {
    $entry = readdir($dirp);
    if($entry === false) {
        printf("Entry is FALSE\n");
        break;
    }
    if($entry === 0) {
        printf("Entry is ZERO(0)\n");
        continue;
    }
....

This never triggers the === 0 branch.
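
For reference, a small sketch of why a branch like that would not be expected to fire: readdir() returns entry names as strings (or boolean false), so a directory named 0 comes back as the string "0", never the integer 0. The variable below simply stands in for such a return value; it is an illustration, not output captured from the machines above.

```php
<?php
// Stand-in for what readdir() returns for a directory named 0.
$entry = "0";

var_dump($entry === 0);     // bool(false): a string is never identical to an integer
var_dump($entry === false); // bool(false): so a strict check does not end the loop
var_dump((bool) $entry);    // bool(false): but the string "0" is falsy
var_dump($entry == false);  // bool(true): which is what while ($entry = readdir(...)) trips over
```

In other words, the loose truthiness test in the while condition treats "0" as the end of the listing, while both strict comparisons let it through.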

So, if I'm reading the source correctly: the code uses "opendir(3)" to grab the DIRP handle.  It then reads the returned handle as a "stream" directly, in blocks of sizeof(struct dirent), and returns the buffer from the read.  I'm old, and this brings back some frightening memories of tricky C coding, writing into directory entries to recover things from abandoned inodes, but I digress :).  While this should work, my guess is that the "stream" read is somehow not working as expected.  This is definitely an edge case somewhere; I just can't pinpoint the edge.  There is a trap to use the opendir call if the stream resource is a "file", but after that I don't see where it saves that anywhere in the resource structure, so that one could leverage it at "read" time to force the call out to readdir(3) rather than just reading the "stream".  I think the "stream" is actually a file descriptor, so it could have something to do with blocking as well.  It might be worth looking at the libc code to see how it handles it.

It only seems to occur with entries whose names are a single character.  I didn't try it with files, just directories; I will try with files as time permits.  It occurs on more than one platform, with more than one filesystem type (EXT4, NFS, and XFS as tested), but most of my boxes are Debian, other than the Synology.  As time permits I will fire up some other machines and run some other tests while I peruse the source (php7.3-7.3.14 from Deb 10.3).  Also, if you "dink around" and shuffle the folders around, you can make the problem go away, which is a workaround, sort of.
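
As a possible alternative to shuffling folders, a sketch that avoids the readdir() loop altogether by using scandir(), which returns the whole listing as an array (or false on failure), so no per-entry truthiness check is involved. The helper name list_subdirs and the temporary directory are assumptions made for this example, and it has not been run on the machines described above.

```php
<?php
// Hypothetical helper (not from the report): list sub-directories via scandir().
function list_subdirs(string $dir): array
{
    $names = scandir($dir);            // sorted ascending by default
    if ($names === false) {
        return [];
    }
    $subdirs = [];
    foreach ($names as $name) {
        if ($name === '.' || $name === '..') {
            continue;                  // skip the self and parent entries
        }
        if (is_dir($dir . '/' . $name)) {
            $subdirs[] = $name;
        }
    }
    return $subdirs;
}

// Demo on a throwaway directory (the path is an assumption for this sketch).
$base = sys_get_temp_dir() . '/' . uniqid('scandir_demo_');
mkdir($base);
foreach (['0', '1', '2'] as $name) {
    mkdir($base . '/' . $name);
}

$subdirs = list_subdirs($base);
print_r($subdirs);

// Remove the demo directories again.
foreach (['0', '1', '2'] as $name) {
    rmdir($base . '/' . $name);
}
rmdir($base);
```

Because the names are only ever compared with === here, a directory called 0 is handled like any other entry.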
 [2020-06-07 04:22 UTC] php-bugs at lists dot php dot net
No feedback was provided. The bug is being suspended because
we assume that you are no longer experiencing the problem.
If this is not the case and you are able to provide the
information that was requested earlier, please do so and
change the status of the bug back to "Re-Opened". Thank you.
 
Last updated: Tue Nov 24 07:01:24 2020 UTC