Bug #64699 is_dir() is inaccurate result on Windows with japanese locale.
Submitted: 2013-04-23 15:48 UTC
Modified: 2016-08-08 10:02 UTC
Votes:4
Avg. Score:4.0 ± 1.0
Reproduced:3 of 3 (100.0%)
Same Version:3 (100.0%)
Same OS:3 (100.0%)
From: sharkpp at gmail dot com
Assigned: ab
Status: Closed
Package: Filesystem function related
PHP Version: 5.4.14
OS: Windows
Private report: No
CVE-ID: None
 [2013-04-23 15:48 UTC] sharkpp at gmail dot com
Description:
------------
Environment

I'm testing this problem on Windows 7 Ultimate x86 (English).

A small configuration change is required to reproduce it.

1. Please open "Control Panel".
2. Please click "Change display language" link.
3. Please select "Administrative" tab and click "Change system locale..." 
button.
4. Please change the current system locale to "Japanese (Japan)".

The above procedure is not needed if you are testing on a Japanese version of 
Windows.

PHP is in its default state immediately after installation, and no php.ini 
exists.

Problem

is_dir() returns the wrong result for a folder whose name contains the byte "\x5C".
In Shift_JIS, \x5C can appear as the second byte of a double-byte character.
For example ソ (\x83\x5C).
More examples: 
https://ja.wikipedia.org/wiki/Shift_JIS (japanese)
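
To make the affected names easier to spot, here is a minimal sketch that scans a Shift_JIS byte string for 0x5C in the second-byte position. The function name `hasBackslashTrailByte` is hypothetical, and the lead-byte ranges used (0x81-0x9F and 0xE0-0xFC) are the commonly documented Shift_JIS ranges, not something taken from PHP's source. It does not reproduce the bug; it only detects names of the kind described above.

```php
<?php
// Returns true when a Shift_JIS byte string contains 0x5C ("\")
// as the second byte of a double-byte character.
// Assumption: lead bytes are 0x81-0x9F and 0xE0-0xFC.
function hasBackslashTrailByte(string $bytes): bool
{
    $len = strlen($bytes);
    for ($i = 0; $i < $len; $i++) {
        $b = ord($bytes[$i]);
        $isLead = ($b >= 0x81 && $b <= 0x9F) || ($b >= 0xE0 && $b <= 0xFC);
        if ($isLead && $i + 1 < $len) {
            if ($bytes[$i + 1] === "\x5C") {
                return true;
            }
            $i++; // skip the second byte of this character
        }
    }
    return false;
}

var_dump(hasBackslashTrailByte("\x83\x5C")); // ソ
var_dump(hasBackslashTrailByte("\x83\x5D")); // ゾ
var_dump(hasBackslashTrailByte("a\\b"));     // plain ASCII backslash, not a second byte
```

A plain ASCII backslash is deliberately not reported, since only the double-byte case is ambiguous with a path separator.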


Test script:
---------------
<?php
// Create one ASCII-named directory and two Shift_JIS-named directories.
@mkdir("a");
@mkdir("\x83\x5D"); // ゾ
@mkdir("\x83\x5C"); // ソ (second byte is 0x5C)

$dir = './';
if ($dh = opendir($dir)) {
    while (($file = readdir($dh)) !== false) {
        $path = $dir . $file;
        $type = filetype($path);
        $type2 = is_dir($path) ? 'dir' : 'file';
        // filetype() and is_dir() should agree for every entry.
        $comp = $type === $type2 ? 'OK' : 'NG';
        echo "filetype()[".str_pad($type, 4)."] == is_dir()[".str_pad($type2, 4)."] -> $comp: {$file}\n";
    }
    closedir($dh);
}


Expected result:
----------------
filetype()[dir ] == is_dir()[dir ] -> OK: .
filetype()[dir ] == is_dir()[dir ] -> OK: ..
filetype()[dir ] == is_dir()[dir ] -> OK: a
filetype()[file] == is_dir()[file] -> OK: test.php
filetype()[dir ] == is_dir()[dir ] -> OK: ソ
filetype()[dir ] == is_dir()[dir ] -> OK: ゾ


Actual result:
--------------
filetype()[dir ] == is_dir()[dir ] -> OK: .
filetype()[dir ] == is_dir()[dir ] -> OK: ..
filetype()[dir ] == is_dir()[dir ] -> OK: a
filetype()[file] == is_dir()[file] -> OK: test.php
filetype()[dir ] == is_dir()[file] -> NG: ソ
filetype()[dir ] == is_dir()[dir ] -> OK: ゾ


 [2013-05-07 06:36 UTC] ku at digitaldolphins dot jp
Hi.

It is a known problem, and it won't be fixed.

If you need a patch, check mine at:

https://bugs.php.net/bug.php?id=61315

Or you can try php-wfio extension instead.

https://code.google.com/p/php-wfio/

It needs to be built manually, following the step-by-step instructions:

https://wiki.php.net/internals/windows/stepbystepbuild

Try at your own risk!

Thanks
kenji uno.
 [2013-05-07 06:54 UTC] pajoye@php.net
For the record, it is not that it won't be fixed, but that it can't be fixed at 
this stage, only in a major version. The change affects more than PHP's own code 
and more than the file stream wrapper.
 [2013-05-25 04:28 UTC] sharkpp at gmail dot com
Thank you.
I hope it gets fixed in a major version.

Incidentally, php-wfio does not work for me,
because is_dir() is not implemented.
 [2013-08-06 01:48 UTC] ku at digitaldolphins dot jp
Ah, sorry...

---
Add the wfio:// prefix to the path:

is_dir("wfio://C:/")

is_dir("wfio://C:\\")

---
On Japanese Windows, this lists entries in the Shift-JIS charset:

php.exe -r "print_r(scandir('C:/'));"

---
This lists entries in UTF-8:

php.exe -r "print_r(scandir('wfio://C:/'));"

---
"wfio://" may support:
fopen, fwrite, fread, stat, fclose,
opendir, readdir, closedir,
rename, copy, unlink,
mkdir, rmdir.

Here is the first post about php-wfio:

http://news.php.net/php.windows/30987

Thanks
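
The prefixing described above can be wrapped in a small helper so that a script falls back to the plain path when the extension is missing. This is only a sketch: `withWfio` is a hypothetical name, and it assumes php-wfio registers its stream wrapper under the name "wfio", as the `wfio://` prefix suggests. Type declarations are left out so the code also runs on the PHP 5.x versions this report concerns.

```php
<?php
// Hypothetical helper: prepend "wfio://" only when a stream wrapper named
// "wfio" is registered (assumed to be the name php-wfio uses).
// $wrappers can be passed explicitly; by default the registered list is used.
function withWfio($path, $wrappers = null)
{
    if ($wrappers === null) {
        $wrappers = stream_get_wrappers();
    }
    // Leave the path alone if it already carries the prefix.
    if (in_array('wfio', $wrappers, true) && strpos($path, 'wfio://') !== 0) {
        return 'wfio://' . $path;
    }
    return $path;
}

// With php-wfio loaded this checks "wfio://C:/", otherwise plain "C:/".
var_dump(is_dir(withWfio('C:/')));
```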
 [2014-02-02 14:39 UTC] severnuri at yahoo dot com
I see a similar problem with directory names containing the Turkish letters şŞıİğĞ,
using Windows 8.1 Pro (English).
I was calling is_dir() on entries returned by readdir() in a script, and noticed that readdir() was transliterating names containing such letters, hence is_dir() was always failing.
I assume this is the same issue so did not open another bug. Please advise if I should file a new bug for this.
 [2016-08-08 10:02 UTC] ab@php.net
-Status: Open
+Status: Closed
-Assigned To:
+Assigned To: ab
 [2016-08-08 10:02 UTC] ab@php.net
Fixed in PHP 7.1, please read UPGRADING and use UTF-8.

Thanks.
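
To check the fix locally, a reduced version of the original test can use a UTF-8 name instead of Shift_JIS bytes. This sketch assumes PHP 7.1 or later with default_charset left at "UTF-8"; the temporary directory name prefix `bug64699_` is made up for the example. The name is written as a Unicode escape so the result does not depend on the encoding of the script file.

```php
<?php
// Sketch: assumes PHP >= 7.1 and default_charset = "UTF-8".
$base = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'bug64699_' . uniqid();
$name = "\u{30BD}"; // ソ encoded as UTF-8 (bytes E3 82 BD)
$path = $base . DIRECTORY_SEPARATOR . $name;

// Create the directory, then compare filetype() with is_dir().
mkdir($path, 0777, true);
$consistent = (filetype($path) === 'dir') && is_dir($path);
echo $consistent ? "OK\n" : "NG\n";

// Clean up the temporary directories.
rmdir($path);
rmdir($base);
```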
 