Bug #75063 Main CWD initialized with wrong codepage
Submitted: 2017-08-11 10:14 UTC Modified: 2017-08-14 14:37 UTC
From: anrdaemon at freemail dot ru Assigned: ab (profile)
Status: Closed Package: Filesystem function related
PHP Version: 7.1.8 OS: Windows
Private report: No CVE-ID: None


 [2017-08-11 10:14 UTC] anrdaemon at freemail dot ru
Many filesystem functions that read, write, or otherwise handle file names fail on multibyte-encoded names, for example when internal_encoding is set to UTF-8.

Even the example in doesn't work, if a path contains multibyte sequences.

Test script:

<?php
print ini_get("internal_encoding") . "\n";
print ini_get("default_charset") . "\n";
foreach(["test", "тест"] as $fn)
  file_put_contents("$fn.txt", "");


Expected result:
$ php -nf ./xx.php

    [0] => test.txt
    [1] => тест.txt

Actual result:
$ php -nf ./xx.php

    [0] => test.txt


 [2017-08-11 10:40 UTC]
-Status: Open +Status: Feedback
 [2017-08-11 10:40 UTC]
Works for me. Are you sure you saved the file in UTF-8 format?
 [2017-08-11 12:04 UTC] anrdaemon at freemail dot ru
Weird. It works for me if only the file name contains multibyte characters, but it breaks when the path contains them as well, and even then only for multibyte file names.
Can you please try it with a parent directory name containing multibyte characters?

Another interesting finding: getcwd() returns the path in the native codepage (CP1251 here), not in UTF-8.

$ php.exe -nf opendir.php 
<?php
print file_get_contents(__FILE__) . "-- \n";
print ini_get("internal_encoding") . "\n";
print ini_get("default_charset") . "\n";
print getcwd() . "\n";
print iconv('CP1251', 'UTF-8', getcwd()) . "\n";
foreach(["test", "тест"] as $fn)
  file_put_contents("$fn.txt", "");

if ($dh = opendir(getcwd())) {
    while (($file = readdir($dh)) !== false) {
        echo "filename: $file : filetype: " . filetype($file) . "\n";
    }
}

filename: . : filetype: dir
filename: .. : filetype: dir
filename: opendir.php : filetype: file
filename: test.txt : filetype: file
filename: UTF-8.php : filetype: file

Warning: filetype(): Lstat failed for тест.txt in C:\dev\temp\тест\opendir.php on line 12
filename: тест.txt : filetype:
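
The iconv() call in the script above suggests a possible workaround while the bug is present: check whether a path is valid UTF-8 and convert it from the native codepage otherwise. This is only a minimal sketch, assuming the native codepage is CP1251 as in this report; the helper name pathToUtf8 is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper: normalize a path to UTF-8.
// Assumption: any input that is not valid UTF-8 is in $nativeCodepage
// (CP1251 in this report). Requires the iconv extension.
function pathToUtf8(string $path, string $nativeCodepage = 'CP1251'): string
{
    // The /u modifier makes preg_match() fail on invalid UTF-8 subjects.
    if (preg_match('//u', $path) === 1) {
        return $path;
    }
    $converted = iconv($nativeCodepage, 'UTF-8', $path);
    // Fall back to the untouched input if the conversion is not possible.
    return $converted === false ? $path : $converted;
}

$cwd = getcwd();
if ($cwd !== false) {
    echo pathToUtf8($cwd), "\n";
}
```

The validity check is a heuristic: a byte sequence in the native codepage can happen to be valid UTF-8 as well, in which case it is returned unchanged.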
 [2017-08-11 13:02 UTC]
-Status: Feedback +Status: Open
 [2017-08-11 13:02 UTC]
I can get the correct result and various different incorrect results, but not your result.

You're running PHP 7.1.8 and not an earlier version? What does echo `chcp` show?
Have you looked at the comments in other similar bug reports (like #74589, #72555, and your earlier #73716) for anything that appears relevant?
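
Besides `chcp`, the codepages PHP itself sees can be printed from a script. The following is a rough sketch, not something posted in this thread: sapi_windows_cp_get() is available only on Windows builds of PHP 7.1 and later, so the call is guarded, and the function name reportCodepages is hypothetical.

```php
<?php
// Hypothetical helper: collect the encoding settings relevant to this report.
// sapi_windows_cp_get() exists only on Windows (PHP 7.1+); elsewhere the
// codepage entries are null.
function reportCodepages(): array
{
    $hasCp = function_exists('sapi_windows_cp_get');

    return [
        'internal_encoding' => ini_get('internal_encoding'),
        'default_charset'   => ini_get('default_charset'),
        'ansi_codepage'     => $hasCp ? sapi_windows_cp_get('ansi') : null,
        'oem_codepage'      => $hasCp ? sapi_windows_cp_get('oem') : null,
        'current_codepage'  => $hasCp ? sapi_windows_cp_get() : null,
    ];
}

print_r(reportCodepages());
```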
 [2017-08-11 15:19 UTC] anrdaemon at freemail dot ru
Bug #72555 (the parent of those mentioned) is about handling the console codepage.
The present issue is strictly internal.
I modified the test script a little and attached example output from running it with "php -nf ..."
 [2017-08-12 14:30 UTC]
-Assigned To: +Assigned To: ab
 [2017-08-12 17:51 UTC]
-Status: Assigned +Status: Feedback
 [2017-08-12 17:51 UTC]
@anrdaemon, which exact build did you use?

In the supplied archive, the folder in the root is shown as "ΓÑßΓ-75063". It might be a packaging error, or possibly something on the disk that is the cause. If the latter, what kind of file system is it: local disk, NTFS, Samba, etc.?

 [2017-08-13 13:53 UTC] anrdaemon at freemail dot ru
My apologies, it's a zip issue. I totally forgot that there's no standard way to encode file names in zip archives.

The original name was "тест-75063".
I reuploaded the archive for your convenience.
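
The garbled name is consistent with the archiver writing the folder name in the OEM Cyrillic codepage (CP866) and the reader decoding the same bytes as CP437. That pairing is an inference from the characters, not something stated in this thread. A small sketch, assuming the iconv extension supports both codepages; the function name misdecode is hypothetical.

```php
<?php
// Hypothetical helper: encode a UTF-8 name in one codepage and decode the
// resulting bytes as another, which is how this kind of mojibake arises.
// Assumption: CP866 and CP437 are the codepages involved here.
function misdecode(string $utf8Name, string $writtenAs, string $readAs): string
{
    $bytes = iconv('UTF-8', $writtenAs, $utf8Name);
    if ($bytes === false) {
        throw new RuntimeException("Cannot encode name as $writtenAs");
    }
    $garbled = iconv($readAs, 'UTF-8', $bytes);
    if ($garbled === false) {
        throw new RuntimeException("Cannot decode bytes as $readAs");
    }
    return $garbled;
}

echo misdecode('тест-75063', 'CP866', 'CP437'), "\n"; // ΓÑßΓ-75063
```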

The build I'm using is the 7.1.8 64-bit TS release from

The "output.txt" file contains the essential part of phpinfo() (minus the credits and the environment dump).
If anything in the environment could affect the results, please let me know and I'll fill in the gaps.
 [2017-08-13 22:54 UTC]
Automatic comment on behalf of ab
Log: Fixed bug #75063
 [2017-08-13 22:54 UTC]
-Status: Feedback +Status: Closed
 [2017-08-14 10:21 UTC]
-Summary: Many filesystem-related functions do not work with multibyte file names +Summary: Main CWD initialized with wrong codepage
 [2017-08-14 10:21 UTC]
@anrdaemon, fixed in dev; the latest snapshots should reflect the fixed state.

 [2017-08-14 14:01 UTC] anrdaemon at freemail dot ru
Thanks, all seems to be working as expected.
At least filetype, getcwd, glob and simplexml_load_file all check out correctly.
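
A check along these lines could be scripted as follows. This is a hedged sketch rather than the script actually used: the function name checkMultibyteCwd is hypothetical, and since the bug concerned the directory the process starts in, a faithful test would launch PHP from inside such a directory; chdir() here only approximates that.

```php
<?php
// Hypothetical regression check: work inside a directory with a multibyte
// name and verify that getcwd(), glob() and filetype() agree on it.
function checkMultibyteCwd(string $baseDir): array
{
    $name = 'тест-75063-' . uniqid();
    $dir = $baseDir . DIRECTORY_SEPARATOR . $name;
    $old = getcwd();

    mkdir($dir);
    chdir($dir);
    file_put_contents('тест.txt', '');

    $cwd = (string) getcwd();
    $results = [
        // The working directory must end with the UTF-8 directory name.
        'getcwd'   => substr($cwd, -strlen($name)) === $name,
        'glob'     => in_array('тест.txt', glob('*.txt') ?: [], true),
        'filetype' => filetype('тест.txt') === 'file',
    ];

    // Clean up and restore the original working directory.
    unlink('тест.txt');
    chdir($old);
    rmdir($dir);

    return $results;
}

var_export(checkMultibyteCwd(sys_get_temp_dir()));
```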
 [2017-08-14 14:37 UTC]
Great, thanks for checking. TS and NTS builds differ by nature, as this case reveals. UTF-8 is still the best option for streams I/O, whereas for setups ASCII might still be the safer choice. The fix will likely land in the upcoming RC.

 [2017-08-14 15:32 UTC] anrdaemon at freemail dot ru
For Windows, I'm only concerned about the TS variant, and I'm checking it specifically.
For *NIX, the situation is more consistent by nature and I'm not worried… much.