[2020-04-07 14:04 UTC] pguest at meaa dot mea dot com
Description:
------------
A Windows filesystem directory containing the character:
á, 225, 0xE1
is returned from scandir as two characters:
Ã, 195, 0xC3
¡, 161, 0xA1
In other words:
'Arquivo dos Gráficos'
becomes:
'Arquivo dos GrÃ¡ficos'
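The byte values quoted above match what happens when the UTF-8 encoding of 'á' (0xC3 0xA1) is read as two Windows-1252 characters. The following is only a sketch of that effect, not a diagnosis of the bug itself. It assumes the mbstring extension is loaded, and the helper name `read_utf8_as_cp1252` is hypothetical, not part of PHP:

```php
<?php
// Hypothetical helper: reinterpret UTF-8 bytes as if they were Windows-1252
// text, then re-encode the result as UTF-8 so it can be displayed.
function read_utf8_as_cp1252(string $utf8): string
{
    return mb_convert_encoding($utf8, 'UTF-8', 'Windows-1252');
}

$a_acute = "\u{E1}";                      // "á" encoded as UTF-8
echo bin2hex($a_acute), "\n";             // c3a1
echo read_utf8_as_cp1252($a_acute), "\n"; // Ã¡
echo read_utf8_as_cp1252("Arquivo dos Gr\u{E1}ficos"), "\n"; // Arquivo dos GrÃ¡ficos
```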
Test script:
---------------
<?php
$rootdir = 'C:/temp/';
$portuguese_string = 'Arquivo dos Gráficos';
$newdir = $rootdir . $portuguese_string;
if (is_dir($newdir))
{
    rmdir($newdir);
}
if (is_dir($rootdir))
{
    rmdir($rootdir);
}
mkdir($rootdir);
mkdir($newdir);
echo sprintf("'%s' encoding: %s\n", $portuguese_string, mb_detect_encoding($portuguese_string));
$scandir_return = scandir($rootdir);
echo "scandir() returns: " . print_r($scandir_return, true);
echo sprintf("'%s' encoding: %s\n", $scandir_return[2], mb_detect_encoding($scandir_return[2]));
?>
Expected result:
----------------
scandir() should return the directory name with the same UTF-8 characters that were used to create it.
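One way to state this expectation as a check is a byte-for-byte round trip: create the directory, list its parent, and look for the identical string. This is a minimal sketch. The helper name `name_round_trips` is hypothetical, and it uses a scratch directory under the system temp path rather than C:/temp so it does not touch existing data:

```php
<?php
// Hypothetical helper: create $name inside $parent, list $parent with
// scandir(), and report whether the exact same byte string comes back.
function name_round_trips(string $parent, string $name): bool
{
    $path = rtrim($parent, '/\\') . DIRECTORY_SEPARATOR . $name;
    if (!is_dir($path) && !mkdir($path)) {
        throw new RuntimeException("Could not create $path");
    }
    $entries = scandir($parent);
    rmdir($path);
    // Strict comparison: the names must match byte for byte.
    return $entries !== false && in_array($name, $entries, true);
}

// Usage: the result depends on the platform and the active code page.
$scratch = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'scandir_rt_' . uniqid();
mkdir($scratch);
var_dump(name_round_trips($scratch, "Arquivo dos Gr\u{E1}ficos"));
rmdir($scratch);
```

A result of `false` would correspond to the behavior reported here, where the name read back differs from the name that was written.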
Actual result:
--------------
'Arquivo dos Gráficos'
becomes:
'Arquivo dos GrÃ¡ficos'
I cannot reproduce this. Please provide the output of the following script:

<?php
var_dump(
    ini_get('internal_encoding'),
    ini_get('default_charset'),
    ini_get('zend.multibyte'),
    sapi_windows_cp_get('ansi'),
    sapi_windows_cp_get('oem')
);
?>

cmb, Here is output from:

<?php
var_dump(
    ini_get('internal_encoding'),
    ini_get('default_charset'),
    ini_get('zend.multibyte'),
    sapi_windows_cp_get('ansi'),
    sapi_windows_cp_get('oem')
);
?>

C:\Workspace\PDT\__CORE\php_cmb.php:7: string(0) ""
C:\Workspace\PDT\__CORE\php_cmb.php:7: string(5) "UTF-8"
C:\Workspace\PDT\__CORE\php_cmb.php:7: string(1) "0"
C:\Workspace\PDT\__CORE\php_cmb.php:7: int(1252)
C:\Workspace\PDT\__CORE\php_cmb.php:7: int(437)

Thanks for the info! So apparently your script is UTF-8 encoded, and you're using the default INI settings, which is supposed to produce the desired filenames, but it works as if there was a call to `sapi_windows_cp_set(1252)` at the beginning of the script. Is there a respective auto_prepend_file? Anyhow, I suppose that adding

sapi_windows_cp_set(65001);

at the top of the script should enforce the desired behavior. By the way, which SAPI do you use?
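The suggested workaround can be sketched as a small guard placed at the top of the script. This is an illustration only: the helper name `force_utf8_codepage` is hypothetical, and the `function_exists()` check is an added assumption so that the same script also runs on non-Windows builds, where the `sapi_windows_cp_*` functions are not defined.

```php
<?php
// Hypothetical helper: switch the process code page to UTF-8 (65001) when the
// Windows-only sapi_windows_cp_set() function is available.
// Returns true if the code page was set, false otherwise.
function force_utf8_codepage(): bool
{
    if (!function_exists('sapi_windows_cp_set')) {
        return false; // not a Windows build, nothing to change
    }
    return sapi_windows_cp_set(65001);
}

// Call this before any mkdir()/scandir() calls that use UTF-8 names.
var_dump(force_utf8_codepage());
```

Whether this resolves the report depends on what is forcing code page 1252 in the reporter's environment, which the thread has not yet established.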