[2017-07-14 19:52 UTC] furun at arcor dot de
Description:
------------
UTF-8 characters are processed incorrectly by file system functions
PHP 7.1.7, Win7, Xampp
PHP 7.1.5, Linux
PHP has problems processing UTF-8 characters in file names correctly.
There may be several buggy file functions, but I tested file_put_contents, file_get_contents and glob.
file_put_contents and file_get_contents write and read the files correctly,
but glob doesn't list them correctly.
(Is there an .htaccess, PHP or system option to fix this, or is it a problem in the PHP code? I think it is a PHP bug.)
(Does PHP 7 still have very weak support for UTF-8 encoding? Maybe all file functions should be tested?)
Results:
On both operating systems (Windows 7 and Linux), the files are created correctly,
but they are not read back correctly with glob().
file_put_contents, file_get_contents (OK):
!äÄ
ä!Ä
äÄ!
ä!Ä.txt
äÄ.txt
ÿ.txt
Āﻼ.txt
أ¤أ„.tx2
glob() (BUG):
PHP_OS: WINNT
ÿ.txt 8
ä!Ä.txt 9
äÄ.txt 8
Āﻼ.txt 9
أ¤أ„.tx2 13
PHP_OS: Linux
.txt 4
!Ä.txt 7
.txt 4
.txt 4
.tx2 4
Test script:
---------------
define('DIR_BASE', realpath(dirname(__FILE__) . DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR);
define('DIR_TEMP', DIR_BASE . 'temp' . DIRECTORY_SEPARATOR);
print(DIR_BASE . '<br>');
print(DIR_TEMP . '<br>');
print('<br>');
print('PHP_OS: ' . PHP_OS . '<br><br>');
try {
    if (! file_exists(DIR_TEMP) && ! is_dir(DIR_TEMP)) {
        $check = mkdir(DIR_TEMP, 0755);
        if ($check) $check = chmod(DIR_TEMP, 0755);
    }
} catch (Exception $exception) {
}
print('file_put_contents, file_get_contents:<br>');
$fileName = "!äÄ";
file_put_contents(DIR_TEMP . $fileName, $fileName);
$data = file_get_contents(DIR_TEMP . $fileName);
print($data . '<br>');
$fileName = "ä!Ä";
file_put_contents(DIR_TEMP . $fileName, $fileName);
$data = file_get_contents(DIR_TEMP . $fileName);
print($data . '<br>');
$fileName = "äÄ!";
file_put_contents(DIR_TEMP . $fileName, $fileName);
$data = file_get_contents(DIR_TEMP . $fileName);
print($data . '<br>');
$fileName = "ä!Ä.txt";
file_put_contents(DIR_TEMP . $fileName, $fileName);
$data = file_get_contents(DIR_TEMP . $fileName);
print($data . '<br>');
$fileName = "äÄ.txt";
file_put_contents(DIR_TEMP . $fileName, $fileName);
$data = file_get_contents(DIR_TEMP . $fileName);
print($data . '<br>');
$fileName = " ÿ.txt";
file_put_contents(DIR_TEMP . $fileName, $fileName);
$data = file_get_contents(DIR_TEMP . $fileName);
print($data . '<br>');
$fileName = "Āﻼ.txt";
file_put_contents(DIR_TEMP . $fileName, $fileName);
$data = file_get_contents(DIR_TEMP . $fileName);
print($data . '<br>');
$fileName = "äÄ.tx2";
$fileName = iconv('windows-1256', 'utf-8', $fileName);
file_put_contents(DIR_TEMP . $fileName, $fileName);
$data = file_get_contents(DIR_TEMP . $fileName);
print($data . '<br>');
$data = '';
$data .= '<table class="text documentlist sortable">' . "\n";
$data .= '<tbody>' . "\n";
print('<br><br>glob:<br>');
$fileList = glob(DIR_TEMP . '*.*', 0);
foreach ($fileList as $filePath) {
    $fileName = basename($filePath);
    $data .= '<tr>' .
        '<td>' . '<a href="temp/' . htmlentities(urlencode($fileName)) . '">' . $fileName . '</a></td>' .
        '<td>' . strlen($fileName) . '</td>' .
        '</tr>' . "\n"; // FIXME: reading in
}
$data .= '</tbody>' . "\n";
$data .= '</table>' . "\n";
print($data);
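The listing printed by the script is built from basename($filePath), not from the raw glob() result, and the PHP manual documents basename() as locale-aware for multibyte paths. The following is a minimal sketch, assuming the source file is saved as UTF-8, for checking whether bytes are lost in glob() or only in basename(). The helper lastPathComponent() is hypothetical and not part of the original script; it splits on separator bytes only and never consults the locale.

```php
<?php
// Hypothetical helper (an assumption for illustration, not from the report):
// return the last path component using byte offsets only.
function lastPathComponent(string $path): string
{
    // On Windows both '/' and '\' separate components; elsewhere only '/'.
    $separators = DIRECTORY_SEPARATOR === '\\' ? ['/', '\\'] : ['/'];
    $cut = -1;
    foreach ($separators as $separator) {
        $pos = strrpos($path, $separator);
        if ($pos !== false && $pos > $cut) {
            $cut = $pos;
        }
    }
    return substr($path, $cut + 1);
}

// Compare the raw path, the locale-aware basename() and the byte-based helper.
// bin2hex() makes the actual bytes visible regardless of the page encoding.
$filePath = '/some/dir/temp/äÄ.txt'; // stand-in for one entry returned by glob()
printf("raw:      %s\n", bin2hex($filePath));
printf("basename: %s\n", bin2hex(basename($filePath)));
printf("helper:   %s\n", bin2hex(lastPathComponent($filePath)));
```

If the raw glob() path still contains the UTF-8 bytes while the basename() line does not, the characters are being dropped by basename() under the current locale, and calling setlocale() with a matching UTF-8 locale before basename() is the documented remedy.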
Expected result:
----------------
All files are listed with their correct names.
Copyright © 2001-2025 The PHP Group. All rights reserved.
Last updated: Sat Oct 25 14:00:01 2025 UTC
What "Linux" system are you testing on? If it's the alleged Linux subsystem on Windows 10, you are very likely just seeing Windows issues come through the alleged VM.

Testing on CentOS with the simpler code below doesn't show any problems, except that glob() doesn't return any of the files starting with an exclamation mark.

<?php
define('DIR_TEMP', __DIR__ . '/test/');
@mkdir(DIR_TEMP, 0755, true);
print('PHP_OS: ' . PHP_OS . '<br><br>');
$filenames = [
    "!äÄ",
    "!Hello_I_start_with_an_exclamation",
    " ÿ.txt",
    "Āﻼ.txt",
    "ä!Ä.txt",
    "äÄ.txt",
    iconv('windows-1256', 'utf-8', "äÄ.tx2")
];
foreach ($filenames as $filename) {
    $written = file_put_contents(DIR_TEMP . $filename, 'foo');
    if ($written === false) {
        echo "Failed to write file $filename \n";
    }
    if (file_exists(DIR_TEMP . $filename) === false) {
        echo "written but doesn't exist?\n";
    }
}
var_dump($filenames);
$fileList = glob(DIR_TEMP . '*.*', 0);
var_dump($fileList);

Output:

PHP_OS: Linux<br><br>array(7) {
  [0]=>
  string(5) "!äÄ"
  [1]=>
  string(34) "!Hello_I_start_with_an_exclamation"
  [2]=>
  string(7) " ÿ.txt"
  [3]=>
  string(9) "Āﻼ.txt"
  [4]=>
  string(9) "ä!Ä.txt"
  [5]=>
  string(8) "äÄ.txt"
  [6]=>
  string(13) "أ¤أ„.tx2"
}
array(5) {
  [0]=>
  string(58) "/testing/test/ ÿ.txt"
  [1]=>
  string(60) "/testing/test/Āﻼ.txt"
  [2]=>
  string(61) "/testing/test/ä!Ä.txt"
  [3]=>
  string(60) "/testing/test/äÄ.txt"
  [4]=>
  string(67) "/testing/test/أ¤أ„.tx2"
}
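A possible explanation for the two missing entries is the pattern rather than the exclamation mark: '*.*' only matches names that contain a dot, and neither "!äÄ" nor "!Hello_I_start_with_an_exclamation" has one, while "ä!Ä.txt" contains an exclamation mark and is listed. This is an inference from the pattern, not something stated in the comment. A small sketch, assuming the source file is saved as UTF-8, that checks the same names against the same pattern with fnmatch() and without touching the file system:

```php
<?php
// Names taken from the test code in this report.
$names = [
    '!äÄ',
    '!Hello_I_start_with_an_exclamation',
    'ä!Ä.txt',
    'äÄ.txt',
];

// Keep only the names that the shell pattern '*.*' accepts.
$matched = array_filter($names, function (string $name): bool {
    return fnmatch('*.*', $name);
});

foreach ($names as $name) {
    printf("%s => %s\n", $name, in_array($name, $matched, true) ? 'match' : 'no match');
}
// With the names above, only the two names containing a dot are kept.
```

To list every file in the directory regardless of extension, the pattern '*' could be passed to glob() instead of '*.*'.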