Bug #37738 basename does not work with Japanese
Submitted: 2006-06-08 06:09 UTC Modified: 2011-04-11 00:19 UTC
Votes:17
Avg. Score:4.6 ± 0.7
Reproduced:13 of 13 (100.0%)
Same Version:6 (46.2%)
Same OS:1 (7.7%)
From: joey at alegria dot co dot jp Assigned:
Status: Not a bug Package: *Directory/Filesystem functions
PHP Version: 5CVS-2006-06-08 (CVS) OS: Fedora Core 4
Private report: No CVE-ID: None
 [2006-06-08 06:09 UTC] joey at alegria dot co dot jp
Description:
------------
Simply put, basename() does not work with Japanese file paths. If the filename is Japanese, only the extension is returned, so the filename "/folder/ファイル名.txt" resolves to just ".txt". I discovered the problem when calling basename() on the 'name' element of the $_FILES array for uploaded Japanese files; however, after testing, the bug occurs no matter how the filename is supplied.

My PHP environment is running with UTF-8 internal encoding.

The code snippet below illustrates this perfectly.

Reproduce code:
---------------
<?php
// show normal behavior with roman filename
$filename='/myfolder/roman_filename.txt';
echo "The full filename of the romanized file is $filename.\n"; // /myfolder/roman_filename.txt
$basename=basename($filename);
echo "The basename of the romanized file is $basename.\n"; // roman_filename.txt
// show behavior with Japanese filename
$filename='/myfolder/日本語のファイル名.txt';
echo "The full filename of the Japanese file is $filename.\n"; // /myfolder/日本語のファイル名.txt
$basename=basename($filename);
echo "The basename of the Japanese file is $basename."; // .txt
?>

Expected result:
----------------
The full filename of the romanized file is /myfolder/roman_filename.txt.
The basename of the romanized file is roman_filename.txt.
The full filename of the Japanese file is /myfolder/日本語のファイル名.txt.
The basename of the Japanese file is 日本語のファイル名.txt.

Actual result:
--------------
The full filename of the romanized file is /myfolder/roman_filename.txt.
The basename of the romanized file is roman_filename.txt.
The full filename of the Japanese file is /myfolder/日本語のファイル名.txt.
The basename of the Japanese file is .txt.

 [2006-06-08 06:37 UTC] derick@php.net
Won't fix in PHP 5. This will be implemented for PHP 6.
 [2011-04-10 22:23 UTC] chx@php.net
-Status: Wont fix +Status: Open
 [2011-04-10 22:23 UTC] chx@php.net
I am reopening this as PHP 6 is not developed any more and the bug is still valid.
 [2011-04-11 00:19 UTC] cataphract@php.net
-Status: Open +Status: Bogus
 [2011-04-11 00:19 UTC] cataphract@php.net
For better or worse, basename is affected by the locale. The encoding used in the filenames must match the set locale.

Bogus.
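
To illustrate the point about matching the locale to the filename encoding, here is a minimal sketch. It assumes the script is saved as UTF-8 and that at least one of the listed UTF-8 locales is installed; the locale names are system-specific guesses, and basename_in_locale() is a hypothetical helper, not part of PHP.

```php
<?php
// Hypothetical helper (not part of PHP): try each candidate locale for
// LC_CTYPE, run basename() under the first one the system accepts, then
// restore the previous LC_CTYPE setting.
function basename_in_locale(string $path, array $candidates): array
{
    // Passing "0" queries the current setting without changing it.
    $previous = setlocale(LC_CTYPE, '0');

    // setlocale() accepts an array and returns the first locale that
    // could be set, or false if none of them is available.
    $applied = setlocale(LC_CTYPE, $candidates);

    $name = basename($path);

    // Put the process back the way we found it.
    if ($previous !== false) {
        setlocale(LC_CTYPE, $previous);
    }

    return ['locale' => $applied, 'basename' => $name];
}

// Assumed locale names for a UTF-8 encoded path; adjust for the target system.
$result = basename_in_locale(
    '/myfolder/日本語のファイル名.txt',
    ['ja_JP.UTF-8', 'en_US.UTF-8', 'C.UTF-8']
);

if ($result['locale'] === false) {
    echo "No candidate locale is installed; basename() ran under the existing locale.\n";
} else {
    echo "LC_CTYPE locale used: {$result['locale']}\n";
}
echo "basename: {$result['basename']}\n";
```

Only LC_CTYPE is switched here, on the assumption that character classification is the only category that matters for this call; the previous setting is restored so the rest of the script is unaffected.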
 [2012-06-06 20:46 UTC] ion at 66 dot ru
We need to support many locales, and we don't know the locale of a filename before the user adds the file. How can we use these filenames without transliteration?
 [2013-09-15 10:20 UTC] kenji dot uui at gmail dot com
With PHP 5.5.1 on Windows, I got the results below after adding "setlocale(LC_ALL, 'japanese');" to the script.

The full filename of the romanized file is /myfolder/roman_filename.txt.
The basename of the romanized file is roman_filename.txt.
The full filename of the Japanese file is /myfolder/日本語のファイル名.txt.
The basename of the Japanese file is 日本語のファイル名.txt.
 [2013-10-30 15:10 UTC] fleshgrinder at gmx dot at
Just gave the following a test with PHP 5.5.5:

<?php

echo
  basename(__DIR__ . "/english.txt") , PHP_EOL ,
  basename(__DIR__ . "/日本語.txt") , PHP_EOL ,
  basename(__DIR__ . "/ελληνικά.txt") , PHP_EOL ,
  basename(__DIR__ . "/english.txt", "txt") , PHP_EOL ,
  basename(__DIR__ . "/日本語.txt", "txt") , PHP_EOL ,
  basename(__DIR__ . "/ελληνικά.txt", "txt") , PHP_EOL
;

?>

PHP returned everything correctly; the server's LC was set to POSIX.
 