Bug #37268 basename() doesn't work with non-Latin first letters of the file name.
Submitted: 2006-05-01 23:56 UTC Modified: 2015-03-12 08:39 UTC
Votes:28
Avg. Score:4.1 ± 1.0
Reproduced:21 of 23 (91.3%)
Same Version:9 (42.9%)
Same OS:6 (28.6%)
From: spam dot bugs dot php dot net at vano dot org Assigned:
Status: Wont fix Package: Filesystem function related
PHP Version: 5.1.2 OS: Fedora Core 4
Private report: No CVE-ID: None
 [2006-05-01 23:56 UTC] spam dot bugs dot php dot net at vano dot org
Description:
------------
If a file name starts with a non-Latin letter (Cyrillic, in my case) rather than a non-letter character, basename() cuts off the beginning of the name up to the first Latin letter or non-letter character.
In my tests I used the Windows-1251 (CP1251) encoding, not Unicode, so the problem is in the basename() function itself, not in multi-byte charsets.


P.S. If you have problems with the text encoding of the example below, you can see a live example and test your own inputs here:
http://examples.vano.org/basename.php

Reproduce code:
---------------
<?php
echo basename("/test/blah/music/???????latin.mp3");
?>

Expected result:
----------------
???????latin.mp3

Actual result:
--------------
latin.mp3

Patches

UTF8 (last revision 2011-05-18 08:18 UTC by vsz at ya dot ru)

Pull Requests

History

 [2006-05-02 00:55 UTC] judas dot iscariote at gmail dot com
You will have to wait for PHP 6; it will address these problems.
 [2010-07-31 23:21 UTC] daniel at mameso dot com
a workaround:

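# Workaround for basename() dropping a leading UTF-8 multibyte character
function mb_basename($filepath, $suffix = '') {
	// Take the last path component, ignoring trailing slashes and spaces
	$parts = preg_split('/\//', rtrim($filepath, '/ '));
	// Prepend an ASCII 'X' so basename() sees a Latin first character, then strip it again
	return substr(basename('X' . $parts[count($parts) - 1], $suffix), 1);
}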

have fun,
Daniel.

PS: The problem does not occur under Mac OS X 10.6.x with Zend Server CE 5.0 and PHP 5.3.
 [2015-03-12 04:20 UTC] vovan-ve at yandex dot ru
Are you sure this problem will be solved? http://3v4l.org/BWgUm
 [2015-03-12 08:39 UTC] mike@php.net
You need to set the correct locale for stdlib/multibyte functions to work correctly; anyway, 3v4l.org apparently only has C/POSIX locale available:

http://3v4l.org/M6jhm
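
For illustration, setting the locale before calling basename() might look like the sketch below. The locale names and the sample file name are assumptions: which locales are installed varies by system, and the path here is UTF-8 rather than the CP1251 used in the original report. The "\u{...}" escapes require PHP 7 or later.

```php
<?php
// Assumption: at least one of these UTF-8 locale names is installed; names vary by system.
$candidates = ['en_US.UTF-8', 'C.UTF-8', 'en_US.utf8'];

// setlocale() tries each name in turn and returns the one it applied, or false.
$locale = setlocale(LC_CTYPE, $candidates);

// Illustrative path: three Cyrillic letters followed by Latin text, encoded as UTF-8.
$name = "\u{041C}\u{0443}\u{0437}latin.mp3";
$path = "/test/blah/music/" . $name;

if ($locale === false) {
    echo "No UTF-8 locale available; basename() may still drop leading bytes\n";
} else {
    echo "LC_CTYPE is now $locale\n";
}
echo basename($path), "\n";
```

If none of the candidate locales exists, the result depends on the PHP version and on the C/POSIX locale behaviour described above.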
 [2016-03-19 09:29 UTC] lauri dot kentta at gmail dot com
The default configuration of php-fpm clears the environment, which leads to this problem. Would it be possible to use UTF-8 or even default_charset or at least accept any 8-bit characters if the locale is unset or C? It's a bit strange that the current behaviour can't be controlled through php.ini but instead must be set with setlocale or environment variables.
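
Until something like that exists, a userland helper that treats the path as plain bytes avoids the locale dependency entirely. The sketch below is a hypothetical function, not part of PHP; it only handles "/" as a separator (so it does not mirror basename() on Windows) and accepts any 8-bit characters in the name.

```php
<?php
// Hypothetical helper (not part of PHP): a byte-based basename that ignores the locale.
function byte_basename(string $path, string $suffix = ''): string
{
    // Drop trailing slashes, then keep everything after the last remaining slash.
    $trimmed = rtrim($path, '/');
    $pos = strrpos($trimmed, '/');
    $name = ($pos === false) ? $trimmed : substr($trimmed, $pos + 1);

    // Strip the suffix only if the name ends with it and is not identical to it.
    if ($suffix !== '' && $name !== $suffix && substr($name, -strlen($suffix)) === $suffix) {
        $name = substr($name, 0, -strlen($suffix));
    }
    return $name;
}

// Sample name: three Cyrillic letters as CP1251 bytes, followed by Latin text.
$cp1251Name = "\xCC\xF3\xE7latin.mp3";
var_dump(byte_basename("/test/blah/music/" . $cp1251Name) === $cp1251Name); // bool(true)
```

Because it never calls mblen-based routines, the result is the same whether the locale is unset, C, or anything else; the trade-off is that it knows nothing about multi-byte encodings in which a trailing byte may equal "/".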