[2002-09-05 15:46 UTC] gamid at isayev dot net
The functions strtolower() and strtoupper() do not change UTF-8 strings. I tried Russian (0x042F, 0x044F) and German (0x00DC, 0x00FC) characters.
Example:
<?
$str = "testЯ";
$loc = "UTF-8";
putenv("LANG=$loc");
$loc = setlocale(LC_ALL, $loc);
$strU = strtoupper($str);
$strL = strtolower($str);
?>
<PRE>
loc = '<? echo $loc; ?>'
str = '<? echo $str; ?>'
strU = '<? echo $strU; ?>'
strL = '<? echo $strL; ?>'
</PRE>
I backported strtolower() and strtoupper() from the latest string.c (rev. 1.290). The functions still do not work:

loc = 'UTF-8'
str = 'Test?'
strU = 'TEST?'
strL = 'test?'

Since I am not sure that you entered the UTF-8 symbols correctly, here's a modified version of the code that creates the desired test string:

<?
$str = "Test".utf8_encode("\xFC");
$loc = "UTF-8";
putenv("LANG=$loc");
$loc = setlocale(LC_ALL, $loc);
$strU = strtoupper($str);
$strL = strtolower($str);
?>
<PRE>
loc = '<? echo $loc; ?>'
str = '<? echo $str; ?>'
strU = '<? echo $strU; ?>'
strL = '<? echo $strL; ?>'
</PRE>

> I tried setting LANG/LC_ALL to that and it indeed
> didn't work. When I set those to "en_US" it works just fine.

What do you mean by "works just fine"? Did it convert 0xC39C ('Ü' in UTF-8 encoding) into 0xC3BC ('ü' in UTF-8 encoding)? Or 0xD0AF (Russian capital "ya" in UTF-8 encoding) into 0xD18F (Russian lowercase "ya" in UTF-8 encoding)?

This is not a bug in PHP; it comes down to whether your system supports this conversion and has the appropriate locales installed. A quick and dirty example might look like this in C:

#include <ctype.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    char buff[1024];
    /* Take the locale from the environment; without this call
       toupper() stays in the default "C" locale. */
    setlocale(LC_ALL, "");
    while (fgets(buff, sizeof(buff), stdin)) {
        size_t i, l;
        l = strlen(buff);
        for (i = 0; i < l; i++)
            buff[i] = toupper((unsigned char)buff[i]);
        puts(buff);
    }
    return 0;
}

If that little program works, your system supports this conversion. If it doesn't, then PHP doesn't either.

As I understand it, toupper()/tolower() work only for single-byte encodings. So the right way is to use the 'wide' versions of toupper()/tolower(): towupper()/towlower(). Example:

#include <stdio.h>
#include <wctype.h>
#include <locale.h>

int main()
{
    printf("locale set to '%s'\n", setlocale(LC_ALL, "UTF-8"));
    printf("0x00DC C='%C'\n", towlower(0x00DC));
    printf("0x042F C='%C'\n", towlower(0x042F));
    return(0);
}

And it works fine for UCS2 (UTF-16). In PHP I can convert UTF-8 to UTF-16 by using iconv(). But PHP has no 'wide' version of strtolower()/strtoupper(). So, what can I do?
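
The difference between the two approaches discussed above can be sketched in C as follows. This is only an illustration under stated assumptions, not how PHP implements strtoupper(): the helper names upper_bytes() and upper_wide() are hypothetical, and upper_wide() relies on mbstowcs()/wcstombs() accepting a NULL destination to report the required length, which is a POSIX extension.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: uppercase a string one byte at a time,
   the way a plain toupper() loop does. The bytes of a multi-byte
   UTF-8 sequence are each treated as a separate character. */
void upper_bytes(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}

/* Hypothetical helper: decode the string with the current locale's
   multi-byte encoding, uppercase each wide character with towupper(),
   and encode the result back into out.
   Returns 0 on success, -1 on a decoding error or if out is too small. */
int upper_wide(const char *in, char *out, size_t outsz)
{
    size_t n, need, i;
    wchar_t *w;
    int ok;

    n = mbstowcs(NULL, in, 0);          /* number of wide characters */
    if (n == (size_t)-1)
        return -1;                      /* invalid sequence for this locale */

    w = malloc((n + 1) * sizeof *w);
    if (w == NULL)
        return -1;
    mbstowcs(w, in, n + 1);

    for (i = 0; i < n; i++)
        w[i] = (wchar_t)towupper((wint_t)w[i]);

    need = wcstombs(NULL, w, 0);        /* bytes needed, without the NUL */
    ok = (need != (size_t)-1 && need < outsz);
    if (ok)
        wcstombs(out, w, outsz);
    free(w);
    return ok ? 0 : -1;
}
```

In the default "C" locale, upper_bytes() uppercases the ASCII letters of "test" followed by the bytes 0xD1 0x8F and leaves those two bytes unchanged. For upper_wide() to handle the non-ASCII characters, the program would first have to select a UTF-8 locale with setlocale(), for example setlocale(LC_ALL, "en_US.UTF-8"); that locale name is an assumption and must be installed on the system.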