Bug #29955 Support for Turkish/iso-8859-9
Submitted: 2004-09-02 17:15 UTC Modified: 2007-09-04 14:14 UTC
Avg. Score:4.5 ± 0.5
Reproduced:2 of 2 (100.0%)
Same Version:1 (50.0%)
Same OS:0 (0.0%)
From: jan at horde dot org Assigned: hirokawa
Status: Closed Package: mbstring related
PHP Version: 5CVS, 4CVS (2004-09-02) OS: Linux
Private report: No CVE-ID:
 [2004-09-02 17:15 UTC] jan at horde dot org
In ISO-8859-9 (Turkish), the uppercase of "i" is a dotted uppercase "I" and the lowercase of "I" is a dotless "i". But mb_strtolower() and mb_strtoupper() simply return the ASCII uppercase or lowercase counterparts.

You get the correct result with:
setlocale(LC_ALL, 'tr_TR');
echo strtoupper('i');
echo strtolower('I');

But the wrong results with:
echo mb_strtoupper('i', 'iso-8859-9');
echo mb_strtolower('I', 'iso-8859-9');


 [2005-02-22 11:10 UTC]
It turned out this is because mbstring doesn't take the 
locale into consideration.

 [2005-05-13 02:26 UTC] mustafa at deu dot edu dot tr
I get the same results as jan.

I need to produce UTF-8 output for consuming a web service, so I configured PHP 5.0.4 with the --enable-mbstring=all parameter (on a Linux system set to the Turkish locale).

I see that the mbstring extension has limited language support in its source code (German, English, Japanese, Korean, Russian, Chinese).

Is there a way to add our language (Turkish) to the source code? Are there any references on this extension's source?
 [2005-05-13 08:00 UTC]
Turkish locale would need complete overhaul on the 
entire extension because the locale's character 
properties and required case folding behaviour are very 

PHP-ICU extension could support anything, but that's 
just an ongoing work by l0t3k.

 [2005-12-23 14:10 UTC]
I don't know which is the standard mapping (0x49 or 0xdd).
In ISO-8859-9 (Turkish), should the upper case of 'i' (0x69)
always be translated to the dotted 'I' (0xdd)?
If so, please point me to some URLs that describe
the mapping.

 [2005-12-23 14:24 UTC] jan at horde dot org
See and under "Bemerkungen:" (remarks).
 [2005-12-23 14:28 UTC]
"man iso-8859-9" will tell you.

"i" maps to "0xdd"
"0xfd" maps to "I"

See also:
 [2005-12-23 14:56 UTC]
Please try using this CVS snapshot:
For Windows:

Turkish language support is added in CVS HEAD.
When mbstring.language = Turkish,
Turkish case folding will be performed in ISO-8859-9.
(upper:0x69 -> 0xdd, lower:0x49->0xfd)
Otherwise, normal case folding is performed.
(upper:0x69 -> 0x49, lower:0x49->0x69)
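
For anyone who cannot use the snapshot, the Turkish mapping described above can be approximated in userland. The following is only a minimal sketch, not the mbstring implementation: the function names tr_upper_8859_9() and tr_lower_8859_9() are hypothetical, the input is assumed to be already encoded as ISO-8859-9, and only the ASCII letters plus the two Turkish "i" pairs are handled (other accented letters in the charset are left untouched).

```php
<?php
// Hypothetical helpers, not part of mbstring. They work on raw bytes and
// assume the string is already ISO-8859-9 encoded.

function tr_upper_8859_9(string $s): string
{
    // Three-argument strtr() maps byte-for-byte in a single pass, so a
    // byte produced by one mapping is never translated a second time.
    return strtr(
        $s,
        "abcdefghjklmnopqrstuvwxyz" . "i"    . "\xFD", // 0xFD = dotless i
        "ABCDEFGHJKLMNOPQRSTUVWXYZ" . "\xDD" . "I"     // 0xDD = dotted I
    );
}

function tr_lower_8859_9(string $s): string
{
    // Inverse table: 'I' (0x49) -> dotless i (0xFD), dotted I (0xDD) -> 'i'.
    return strtr(
        $s,
        "ABCDEFGHJKLMNOPQRSTUVWXYZ" . "I"    . "\xDD",
        "abcdefghjklmnopqrstuvwxyz" . "\xFD" . "i"
    );
}
```

Wrapping a call in bin2hex(), for example bin2hex(tr_upper_8859_9("i")), makes the resulting byte visible regardless of the terminal's encoding.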

 [2007-01-05 14:31 UTC] jan at horde dot org
Any chance this is going to be backported to PHP 5.2? I guess mbstring is going to be obsolete with the Unicode and ICU support in PHP 6.
 [2007-01-05 14:33 UTC] jan at horde dot org
Oh, and by the way, this conversion should always happen for iso-8859-9, not only if mbstring.language is set to Turkish, because tying it to that setting makes it completely useless in real-world applications.
 [2007-08-17 22:19 UTC]
This change has already been backported to PHP 5.2.
In my understanding, it shouldn't always be applied to ISO-8859-9,
because the conversion result depends on the locale.
(Correct?)

 [2007-08-23 16:33 UTC] jan at horde dot org
No. The conversion has to be done this way for iso-8859-9 always, not only if the current locale is Turkish. Turkish is the only language that uses this charset.
 [2007-08-23 23:03 UTC]
Feedback given.
 [2007-09-04 14:14 UTC]
This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
Thank you for the report, and for helping us make PHP better.

Last updated: Thu Nov 26 15:01:32 2015 UTC