Bug #35583 Calling user defined functions after setlocale("tr_TR") produces errors
Submitted: 2005-12-07 15:21 UTC Modified: 2005-12-08 09:27 UTC
From: sezai dot yilmaz at pro-g dot com dot tr Assigned:
Status: Not a bug Package: Scripting Engine problem
PHP Version: 5.1.1 OS: Debian GNU/Linux 2.6.12
Private report: No CVE-ID: None
 [2005-12-07 15:21 UTC] sezai dot yilmaz at pro-g dot com dot tr
Description:
------------
This bug has duplicates that were closed (for example #18556), but the problem still exists in the latest PHP.

The problem lies in how PHP's scripting engine handles case-insensitive function names, combined with the native behavior of the Turkish language. In Turkish, the lowercase of the letter "I" is the dotless i ("ı"). The PHP engine starts under a different locale, so it lowercases and registers user-defined functions using that locale. But if the user's PHP code calls setlocale(LC_ALL, "tr_TR"), function names are lowercased with the Turkish locale from then on. If a function name contains the letter "I", the lookup never matches the previously registered lowercased name.

For example, getImageDir() is registered as getimagedir(), but once user code sets the Turkish locale it is compared with getımagedir(), and the result is "undefined function". I think this is a developer bug caused by using locale-aware functions to lowercase natively English keywords.
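A minimal sketch of the mismatch described above. It does not call setlocale() or use the engine's real code; it simulates a single-byte Turkish lowercasing (assuming ISO-8859-9, where dotless i is byte 0xFD) so the effect can be seen without a tr_TR locale installed. All function names here are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define NAME_BUF_SIZE 64  /* assumption: names are shorter than this */

/* Locale-independent lowercasing: only 'A'..'Z' are mapped. */
int tolower_ascii(int c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Simulated Turkish lowercasing: 'I' becomes dotless i (0xFD in ISO-8859-9). */
int tolower_tr_simulated(int c)
{
    return (c == 'I') ? 0xFD : tolower_ascii(c);
}

/* Copy src into dest, lowercasing each byte. Returns 0 on success, -1 if src does not fit. */
int lower_copy(char *dest, size_t dest_size, const char *src, int (*lower)(int))
{
    size_t len = strlen(src);
    if (len >= dest_size) {
        return -1;
    }
    for (size_t i = 0; i < len; i++) {
        dest[i] = (char)lower((unsigned char)src[i]);
    }
    dest[len] = '\0';
    return 0;
}

/* Returns 1 if the name lowercased at registration equals the name lowercased at call time. */
int names_match(const char *declared, const char *called,
                int (*register_lower)(int), int (*lookup_lower)(int))
{
    char registered[NAME_BUF_SIZE];
    char lookup[NAME_BUF_SIZE];
    if (lower_copy(registered, sizeof registered, declared, register_lower) != 0 ||
        lower_copy(lookup, sizeof lookup, called, lookup_lower) != 0) {
        return 0;
    }
    return strcmp(registered, lookup) == 0;
}
```

Registering "getImageDir" with tolower_ascii and looking it up with tolower_tr_simulated fails to match, while a name without the letter "I" still matches.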

The problem affects function names, class names, method names, ... all the case-insensitive stuff. And all of them have to be native English.

I made a solution by modifying the source file "php-5.1.1/Zend/zend_operators.c". Instead of using the locale-aware "tolower(int)" libc function in the zend_str_tolower() and zend_str_tolower_copy() functions, I wrote a "tolower_english(int)" equivalent of "tolower()" and called it in zend_str_tolower() / zend_str_tolower_copy() in place of the original "tolower()".
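The idea can be sketched roughly as follows. This is not the content of the linked diff; it is a minimal illustration that assumes the replacement only needs to handle ASCII 'A'..'Z'. The name tolower_english() comes from the description above, while its body and the helper str_tolower_ascii() are hypothetical.

```c
#include <stddef.h>

/* ASCII-only lowercasing: the result does not depend on the current locale. */
int tolower_english(int c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Hypothetical helper: lowercase a buffer of the given length in place. */
void str_tolower_ascii(char *str, size_t length)
{
    for (size_t i = 0; i < length; i++) {
        str[i] = (char)tolower_english((unsigned char)str[i]);
    }
}
```

Bytes outside 'A'..'Z' pass through unchanged, so "getImageDir" always becomes "getimagedir" whatever setlocale() was called with.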

Function names, class names, and method names are not locale-aware stuff, so why use the locale-aware tolower() libc function? Function names, class names, and method names are natively in the English locale, aren't they? I don't understand why developers use locale-aware functions for natively English keywords like function names, class names, method names, ... etc.

I am tired of patching the PHP code with my non-perfect solution.

Reproduce code:
---------------
My solution is not perfect and it may cause unexpected problems, but I have used it without problems so far.

It is better to use strictly English-locale functions instead of locale-aware ones for collating natively English strings.

http://www.pro-g.com.tr/php_turkish_bug/zend_operators.c.diff



 [2005-12-07 17:08 UTC] sniper@php.net
Please do not submit the same bug more than once. An existing
bug report already describes this very problem. Even if you feel
that your issue is somewhat different, the resolution is likely
to be the same. 

Thank you for your interest in PHP.


 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Fri Mar 29 10:01:28 2024 UTC