Bug #42393 mb_strtoupper replaces one of the Cyrillic characters with the wrong one.
Submitted: 2007-08-23 08:06 UTC
Modified: 2007-08-25 05:29 UTC
From: ivan dot delchev at softconsultgroup dot com
Assigned: hirokawa
Status: Not a bug
Package: mbstring related
PHP Version: 5.2.3
OS: Windows XP
Private report: No
CVE-ID: None

 [2007-08-23 08:06 UTC] ivan dot delchev at softconsultgroup dot com
Description:
------------
mb_strtoupper performs a wrong transformation for "?" in the Cyrillic alphabet. The wrong transformation is "?"->"?".

Also, the function does not convert the string to upper case!

Reproduce code:
---------------
// Ensure that the web browser encoding is UTF-8 and that the editor is UTF-8 compatible!
	$main_string = "???? ? ????. ?????? ????. ????? ?? ?? ???????? ? ???? ???? ?? ?????!";
	var_dump($main_string);
	var_dump(mb_strtoupper($main_string));

Expected result:
----------------
Both dumps should show the same text, with the second string converted to upper case.

Actual result:
--------------
string(120) "???? ? ????. ?????? ????. ????? ?? ?? ???????? ? ???? ???? ?? ?????!"
string(120) "???? ? ????. ?????? ????. ????? ?? ?? ???????? ? ???? ???? ?? ?????!"

 [2007-08-23 09:06 UTC] jani@php.net
Assigned to mbstring maintainer.
 [2007-08-23 14:30 UTC] hirokawa@php.net
Please show me mbstring.language setting in php.ini.

 [2007-08-23 14:40 UTC] ivan dot delchev at softconsultgroup dot com
[mbstring]
; language for internal character representation.
;mbstring.language = Japanese

; internal/script encoding.
; Some encoding cannot work as internal encoding.
; (e.g. SJIS, BIG5, ISO-2022-*)
;mbstring.internal_encoding = EUC-JP

; http input encoding.
;mbstring.http_input = auto

; http output encoding. mb_output_handler must be
; registered as output buffer to function
;mbstring.http_output = SJIS

; enable automatic encoding translation according to
; mbstring.internal_encoding setting. Input chars are
; converted to internal encoding by setting this to On.
; Note: Do _not_ use automatic encoding translation for
;       portable libs/applications.
;mbstring.encoding_translation = Off

; automatic encoding detection order.
; auto means
;mbstring.detect_order = auto

; substitute_character used when character cannot be converted
; one from another
;mbstring.substitute_character = none;

; overload(replace) single byte functions by mbstring functions.
; mail(), ereg(), etc are overloaded by mb_send_mail(), mb_ereg(),
; etc. Possible values are 0,1,2,4 or combination of them.
; For example, 7 for overload everything.
; 0: No overload
; 1: Overload mail() function
; 2: Overload str*() functions
; 4: Overload ereg*() functions
;mbstring.func_overload = 0

; enable strict encoding detection.
;mbstring.strict_encoding = Off
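Every mbstring directive in the excerpt above is commented out, so the built-in defaults apply. A minimal sketch for checking which values are actually in effect at runtime follows; the helper name `mbstring_effective_settings` is hypothetical, and the snippet assumes PHP 5.4 or later with the mbstring extension loaded.

```php
<?php
// Hypothetical helper (not part of the original report): collects the
// mbstring values that are in effect for the current request.
function mbstring_effective_settings()
{
    return [
        // Raw php.ini value; may be false or empty if the directive is unset.
        'mbstring.language'      => ini_get('mbstring.language'),
        // Language mbstring is currently using.
        'mb_language()'          => mb_language(),
        // Encoding used when no encoding argument is passed to mb_* functions.
        'mb_internal_encoding()' => mb_internal_encoding(),
    ];
}

foreach (mbstring_effective_settings() as $name => $value) {
    echo $name, ' = ', var_export($value, true), PHP_EOL;
}
```

If the reported internal encoding is not UTF-8, mb_* calls made without an explicit encoding argument will not treat the input as UTF-8.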
 [2007-08-25 05:29 UTC] hirokawa@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

You must specify the character encoding explicitly, because the conversion
table between lowercase and uppercase characters depends on the encoding.

Please try

mb_strtoupper($main_string, "UTF-8")

or set mbstring.internal_encoding = UTF-8 in your php.ini.
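
The two suggestions above can be sketched as follows. This is only an illustration: the sample string is hypothetical (the Cyrillic text of the original report did not survive), the source file is assumed to be saved as UTF-8, and mb_internal_encoding() is used as a runtime stand-in for the php.ini setting.

```php
<?php
// Hypothetical sample text, assumed to be stored as UTF-8 in this file.
$sample = "привет, мир!";

// Option 1: pass the encoding explicitly on every call.
$upperExplicit = mb_strtoupper($sample, "UTF-8");

// Option 2: set the internal encoding once, then rely on the default.
// This is the runtime counterpart of mbstring.internal_encoding in php.ini.
mb_internal_encoding("UTF-8");
$upperDefault = mb_strtoupper($sample);

var_dump($upperExplicit);
var_dump($upperExplicit === $upperDefault);
```

Passing the encoding on each call keeps the code independent of server configuration, which matters for portable libraries. Setting it once is more convenient inside a single application.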



 
Last updated: Sat Dec 21 17:01:58 2024 UTC