Bug #42393 mb_strtoupper is replacing one of the cyrillic symbols with wrong one.
Submitted: 2007-08-23 08:06 UTC Modified: 2007-08-25 05:29 UTC
From: ivan dot delchev at softconsultgroup dot com Assigned: hirokawa (profile)
Status: Not a bug Package: mbstring related
PHP Version: 5.2.3 OS: Windows XP
Private report: No CVE-ID: None
 [2007-08-23 08:06 UTC] ivan dot delchev at softconsultgroup dot com
Description:
------------
mb_strtoupper performs the wrong transformation for "?" in the Cyrillic alphabet. The wrong transformation is "?"->"?".

Also, the function does not convert the string to upper case!

Reproduce code:
---------------
// Ensure that the web browser encoding is UTF-8 and that your editor saves the file as UTF-8!
	$main_string = "???? ? ????. ?????? ????. ????? ?? ?? ???????? ? ???? ???? ?? ?????!";
	var_dump($main_string);
	var_dump(mb_strtoupper($main_string));

Expected result:
----------------
Both dumps should show the same text, with the second string in upper case!

Actual result:
--------------
string(120) "???? ? ????. ?????? ????. ????? ?? ?? ???????? ? ???? ???? ?? ?????!"
string(120) "???? ? ????. ?????? ????. ????? ?? ?? ???????? ? ???? ???? ?? ?????!"


 [2007-08-23 09:06 UTC] jani@php.net
Assigned to mbstring maintainer.
 [2007-08-23 14:30 UTC] hirokawa@php.net
Please show me mbstring.language setting in php.ini.

 [2007-08-23 14:40 UTC] ivan dot delchev at softconsultgroup dot com
[mbstring]
; language for internal character representation.
;mbstring.language = Japanese

; internal/script encoding.
; Some encoding cannot work as internal encoding.
; (e.g. SJIS, BIG5, ISO-2022-*)
;mbstring.internal_encoding = EUC-JP

; http input encoding.
;mbstring.http_input = auto

; http output encoding. mb_output_handler must be
; registered as output buffer to function
;mbstring.http_output = SJIS

; enable automatic encoding translation according to
; mbstring.internal_encoding setting. Input chars are
; converted to internal encoding by setting this to On.
; Note: Do _not_ use automatic encoding translation for
;       portable libs/applications.
;mbstring.encoding_translation = Off

; automatic encoding detection order.
; auto means
;mbstring.detect_order = auto

; substitute_character used when character cannot be converted
; one from another
;mbstring.substitute_character = none;

; overload(replace) single byte functions by mbstring functions.
; mail(), ereg(), etc are overloaded by mb_send_mail(), mb_ereg(),
; etc. Possible values are 0,1,2,4 or combination of them.
; For example, 7 for overload everything.
; 0: No overload
; 1: Overload mail() function
; 2: Overload str*() functions
; 4: Overload ereg*() functions
;mbstring.func_overload = 0

; enable strict encoding detection.
;mbstring.strict_encoding = Off
 [2007-08-25 05:29 UTC] hirokawa@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

You must specify the character encoding explicitly, because the conversion
table between lowercase and uppercase characters depends on the encoding.

Please try,

mb_strtoupper($main_string,"UTF-8")
or set mbstring.internal_encoding = UTF-8 in your php.ini.
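
The two suggestions above can be sketched roughly as follows. This is a minimal illustration, not the reporter's original script: the sample text and the helper name upper_utf8() are assumptions made for the example (the string in the report did not survive), and the file must itself be saved as UTF-8.

```php
<?php
// Hypothetical helper (not from the report): upper-case a string that is
// known to be UTF-8, passing the encoding explicitly so the result does
// not depend on the mbstring settings in php.ini.
function upper_utf8($text)
{
    return mb_strtoupper($text, "UTF-8");
}

// Assumed sample text; any UTF-8 Cyrillic string would do.
$main_string = "привет, мир";

var_dump($main_string);
var_dump(upper_utf8($main_string));

// Alternative: set the internal encoding once at runtime, after which
// mb_strtoupper() without a second argument uses UTF-8 as well.
mb_internal_encoding("UTF-8");
var_dump(mb_strtoupper($main_string));
```

Passing the encoding on each call keeps the code independent of server configuration, while mb_internal_encoding() (or the php.ini setting) is more convenient when the whole application works in a single encoding.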



 