Bug #32958 strtoupper/strtolower do not convert some 8859-2 characters
Submitted: 2005-05-05 17:34 UTC Modified: 2005-05-07 01:08 UTC
Votes:1
Avg. Score:4.0 ± 0.0
Reproduced:0 of 0 (0.0%)
From: kirchner at cm-sec dot cz Assigned:
Status: Not a bug Package: Strings related
PHP Version: 4.3.10 OS: Linux Debian
Private report: No CVE-ID: None

 [2005-05-05 17:34 UTC] kirchner at cm-sec dot cz
Description:
------------
The functions "strtolower" and "strtoupper" do not convert some Czech characters in the ISO 8859-2 charset. I tested this and found that they do not convert the characters "š"/"Š", "ť"/"Ť" and "ž"/"Ž", while the other special Czech characters are converted correctly.
Remark:
We most often use two charsets in our country: ISO 8859-2 for Unix-like systems and CP1250 (windows-1250) for Windows systems. The special Czech characters have the same hexadecimal value in both charsets, except for the characters mentioned above (š, Š, ť, Ť, ž, Ž).
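
The byte values behind this remark can be tabulated. The following is a minimal C sketch, not part of the original report: the struct, array, and function names are hypothetical, and the byte values are the standard ISO 8859-2 and CP1250 code points for the three letter pairs s, t and z with caron.

```c
#include <stddef.h>
#include <stdio.h>

/* One Czech letter pair with its byte value in each charset. */
struct cz_pair {
    const char *name;         /* description, for display only */
    unsigned char iso_lower;  /* ISO 8859-2 lowercase byte */
    unsigned char iso_upper;  /* ISO 8859-2 uppercase byte */
    unsigned char cp_lower;   /* CP1250 lowercase byte */
    unsigned char cp_upper;   /* CP1250 uppercase byte */
};

/* The three pairs whose encoding differs between the two charsets. */
const struct cz_pair cz_pairs[] = {
    { "s with caron", 0xb9, 0xa9, 0x9a, 0x8a },
    { "t with caron", 0xbb, 0xab, 0x9d, 0x8d },
    { "z with caron", 0xbe, 0xae, 0x9e, 0x8e },
};

const size_t cz_pair_count = sizeof cz_pairs / sizeof cz_pairs[0];

/* Return 1 if both bytes of the pair differ between the charsets. */
int differs_between_charsets(const struct cz_pair *p)
{
    return p->iso_lower != p->cp_lower && p->iso_upper != p->cp_upper;
}

/* Print one row per pair: name, then the four byte values in hex. */
void print_cz_pairs(void)
{
    for (size_t i = 0; i < cz_pair_count; i++) {
        const struct cz_pair *p = &cz_pairs[i];
        printf("%-14s 8859-2: %02x/%02x  CP1250: %02x/%02x\n",
               p->name, p->iso_lower, p->iso_upper,
               p->cp_lower, p->cp_upper);
    }
}
```

In both charsets the uppercase byte of each pair sits 0x10 below the lowercase byte, so a correctly loaded locale table for either charset should map one to the other.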

I think that my locale is set correctly. Here are three rows from the output of the phpinfo() function.

default_charset	ISO-8859-2	ISO-8859-2
_ENV["LC_ALL"]	cs_CZ
_ENV["LANG"]	cs_CZ

Best regards


Jan Kirchner


 [2005-05-05 20:21 UTC] rasmus@php.net
Sounds like a locale bug to me.  Could you compile this test program?

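#include <ctype.h>
#include <locale.h>
#include <stdio.h>

/* Uppercase len bytes of s in place using the current locale's toupper(). */
char *php_strtoupper(char *s, size_t len) {
    unsigned char *c, *e;

    c = (unsigned char *)s;
    e = c + len;

    while (c < e) {
        *c = toupper(*c);
        c++;
    }
    return s;
}

int main(void) {
    char str[2];
    str[0] = (char)0xb9;   /* byte under test */
    str[1] = 0x0;
    setlocale(LC_ALL, "cs_CZ");
    puts(php_strtoupper(str, 1));
    return 0;
}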

Just save it as s.c, for example, and type: make s
Then run it like this: ./s | od -x
And add the output here.
As far as I can tell 0xb9 gets uppercased to 0xa9 in 8859-2, so you should be seeing an a9 there in the output if your system's locale is working correctly.
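
One thing the test program above does not check is whether setlocale() actually accepted the locale name; setlocale() returns NULL when it cannot honor the request, and toupper() then keeps using the previous locale. A minimal sketch of that check follows. The helper name upper_in_locale is hypothetical, and the exact locale name to pass (for example "cs_CZ" or "cs_CZ.ISO-8859-2") is an assumption that depends on which locales are installed on the system.

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/*
 * Hypothetical helper: select `locale` for character classification,
 * then uppercase a single byte with toupper().
 * Returns 1 and stores the result in *out on success,
 * or 0 if setlocale() rejected the locale name.
 * The previous locale is not restored.
 */
int upper_in_locale(const char *locale, unsigned char c, unsigned char *out)
{
    if (setlocale(LC_CTYPE, locale) == NULL) {
        return 0;   /* locale not available; toupper() would be misleading */
    }
    *out = (unsigned char)toupper(c);
    return 1;
}
```

Calling it with the "C" locale should leave 0xb9 unchanged, since that locale only knows the ASCII letters; a return value of 0 for the Czech locale name would point to a missing locale rather than to a problem in the conversion itself.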
 [2005-05-06 08:36 UTC] kirchner at cm-sec dot cz
Hello,
you may be right. I ran your test program and the output is:

0000000 0aa9
0000002

When I convert 0xaa9 to decimal (2729) and type <Alt>+<2729> in the Linux console, I get the same character as when I type <Alt>+<169>, where 169 is the decimal value of 0xa9. Both outputs are correct.

When I change the assignment in the main function of your program to "str[0] = 0x61;" (the letter "a"), I get

0000000 0a41
0000002

Jan Kirchner
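
A note on reading the dumps above: od -x prints the input as 16-bit words in host byte order, and puts() appends a newline (0x0a). On a little-endian machine the two output bytes a9 0a are therefore shown as the single word 0aa9. The following C sketch reproduces that grouping; the function name le_words_hex is hypothetical and the code only imitates the word formatting, not od's offsets.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format `len` bytes as space-separated 16-bit hex words, the way
 * `od -x` groups them on a little-endian host: within each pair the
 * second byte is printed first. An odd trailing byte is padded with 00.
 * Writes a NUL-terminated string into `out` (capacity `outsz`).
 */
char *le_words_hex(const unsigned char *buf, size_t len,
                   char *out, size_t outsz)
{
    size_t pos = 0;

    if (outsz == 0) {
        return out;
    }
    out[0] = '\0';

    for (size_t i = 0; i < len; i += 2) {
        unsigned lo = buf[i];                           /* first byte in the stream */
        unsigned hi = (i + 1 < len) ? buf[i + 1] : 0;   /* second byte, or padding */
        int n = snprintf(out + pos, outsz - pos, "%s%02x%02x",
                         i ? " " : "", hi, lo);
        if (n < 0 || (size_t)n >= outsz - pos) {
            break;                                      /* output buffer too small */
        }
        pos += (size_t)n;
    }
    return out;
}
```

Fed the bytes a9 0a it yields "0aa9", and fed 41 0a it yields "0a41", matching both dumps: the first byte written was a9 in one case and 41 (uppercase "A") in the other.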
 [2005-05-06 23:07 UTC] sniper@php.net
Works fine for me.

 [2005-05-07 01:08 UTC] rasmus@php.net
So, the C program correctly converted b9 to a9.  Now try the same thing with PHP:

php -r '$str = chr(0xb9);setlocale(LC_ALL,"cs_CZ");echo strtoupper($str);' | od -x

If that spits out a9 then PHP is working just fine and you are doing something wrong somewhere else.
 