Bug #22003 XML parsing and strtoupper broken in Turkish
Submitted: 2003-02-01 18:23 UTC Modified: 2003-03-02 13:24 UTC
From: spud at nothingness dot org Assigned:
Status: Not a bug Package: XML related
PHP Version: 4.3.0 OS: Linux Redhat 7.2
Private report: No CVE-ID: None
 [2003-02-01 18:23 UTC] spud at nothingness dot org
I was trying to do some XML-RPC stuff in PHP while my locale was set to 'tr_TR.utf8'. I (and others) have reported other bugs related to Turkish and PHP because of the odd Turkish relationship with the letter "i".

While issues related to the lower-casing of object classes have been resolved in 4.3.0, I encountered similar problems with XML parsing.

Specifically, when the locale is set to "tr_TR.utf8" and an XML parser is created with CASE_FOLDING enabled, "INT" tags and "STRING" tags are labeled as "iNT" and "STRiNG". Consequently, functions that recognize tag names by their all-uppercase form fail to recognize these tags.
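One possible way to sidestep the behaviour described above is to turn case folding off, so the parser hands tag names to the handlers exactly as they appear in the document and no locale-dependent upper-casing takes place. The following is only a sketch, written for current PHP versions with the xml extension loaded; the `$names` array and the `$isString` check are illustrative names, not part of the original report.

```php
<?php
// Sketch: collect element names with case folding turned off, so the
// parser never upper-cases them and the locale cannot affect the result.
$names = [];

$parser = xml_parser_create('UTF-8');
// XML_OPTION_CASE_FOLDING is enabled by default; switch it off.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$names) {
        $names[] = $name; // name arrives exactly as written in the document
    },
    function ($parser, $name) {
        // end tags are not needed for this illustration
    }
);

xml_parse($parser, '<string>foo</string>', true);

// Compare against the spelling used in the document itself.
$isString = ($names[0] === 'string');
```

With folding disabled, code that matches tag names has to use the document's own spelling (here lowercase "string") instead of relying on an upper-cased form.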

The problem is also evident in PHP's built-in strtoupper function. The following code demonstrates both problems:

$chk = setlocale(LC_ALL,'tr_TR.utf8');
if ($chk) echo ("Setting language to Turkish<br>\n");

$x = "<string>foo</string>";
echo ("x is ".htmlspecialchars($x,ENT_COMPAT,'utf-8')."<br>\n");
$y = strtoupper($x);
echo ("Strtoupper yields ".htmlspecialchars($y,ENT_COMPAT,'utf-8')."<br>\n");

function startElement($parser, $name, $attrs) {
    print "Start tag name: $name<br>\n";
}

function endElement($parser, $name) {
    print "End tag name: $name<br>\n";
}

function charData($parser, $data) {
    print "Character Data: $data<br>\n";
}

$parser = xml_parser_create('utf-8');
xml_set_element_handler($parser, "startElement", "endElement");
xml_set_character_data_handler($parser, "charData"); 
echo ("Parsing with utf-8 parser, case_folding enabled<br>\n");
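For the strtoupper() half of the report, a workaround that does not depend on the locale is to upper-case only the 26 ASCII letters through a fixed translation table. This is a minimal sketch, assuming a current PHP version (it uses scalar type declarations); `ascii_strtoupper` is a hypothetical helper name, not a PHP built-in.

```php
<?php
// Hypothetical helper: upper-case only the 26 ASCII letters via a fixed
// byte-for-byte table, leaving every other byte (including the bytes of
// multi-byte UTF-8 sequences) untouched. No locale lookup is involved.
function ascii_strtoupper(string $s): string
{
    return strtr(
        $s,
        'abcdefghijklmnopqrstuvwxyz',
        'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    );
}

$y = ascii_strtoupper('<string>foo</string>');
echo "ascii_strtoupper yields " . htmlspecialchars($y, ENT_COMPAT, 'UTF-8') . "<br>\n";
```

Because strtr() with two equal-length strings maps single bytes, and none of the bytes a-z occur inside a multi-byte UTF-8 sequence, the helper is safe on UTF-8 input but deliberately leaves non-ASCII letters unchanged.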


 [2003-03-02 13:24 UTC]
According to the Unicode specification, case folding behaviour varies with the locale settings, so this is not a bug.

See for detail.

Also related to the following patch on i18n part of glibc source:
