Bug #14655 ucwords failing randomly with international characters
Submitted: 2001-12-21 22:32 UTC Modified: 2002-12-25 02:30 UTC
Votes:4
Avg. Score:4.2 ± 0.8
Reproduced:3 of 4 (75.0%)
Same Version:0 (0.0%)
Same OS:1 (33.3%)
From: xtango at netcombbs dot com dot ar Assigned: hholzgra (profile)
Status: Closed Package: Strings related
PHP Version: 4.1.0 OS: Windows XP Pro (spanish)
Private report: No CVE-ID: None
 [2001-12-21 22:32 UTC] xtango at netcombbs dot com dot ar
When using the ucwords() function with strings containing international characters, the function will return incorrect values. The return value will be different every time the function is called.

I've seen bogus bug reports for this function, so there is a slight chance that this is yet another one, but look at this:

echo setlocale(LC_ALL, "0");
Output:
LC_COLLATE=C;LC_CTYPE=Spanish_Argentina.1252;LC_MONETARY=C;LC_NUMERIC=C;LC_TIME=C

---
echo ucwords('????????????????');
Output:
????????????????
Output again (refresh button on browser):
????????????????
Output again:
????????????????

etc. etc.

Now, if I do setlocale(LC_ALL, "spanish"), the result is still the same.

I've noticed that the problem is not the international character itself, but the character after it. That is, ucwords("ca?as") will return "Ca?as" or "Ca?As", randomly.

Scenario:
Windows XP Pro (spanish)
Apache 1.3.22
PHP 4.1.0

 [2001-12-23 15:51 UTC] venaas@php.net
To me this sounds like a system problem. PHP only uses the
system's notion of uppercase and whitespace. When the character
following the international character is uppercased, it is probably
because the international character is treated as a
space by the system (isspace() returning true). It's quite
weird that the results are not consistent. The PHP function
itself is quite straightforward, so I suspect that either the locale is
not the same every time, or the system's isspace() and
possibly toupper() are broken. I can't say for sure though.
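
For readers following along, the logic described above can be sketched in PHP roughly as follows. This is a hypothetical re-implementation for illustration only (the name ucwords_sketch is made up, and this is not PHP's actual C source); it assumes the ctype extension is available. The point is that if ctype_space() wrongly reports an international byte as whitespace, the byte after it gets uppercased.

```php
<?php
// Hypothetical sketch of a ucwords()-style loop; not the real implementation.
function ucwords_sketch(string $s): string
{
    $prevIsSpace = true; // the first byte always starts a word
    for ($i = 0, $n = strlen($s); $i < $n; $i++) {
        if ($prevIsSpace) {
            // Uppercase the byte that follows whitespace (or starts the string).
            $s[$i] = strtoupper($s[$i]);
        }
        // Locale-dependent check: a misclassified byte here would make the
        // next byte look like the start of a new word.
        $prevIsSpace = ctype_space($s[$i]);
    }
    return $s;
}

echo ucwords_sketch("hello world"), "\n"; // prints "Hello World"
```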


 [2002-01-22 08:03 UTC] martin at humany dot com
Some additional comments:

I can't reproduce this on Linux (meaning the returned data doesn't change). With PHP 4.1.1 on Windows XP the problem exists, but only when I change the locale. My default locale is:

LC_COLLATE=C;LC_CTYPE=English_United States.1252;LC_MONETARY=C;LC_NUMERIC=C;LC_TIME=C

and the output here is:
????????????????

With Linux:
????????????????


But if I change the locale with:
setlocale(LC_ALL, "english");
(Locale: English_United States.1252)

the returned data differs on each function call.
 [2002-04-21 06:24 UTC] bs_php at infeer dot com
Confirming your bug (PHP 4.1.2 on W2k).
setlocale() on Windows is the problem!
See bug #16718 
http://bugs.php.net/bug.php?id=16718
 [2002-06-17 07:02 UTC] hholzgra@php.net
setlocale() doesn't work well with multithreaded servers,
as the current locale is set globally for all threads.
 [2002-07-13 18:10 UTC] derick@php.net
Changing status.
 [2002-08-27 07:50 UTC] fille at fukt dot bth dot se
I had this bug too. To get around it, I added some 'hair' to the code.

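$string = ucwords($string);
// Bugfix from here on: lowercase every character that is not an
// uppercase letter at the start of the string or right after a space.
for ($i = 0; $i < strlen($string); $i++) {
	// Test $i == 0 before reading $string[$i - 1] so offset -1 is never used.
	if (!(ctype_upper($string[$i]) && ($i == 0 || $string[$i - 1] == " "))) {
		$string[$i] = strtolower($string[$i]);
	}
}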

I'm a PHP newbie, so I guess some of you can come up with a brighter solution. This code lowercases _all_ letters that are not either the first in the string or following a space. If that behavior does not suit your application, you might not want this code.
 [2002-12-21 09:49 UTC] moriyoshi@php.net
You can use mb_convert_case() instead of ucwords.
Could I close this bug?
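
A minimal sketch of that suggestion, assuming the mbstring extension is loaded and that both the script file and the input string are UTF-8 (the sample string is made up for illustration):

```php
<?php
// Assumption: mbstring is enabled and this file is saved as UTF-8.
$name = "cañas de azúcar";

// MB_CASE_TITLE uppercases the first letter of each word using Unicode
// rules, so the result does not depend on the setlocale() state.
$title = mb_convert_case($name, MB_CASE_TITLE, "UTF-8");

echo $title, "\n";
```

Passing the encoding explicitly avoids relying on the mbstring internal encoding setting.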

 [2002-12-23 10:15 UTC] xtango at netcombbs dot com dot ar
Somebody said that this is a problem with setlocale() (on Windows) and not with ucwords(). This issue is being debated in 14655 and 16718. Also, mb_convert_case() seems to be safe.
I guess this bug can be closed.
 [2002-12-25 02:30 UTC] moriyoshi@php.net
Closing...
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Sun Nov 24 02:01:28 2024 UTC