PHP :: Bug #19795 :: Problems with strnatcmp and strnatcasecmp

Bug #19795

Problems with strnatcmp and strnatcasecmp

Submitted:

2002-10-07 03:51 UTC

Modified:

2003-04-16 16:11 UTC

Votes:	1
Avg. Score:	5.0 ± 0.0
Reproduced:	1 of 1 (100.0%)
Same Version:	0 (0.0%)
Same OS:	1 (100.0%)

From:

doc at nitramlexa dot com

Assigned:

Status:

Closed

Package:

Strings related

PHP Version:

4.2.2

OS:

FreeBSD 4.6

Private report:

CVE-ID:

None

[2002-10-07 03:51 UTC] doc at nitramlexa dot com

Characters with a byte value above 127 are considered lower than all other characters. This is unfortunate if you use a language with special characters (like the three Danish letters after z in the list below).

  $list = array('a', 1, '2', '12', '1', 'z', 'æ', 'ø', 'å', chr(137), chr(128));
  usort($list, 'strnatcmp');
  var_dump($list);

The values above 127 are sorted correctly relative to each other; they should just be considered higher than A-z.

I believe I have seen this bug on Solaris as well, but that was a while ago and I cannot provide any detailed information about that situation.

[2002-10-07 03:58 UTC] derick@php.net

Did you set up your locales correctly?

Derick

[2002-10-07 06:00 UTC] doc at nitramlexa dot com

I am quite sure I did, yes. I set the locale for LC_ALL, and strftime is working as it should.

If I understand the documentation correctly, this function should work like strcmp, except that numbers are sorted in natural order. And strcmp does sort as I expect.

usort using strcmp:

array(11) {
  [0]=>
  string(1) "1"
  [1]=>
  int(1)
  [2]=>
  string(2) "12"
  [3]=>
  string(1) "2"
  [4]=>
  string(1) "a"
  [5]=>
  string(1) "z"
  [6]=>
  string(1) "?"
  [7]=>
  string(1) "?"
  [8]=>
  string(1) "?"
  [9]=>
  string(1) "?"
  [10]=>
  string(1) "?"
}



using strnatcmp:

array(11) {
  [0]=>
  string(1) "?"
  [1]=>
  string(1) "?"
  [2]=>
  string(1) "?"
  [3]=>
  string(1) "?"
  [4]=>
  string(1) "?"
  [5]=>
  string(1) "1"
  [6]=>
  int(1)
  [7]=>
  string(1) "2"
  [8]=>
  string(2) "12"
  [9]=>
  string(1) "a"
  [10]=>
  string(1) "z"
}

[2003-01-27 16:11 UTC] kamikaze at yifan dot net

I have the same problem with ???. strtoupper(), for example, does not uppercase those letters.

Also in 4.2.3.

[2003-04-16 16:11 UTC] moriyoshi@php.net

This bug has been fixed in CVS.

In case this was a PHP problem, snapshots of the sources are packaged
every three hours; this change will be in the next snapshot. You can
grab the snapshot at http://snaps.php.net/.
 
In case this was a documentation problem, the fix will show up soon at
http://www.php.net/manual/.

In case this was a PHP.net website problem, the change will show
up on the PHP.net site and on the mirror sites in short time.
 
Thank you for the report, and for helping us make PHP better.

[2004-06-28 14:55 UTC] mikael at chl dot chalmers dot se

This bug seems to have come back in version 4.3.7.

When using setlocale(LC_ALL, 'sv_SE'), the national characters åäö get sorted before other international characters; they should appear at the bottom.

[2004-07-16 04:32 UTC] mbp at sourcefrog dot net

This bug does seem to still be present in php5 CVS.  The comparison is simply by byte values, not taking character set or locale into account.

[2004-07-29 10:42 UTC] larry at kamsha dot ru

I have the PHP 5.0.0 release with the same bug (with Cyrillic characters). I analyzed the sources and found that the comparison is made on "char" values, so all extended characters (those with the high-order bit set) are treated as negative.
Changing "char" to "unsigned char" would place national characters after English characters (greater in terms of comparison), which is correct, but the locale-specific collation problem would remain. This does not matter for the CP1251 or CP866 Cyrillic encodings, because their characters are already ordered correctly by value. But there would be a problem for the KOI8-R encoding, for which this is not the case.
The problem can easily be solved by using "strcoll" (in place of the plain value comparison), I guess.

[2012-02-09 07:32 UTC] redrat at mail dot ru

This bug is still present in PHP 5.3.10 for all Cyrillic letters (and, I think, for other non-ASCII letters too). This bug report was filed almost 10 years ago! Could somebody do something about it?


Copyright © 2001-2025 The PHP Group All rights reserved.	Last updated: Fri Jul 18 23:00:02 2025 UTC