Bug #19795 Problems with strnatcmp and strnatcasecmp
Submitted: 2002-10-07 03:51 UTC
Modified: 2003-04-16 16:11 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:1 (100.0%)
From: doc at nitramlexa dot com
Assigned:
Status: Closed
Package: Strings related
PHP Version: 4.2.2
OS: FreeBSD 4.6
Private report: No
CVE-ID: None
 [2002-10-07 03:51 UTC] doc at nitramlexa dot com
Characters with an ASCII value above 127 are considered lower than all other characters. This is unfortunate if you use a language with special characters (like the three Danish letters after z, æ, ø and å, in the list below).

  $list = array('a', 1, '2', '12', '1', 'z', 'æ', 'ø', 'å', chr(137), chr(128));
  usort($list, 'strnatcmp'); 
  var_dump($list);

The values above 127 are sorted correctly among themselves; they should just be considered higher than A-z.

I believe I have seen this bug on Solaris as well, but that was a while ago and I cannot provide any detailed information about that situation.

 [2002-10-07 03:58 UTC] derick@php.net
Did you set up your locales correctly?

Derick
 [2002-10-07 06:00 UTC] doc at nitramlexa dot com
I am quite sure I did, yes. I set the locale for LC_ALL, and strftime is working as it should.

If I understand the documentation correctly, this function should work as strcmp except for the fact that numbers are sorted in a natural order. And strcmp does sort as I expect.

usort using strcmp:

array(11) {
  [0]=>
  string(1) "1"
  [1]=>
  int(1)
  [2]=>
  string(2) "12"
  [3]=>
  string(1) "2"
  [4]=>
  string(1) "a"
  [5]=>
  string(1) "z"
  [6]=>
  string(1) "?"
  [7]=>
  string(1) "?"
  [8]=>
  string(1) "?"
  [9]=>
  string(1) "?"
  [10]=>
  string(1) "?"
}



using strnatcmp:

array(11) {
  [0]=>
  string(1) "?"
  [1]=>
  string(1) "?"
  [2]=>
  string(1) "?"
  [3]=>
  string(1) "?"
  [4]=>
  string(1) "?"
  [5]=>
  string(1) "1"
  [6]=>
  int(1)
  [7]=>
  string(1) "2"
  [8]=>
  string(2) "12"
  [9]=>
  string(1) "a"
  [10]=>
  string(1) "z"
}
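
For reference, the natural-order behaviour described above can be seen with ASCII-only input, which this bug does not affect. This is a minimal sketch; the file names are made up for illustration.

```php
<?php
// Illustrative data only: ASCII strings with embedded numbers.
$files = array('img12', 'img10', 'img2', 'img1');

$plain = $files;
usort($plain, 'strcmp');      // byte-by-byte: "img10" sorts before "img2"

$natural = $files;
usort($natural, 'strnatcmp'); // runs of digits are compared as numbers

echo implode(' ', $plain), "\n";   // img1 img10 img12 img2
echo implode(' ', $natural), "\n"; // img1 img2 img10 img12
```

The two functions differ only in how they treat the digit runs here, which is why the reporter expects bytes above 127 to land in the same place under both.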
 [2003-01-27 16:11 UTC] kamikaze at yifan dot net
I have the same problem with ???. For example, strtoupper() does not uppercase those letters.

Also in 4.2.3.
 [2003-04-16 16:11 UTC] moriyoshi@php.net
This bug has been fixed in CVS.

In case this was a PHP problem, snapshots of the sources are packaged
every three hours; this change will be in the next snapshot. You can
grab the snapshot at http://snaps.php.net/.
 
In case this was a documentation problem, the fix will show up soon at
http://www.php.net/manual/.

In case this was a PHP.net website problem, the change will show
up on the PHP.net site and on the mirror sites in short time.
 
Thank you for the report, and for helping us make PHP better.


 [2004-06-28 14:55 UTC] mikael at chl dot chalmers dot se
This bug seems to have popped back up in version 4.3.7.

When using setlocale(LC_ALL, 'sv_SE'), the national characters åäö are sorted before other international characters; they should appear at the bottom.
 [2004-07-16 04:32 UTC] mbp at sourcefrog dot net
This bug still seems to be present in PHP 5 CVS. The comparison is done purely on byte values and does not take character set or locale into account.
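
For contrast, PHP does offer a locale-aware comparison through strcoll(). The sketch below compares it with plain byte ordering for single-byte (ISO-8859-1) input. The locale names are assumptions and may not be installed on a given system, and strcoll() does not provide natural ordering of numbers, so this is not a drop-in replacement for strnatcmp.

```php
<?php
// Candidate locale names are assumptions; availability varies by system.
$locale = setlocale(LC_COLLATE, 'da_DK.ISO8859-1', 'da_DK.ISO-8859-1', 'da_DK');
if ($locale === false) {
    // No candidate locale is installed: strcoll() then follows the
    // current locale (often "C") and orders much like strcmp().
    echo "Danish locale not available; using the current locale\n";
}

// z, ae-ligature, a, o-slash, a-ring as ISO-8859-1 bytes.
$list = array('z', "\xE6", 'a', "\xF8", "\xE5");

$byBytes = $list;
usort($byBytes, 'strcmp');   // plain byte order

$byLocale = $list;
usort($byLocale, 'strcoll'); // order defined by LC_COLLATE

// Print as hex so the output does not depend on the terminal encoding.
echo implode(' ', array_map('bin2hex', $byBytes)), "\n"; // 61 7a e5 e6 f8
echo implode(' ', array_map('bin2hex', $byLocale)), "\n";
```

The second line of output depends entirely on the system's locale data, so no expected result is given for it.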
 [2004-07-29 10:42 UTC] larry at kamsha dot ru
I have the PHP 5.0.0 release with the same bug (with Cyrillic characters). I've analyzed the sources and found that the comparison is made on "char" values, so all extended characters (those with the high-order bit set) are treated as negative.
Changing "char" to "unsigned char" would place national characters after English characters (greater in terms of comparison), which is correct, but the locale-specific collation problem would remain. This is not important for the CP1251 or CP866 Cyrillic encodings, because in these the characters are already in the correct order by value. But there will be a problem for the KOI8-R encoding, for which this is not the case.
I guess the problem can easily be solved by using "strcoll" in place of the naive value comparison.
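
The signed versus unsigned effect described in this comment can be imitated in PHP. This is only a sketch of the idea and not PHP's actual strnatcmp code: signed_byte() and unsigned_byte() are hypothetical helpers that mimic how C reads a byte as "char" (signed on many platforms) or as "unsigned char". It assumes PHP 7 or later for the <=> operator.

```php
<?php
// Hypothetical helper: the byte value as a signed 8-bit integer (-128..127).
function signed_byte($c)
{
    $v = ord($c[0]);                  // 0..255
    return $v >= 128 ? $v - 256 : $v; // high bit set => negative
}

// Hypothetical helper: the byte value as an unsigned integer (0..255).
function unsigned_byte($c)
{
    return ord($c[0]);
}

$bytes = array('z', "\xE6", 'a', "\x80", '1');

$asSigned = $bytes;
usort($asSigned, function ($a, $b) {
    return signed_byte($a) <=> signed_byte($b);
});

$asUnsigned = $bytes;
usort($asUnsigned, function ($a, $b) {
    return unsigned_byte($a) <=> unsigned_byte($b);
});

// Print as hex so the output does not depend on the terminal encoding.
echo implode(' ', array_map('bin2hex', $asSigned)), "\n";   // 80 e6 31 61 7a
echo implode(' ', array_map('bin2hex', $asUnsigned)), "\n"; // 31 61 7a 80 e6
```

With the signed reading, the bytes 0x80 and 0xE6 sort before the digits and letters, which matches the ordering reported at the top of this thread. With the unsigned reading they sort after "z".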
 [2012-02-09 07:32 UTC] redrat at mail dot ru
This bug is still present in PHP 5.3.10 for all Cyrillic letters (and, I think, for other non-ASCII letters too). This bug report was filed almost 10 years ago! Could anybody do something about it?
 