Bug #44929 [PATCH] natsort and natcasesort fail if numbers in strings prepended by 0
Submitted: 2008-05-06 14:32 UTC Modified: 2009-05-02 20:29 UTC
Votes:2
Avg. Score:3.5 ± 0.5
Reproduced:2 of 2 (100.0%)
Same Version:1 (50.0%)
Same OS:2 (100.0%)
From: kae at verens dot com
Assigned: rasmus
Status: Closed
Package: Arrays related
PHP Version: 5.*, 6CVS (2009-05-02)
OS: *
Private report: No
CVE-ID: None
 [2008-05-06 14:32 UTC] kae at verens dot com
Description:
------------
natsort, which sorts arrays using a "natural order" algorithm, does not handle numbers that begin with '0' correctly.

Reproduce code:
---------------
<?php
$arr= array('test012','test01','test02');
natsort($arr);
var_dump($arr);


Expected result:
----------------
array
  1 => string 'test01' (length=6)
  2 => string 'test02' (length=6)
  0 => string 'test012' (length=7)


Actual result:
--------------
array
  1 => string 'test01' (length=6)
  0 => string 'test012' (length=7)
  2 => string 'test02' (length=6)


 [2008-05-07 15:38 UTC] kae at verens dot com
I'm sorry - I think I misunderstand something here.

I have just tried with the example values in 
http://sourcefrog.net/projects/natsort/example-out.txt

Expected result: the output array is in the same order as the input array.
Actual result: the output order differs wherever a value contains a number with a leading '0'.

Sample code:
<?php
$arr=array(
  '1-2', '1-02', '1-20', '10-20',
  'fred', 'jane', 'pic01', 'pic2',
  'pic02', 'pic02a', 'pic3', 'pic4',
  'pic 4 else', 'pic 5', 'pic05', 'pic 5 ',
  'pic 5 something', 'pic 6', 'pic   7', 'pic100',
  'pic100a', 'pic120', 'pic121', 'pic02000',
  'tom', 'x2-g8', 'x2-y7', 'x2-y08',
  'x8-y8'
);
natsort($arr);
var_dump($arr);

Note that the values in that array are /already sorted/ according to the reference natsort output linked above. With that in mind, the output should match the input.

Actual result:
array
  1 => string '1-02' (length=4)
  0 => string '1-2' (length=3)
  2 => string '1-20' (length=4)
  3 => string '10-20' (length=5)
  4 => string 'fred' (length=4)
  5 => string 'jane' (length=4)
  6 => string 'pic01' (length=5)
  8 => string 'pic02' (length=5)
  9 => string 'pic02a' (length=6)
  23 => string 'pic02000' (length=8)
  14 => string 'pic05' (length=5)
  7 => string 'pic2' (length=4)
  10 => string 'pic3' (length=4)
  11 => string 'pic4' (length=4)
  12 => string 'pic 4 else' (length=10)
  13 => string 'pic 5' (length=5)
  15 => string 'pic 5 ' (length=6)
  16 => string 'pic 5 something' (length=15)
  17 => string 'pic 6' (length=5)
  18 => string 'pic   7' (length=7)
  19 => string 'pic100' (length=6)
  20 => string 'pic100a' (length=7)
  21 => string 'pic120' (length=6)
  22 => string 'pic121' (length=6)
  24 => string 'tom' (length=3)
  25 => string 'x2-g8' (length=5)
  27 => string 'x2-y08' (length=6)
  26 => string 'x2-y7' (length=5)
  28 => string 'x8-y8' (length=5)
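
The "output should match input" check above can also be done in code rather than by reading the dump. This is only a sketch: is_natsorted() is a hypothetical helper name, not a PHP built-in, and its result for the long array above depends on which PHP version runs it.

```php
<?php
// Hypothetical helper: returns true if natsort() leaves the order unchanged.
function is_natsorted($arr)
{
    $sorted = $arr;          // natsort() sorts in place, so work on a copy
    natsort($sorted);
    // natsort() preserves keys, so compare the sequence of values only
    return array_values($sorted) === array_values($arr);
}

var_dump(is_natsorted(array('a1', 'a2', 'a10')));  // already in natural order
var_dump(is_natsorted(array('a10', 'a2')));        // not in natural order
```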
 [2008-05-07 18:35 UTC] rasmus@php.net
This should fix this one:

Index: strnatcmp.c
===================================================================
RCS file: /repository/php-src/ext/standard/strnatcmp.c,v
retrieving revision 1.10
diff -u -1 -r1.10 strnatcmp.c
--- strnatcmp.c	15 Jul 2004 01:26:03 -0000	1.10
+++ strnatcmp.c	7 May 2008 18:34:31 -0000
@@ -118,6 +118,6 @@
 		/* skip over leading spaces or zeros */
-		while (isspace((int)(unsigned char)ca))
+		while (isspace((int)(unsigned char)ca) || ca=='0')
 			ca = *++ap;
 
-		while (isspace((int)(unsigned char)cb))
+		while (isspace((int)(unsigned char)cb) || cb=='0')
 			cb = *++bp;
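
For installs that do not have this patch, a userland approximation is possible. The sketch below is not the C change itself: it strips the leading zeros from every digit run with a regular expression and then defers to strnatcmp(). The function names strip_leading_zeros() and natcmp_ignore_zeros() are made up for illustration, and the approach assumes zeros never carry meaning, which is wrong for values such as '1.05'.

```php
<?php
// Hypothetical helper: drop zeros at the start of each digit run,
// always leaving at least one digit behind ("000" becomes "0").
function strip_leading_zeros($s)
{
    return preg_replace('/(?<!\d)0+(?=\d)/', '', $s);
}

// Hypothetical comparator: natural comparison that ignores leading zeros.
function natcmp_ignore_zeros($a, $b)
{
    return strnatcmp(strip_leading_zeros($a), strip_leading_zeros($b));
}

$arr = array('test012', 'test01', 'test02');
uasort($arr, 'natcmp_ignore_zeros');   // uasort() keeps the original keys
print_r($arr);
```

With the array from the original report this gives the order test01, test02, test012.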

 [2008-08-26 22:57 UTC] jani@php.net
Rasmus, why don't you commit the patch if it fixes this..?
 [2008-08-27 07:53 UTC] rasmus@php.net
I was hoping for some feedback on the patch to hear if it actually fixes it.
 [2008-08-27 08:41 UTC] rasmus@php.net
Checking this further, this patch messes up the order of "0" in an array that contains negative values like "-123", so it still needs some work.
 [2009-04-08 14:35 UTC] bax_70 at hotmail dot com
Bad result.

$a[0]=00001;
$a[1]=00008;
$a[2]=00005;
$a[3]=000011;
$a[4]=00003;
$a[5]=000014;
natsort($a);
print_r($a);

Array ( [1] => 0 [0] => 1 [4] => 3 [2] => 5 [3] => 9 [5] => 12 )
 [2009-04-08 15:17 UTC] bax_70 at hotmail dot com
Excuse me, that was a bad example.
Here is the right one:

$a[0]="00001";
$a[1]="00008";
$a[2]="00005";
$a[3]="000011";
$a[4]="00003";
$a[5]="000014";
natsort($a);
print_r($a); 

Array ( [0] => 00001 [3] => 000011 [5] => 000014 [4] => 00003 [2] => 00005 [1] => 00008 )
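
The two examples above differ because an unquoted integer literal with a leading 0 is read as octal, so the first example never sorted the values it appeared to contain. A short illustration (the variable names are arbitrary):

```php
<?php
// Unquoted: the leading zero marks an octal literal.
$nine   = 000011;    // octal 11
$twelve = 000014;    // octal 14
var_dump($nine);     // int(9)
var_dump($twelve);   // int(12)

// Quoted: the leading zeros are kept as part of the string.
$text = "000011";
var_dump($text);     // string(6) "000011"
```

00008 is not a valid octal number at all; the first example's output shows it as 0, and newer PHP versions reject such a literal as a parse error.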
 [2009-04-08 18:11 UTC] rasmus@php.net
This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.

Fixed in CVS
 [2009-05-02 18:59 UTC] jani@php.net
This fails now (on both 32-bit and 64-bit systems):

# php -r '$a = array(.0001, .0021, -.01, -1, 0, .09, 2, -.9, 33); 
natcasesort($a); var_dump($a);'
array(9) {
  [2]=>
  float(-0.01)
  [7]=>
  float(-0.9)
  [3]=>
  int(-1)
  [4]=>
  int(0)
  [0]=>
  float(0.0001)
  [5]=>
  float(0.09)
  [1]=>
  float(0.0021)
  [6]=>
  int(2)
  [8]=>
  int(33)
}

 [2009-05-02 20:29 UTC] rasmus@php.net
Yeah, we talked about that on internals.  It's pretty much an either/or thing.  If you want numeric sorting, sort it numerically.  This is natcasesort and is more intended for filenames and such.  So this is expected.
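
As a minimal sketch of "sort it numerically" for the array in the previous comment, asort() with the SORT_NUMERIC flag compares the values as numbers and, like natcasesort(), keeps the keys attached to their values:

```php
<?php
$a = array(.0001, .0021, -.01, -1, 0, .09, 2, -.9, 33);

// Numeric comparison instead of natural string order; keys are preserved.
asort($a, SORT_NUMERIC);

var_dump($a);
// Value order: -1, -0.9, -0.01, 0, 0.0001, 0.0021, 0.09, 2, 33
```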
 [2009-07-21 21:15 UTC] svn@php.net
Automatic comment from SVN on behalf of rasmus
Revision: http://svn.php.net/viewvc/?view=revision&revision=284559
Log: Fix bug #49003 by tweaking the fix to bug #44929 slightly.
A 0 followed by any punctuation is now significant instead
of just 0's in front of a period.
 