Doc Bug #47370 array_unique has backward compatibility problem, and SORT_REGULAR is confusing
Submitted: 2009-02-12 16:22 UTC Modified: 2009-05-15 17:10 UTC
Votes:9
Avg. Score:4.9 ± 0.3
Reproduced:9 of 9 (100.0%)
Same Version:8 (88.9%)
Same OS:5 (55.6%)
From: for-bugs at hnw dot jp Assigned:
Status: Closed Package: Documentation problem
PHP Version: 5.2.9 OS: *
Private report: No CVE-ID: None
 [2009-02-12 16:22 UTC] for-bugs at hnw dot jp
Description:
------------
In PHP 5.2.9RC1, array_unique() returns different results depending on the order of the elements in the array. The reproduce code shows the difference.

This happens because SORT_REGULAR never casts the array elements and compares them with ==. I think it would be better for SORT_REGULAR to compare elements with === instead of ==.
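
To illustrate what a comparison based on === would mean, here is a minimal sketch. The function name array_unique_strict is hypothetical and not part of PHP; it only assumes that in_array() with its third argument set to true compares with ===. It is quadratic in the array size, so treat it as an illustration rather than a proposed implementation.

```php
<?php
// Hypothetical helper, not part of PHP: keeps the first occurrence of each
// value and treats two values as duplicates only when they are identical (===).
function array_unique_strict(array $input)
{
    $result = array();
    foreach ($input as $key => $value) {
        // Third argument true makes in_array() compare with ===.
        if (!in_array($value, $result, true)) {
            $result[$key] = $value;
        }
    }
    return $result;
}

// 0, "" and "0" are all distinct under ===, so both orderings keep 3 elements.
var_dump(array_unique_strict(array(0, "", "0")));
var_dump(array_unique_strict(array("", "0", 0)));
```

With this kind of comparison the set of surviving values no longer depends on element order.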

PHP 5.2.9RC1's array_unique() also has a backward compatibility problem. For backward compatibility, the default sort_flag should be SORT_STRING.

Reproduce code:
---------------
<?php
var_dump(arary_unique(array(0,"","0"))));
var_dump(arary_unique(array("","0",0))));

Expected result:
----------------
I am not sure what the correct output is, but the two results should be the same.

Actual result:
--------------
array(1) {
  [0]=>
  int(0)
}
array(2) {
  [0]=>
  string(0) ""
  [1]=>
  string(1) "0"
}

 [2009-02-12 16:25 UTC] for-bugs at hnw dot jp
Sorry, the reproduce code was incorrect. The correct code is below:

<?php
var_dump(array_unique(array(0,"","0")));
var_dump(array_unique(array("","0",0)));
 [2009-02-12 18:12 UTC] moriyoshi@php.net
Verified with 5.2, 5.3, HEAD.

 [2009-02-12 18:58 UTC] moriyoshi@php.net
Please try using this CVS snapshot:

  http://snaps.php.net/php5.2-latest.tar.gz
 
For Windows:

  http://windows.php.net/snapshots/


 [2009-02-13 01:53 UTC] for-bugs at hnw dot jp
Thank you so much. With the reproduce code, the snapshot returns the same result as PHP 5.2.8:

array(2) {
  [0]=>
  int(0)
  [1]=>
  string(0) ""
}
array(2) {
  [0]=>
  string(0) ""
  [1]=>
  string(1) "0"
}
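
For code that has to behave the same on builds with either default, a possible workaround is to pass the flag explicitly. This is only a sketch and assumes a PHP version whose array_unique() accepts the second argument (5.2.9 or later):

```php
<?php
// Passing SORT_STRING explicitly compares elements by their string value,
// so the outcome does not depend on the default flag of the PHP build.
$first  = array_unique(array(0, "", "0"), SORT_STRING);
$second = array_unique(array("", "0", 0), SORT_STRING);

var_dump($first);   // keeps int(0) and "", because (string) 0 === "0"
var_dump($second);  // keeps "" and "0", the trailing 0 duplicates "0"
```

Both calls keep two elements, matching the 5.2.8 output above.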
 [2009-02-13 22:27 UTC] andrei@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

The slight BC breakage is negligible compared to the benefits of getting it to work properly.
 [2009-02-14 08:22 UTC] moriyoshi@php.net
This change was not discussed, so the report is not bogus.
 [2009-02-14 08:28 UTC] for-bugs at hnw dot jp
OK, so you think comparing elements as strings is harmful, don't you?

Then what about array_diff() or array_intersect()? They compare array elements by their string representation. Isn't that harmful too?
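
A short sketch of that string-based comparison, relying on the documented rule that these functions consider two elements equal only when (string) $elem1 === (string) $elem2. The sample values are chosen for illustration:

```php
<?php
// array_diff() and array_intersect() treat two elements as equal
// only when their string representations are identical.
$diff   = array_diff(array(0, ""), array("0"));
$common = array_intersect(array("400.00", "10"), array("400", "1e1"));

var_dump($diff);    // int 0 is removed because (string) 0 === "0"; "" stays under key 1
var_dump($common);  // empty: no two strings are identical, although they are equal under ==
```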
 [2009-03-01 00:51 UTC] for-bugs at hnw dot jp
Hi, Andrei. Here's another terrible example.

<?php
$a=array("10","1az", "1e1");
var_dump(array_unique($a));
$b=array("1e1","10", "1az");
var_dump(array_unique($b));


The result is:


array(3) {
  [0]=>
  string(2) "10"
  [1]=>
  string(3) "1az"
  [2]=>
  string(3) "1e1"
}
array(2) {
  [0]=>
  string(3) "1e1"
  [2]=>
  string(3) "1az"
}


The arrays $a and $b hold the same 3 elements in a different order, yet the two array_unique() calls return different results.

The first array_unique() call returns 3 elements even though "10" equals "1e1" under ==.

In fact, both arrays are already sorted with respect to SORT_REGULAR, because "10" < "1az", "1az" < "1e1" and "1e1" == "10". Sorting with SORT_REGULAR is not stable, so equal elements are not always adjacent after sorting.
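
The three comparisons can be checked directly. A minimal sketch (variable names are only for illustration):

```php
<?php
// Pairwise loose comparisons between the three strings from the report.
$lt1 = "10" < "1az";   // "1az" is not numeric, so this is a string comparison
$lt2 = "1az" < "1e1";  // also a string comparison
$eq  = "1e1" == "10";  // both are numeric strings, so they compare as numbers

var_dump($lt1, $lt2, $eq);  // all three are true, which makes the ordering inconsistent
```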

This behavior is not obvious to most PHP programmers. The reference manual should explain the function's behavior in detail.
 [2009-03-01 07:03 UTC] moriyoshi@php.net
Andrei, you must add a note about the behavioral change.
 [2009-04-30 08:26 UTC] jani@php.net
See also bug #48115 (yes, WTF?!)
 [2009-04-30 09:46 UTC] nospam at ez dot no
Hi Guys,

We are facing the same BC problem with array_unique.
Consider following test script:

<?php

$array = array( '400.00', '400' );

// Here 400 value exists
// array(2) { [0]=> string(6) "400.00" [1]=> string(3) "400" }
var_dump($array);

$arrayTest1 = array_unique( $array );

// Here 400 value is missing
// array(1) { [0]=> string(6) "400.00" }
// Prior to version 5.2.9 this always returned
// array(2) { [0]=> string(6) "400.00" [1]=> string(3) "400" }
var_dump($arrayTest1);

$arrayTest2 = array_unique( $array, SORT_STRING );

// Here 400 value exists
// array(2) { [0]=> string(6) "400.00" [1]=> string(3) "400" }
var_dump( $arrayTest2 );

?>

This is definitely a BC break in 5.2.9: in PHP versions prior to 5.2.9,
array_unique() returned both '400.00' and '400'. In PHP 5.2.9 it
returns only '400.00'.
 [2009-05-14 08:43 UTC] derick@php.net
Andrei, can you please have a look at this? This is a BC break for quite a few applications.
 [2009-05-15 17:10 UTC] jani@php.net
The change will be reverted in the next releases (all branches).
 