php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #18401 shuffle() broken
Submitted: 2002-07-18 01:03 UTC Modified: 2002-08-15 15:09 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:1 (100.0%)
From: adam at trachtenberg dot com Assigned:
Status: Closed Package: Arrays related
PHP Version: 4CVS-2002-08-12 OS: Redhat 7.3
Private report: No CVE-ID: None
 [2002-07-18 01:03 UTC] adam at trachtenberg dot com
The shuffle() function in the CVS is (still) broken. It does not correctly generate all possible combinations of the array. (I know this function was recently updated, this test is with new version.)

Here is a diff to fix the problem:

--- array.c	Thu Jul 18 00:50:54 2002
+++ array.new.c	Thu Jul 18 00:49:35 2002
@@ -1441,7 +1441,7 @@
 {
 	Bucket **elems, *temp;
 	HashTable *hash;
-	int j, n_elems, cur_elem = 0, rnd_idx, n_left;
+	int j, n_elems, rnd_idx, n_left;
 
 	n_elems = zend_hash_num_elements(Z_ARRVAL_P(array));
 	
@@ -1457,13 +1457,12 @@
 		elems[j++] = temp;
 	while (--n_left) {
 		rnd_idx = php_rand(TSRMLS_C);
-		RAND_RANGE(rnd_idx, cur_elem, n_left, PHP_RAND_MAX);
-		if (rnd_idx != cur_elem) {
-			temp = elems[cur_elem];
-			elems[cur_elem] = elems[rnd_idx];
+		RAND_RANGE(rnd_idx, 0, n_left, PHP_RAND_MAX);
+		if (rnd_idx != n_left) {
+			temp = elems[n_left];
+			elems[n_left] = elems[rnd_idx];
 			elems[rnd_idx] = temp;
 		}
-		cur_elem++;
 	}
 
 	HANDLE_BLOCK_INTERRUPTIONS();

Here is some PHP that will reproduce the problem. As you can see the built-in shuffle() only generates 3 of the 6 possible combinations, while the userland array_shuffle() correctly generates all 6 with equal likelyhood.

<?php

function array_shuffle(&$array) {
  $i = count($array);
  while (--$i) {
    $j = mt_rand(0, $i);
    if ($i != $j) {
      
      /* swap elements */
      $tmp = $array[$j];
      $array[$j] = $array[$i];
      $array[$i] = $tmp;
    }
  }
}

function test($function) {
  $a = array();

  for ($i = 0; $i < 10000; $i++) {
    $p = range(1,3);
    $function($p);
    $s = join('', $p);
    $a[$s]++;
}

  print "$function\n";
  
  foreach($a as $k => $v) {
    print "$k: $v\n";
  }

}

test('shuffle');
test('array_shuffle');

?>

Patches

Add a Patch

Pull Requests

Add a Pull Request

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2002-08-11 02:30 UTC] kalowsky@php.net
I'm pretty Brad has been playing with this code.  Can you test again, and report back with a new snapshot?
 [2002-08-12 03:10 UTC] adam at trachtenberg dot com
I have just tried it with the latest CVS and I still get the same results. Brad's changes were in a different section of the file.
 [2002-08-14 03:04 UTC] yohgaki@php.net
-		if (rnd_idx != cur_elem) {
-			temp = elems[cur_elem];
-			elems[cur_elem] = elems[rnd_idx];
+		RAND_RANGE(rnd_idx, 0, n_left, PHP_RAND_MAX);
+		if (rnd_idx != n_left) {
+			temp = elems[n_left];
+			elems[n_left] = elems[rnd_idx];
 			elems[rnd_idx] = temp;

Thanks for detailed report. This diff clearly show where to fix. Swapping values requires 3 lines!!

 [2002-08-14 03:27 UTC] yohgaki@php.net
Oops. Now I see what's the point.
This patch seems good to me. There is no point 
changing cur_elem. 

Anyone?





 [2002-08-14 11:59 UTC] adam at trachtenberg dot com
Yes, this is the Fisher-Yates algorithm. See Perl Cookbook or "perldoc -f shuffle" for supporting details.
 [2002-08-15 15:09 UTC] kalowsky@php.net
Fixed in CVS now.  Thanks for your patch.  Please follow up on any fallout too :)
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Wed Apr 24 02:01:30 2024 UTC