Bug #45301 Serious flaw in array_rand()
Submitted: 2008-06-18 16:05 UTC
Modified: 2016-09-09 05:43 UTC
Votes: 32
Avg. Score: 4.3 ± 0.8
Reproduced: 29 of 30 (96.7%)
Same Version: 13 (44.8%)
Same OS: 21 (72.4%)
From: payton2558 at googlemail dot com
Assigned: pajoye
Status: Closed
Package: Math related
PHP Version: *
OS: win32 only
Private report: No
CVE-ID: None
 [2008-06-18 16:05 UTC] payton2558 at googlemail dot com
Description:
------------
The reproduce code below demonstrates the bug. Modifying the code (for example, using different file sizes) changes how severe the effect is.

Appears to require Windows.
Please note I have tested on 2 different machines and 3 versions of PHP. I have also confirmed this with a couple of users on IRC.

mt_rand() may also be affected, but not as badly.

Unrelated: bugs.php your CAPTCHA system is the worst I could ever expect for a programming related group

Reproduce code:
---------------
<?php
function RandomNumber() {
	
	$word1 = file('word1.txt');	 // word1 and word2.txt can be made by fwriting "word\n" 50000 times. Different filesizes affect bug.
	$word2 = file('word2.txt');
		
	$rword1 = trim($word1[array_rand($word1)]);
	$rword2 = trim($word2[array_rand($word2)]);
	
	$rnum = rand(1,999);	
	
	return $rnum;
}

for ($i=0; $i<20; $i++) {
	
	echo RandomNumber()."\n";
}
?>

Expected result:
----------------
20 random looking numbers

Actual result:
--------------
20 identical numbers or in other cases, severely unrandom numbers

History

 [2008-06-18 16:20 UTC] pajoye@php.net
On a related note: #45184
 [2008-06-18 18:06 UTC] sv4php at fmethod dot com
Confirmed on Apache 2.2, Windows XP SP2 with PHP 5.2.6.

Confirmed also without building files (just building the array directly in a loop).

Requires Windows; doesn't seem to affect mt_rand().
 [2008-06-18 19:40 UTC] payton2558 at googlemail dot com
Here's the shortest version. Try varying the array_fill num parameter. Lower numbers appear to increase the randomness.

The problem appears to be array_rand() interfering with the random seed, but please investigate further than that: I'm certain I experienced this problem before ever using array_rand(). Using rand(0,count($a)-1) in its place works and doesn't trigger the bug in this example.

<?php
	
$a = array_fill(0, 100000, "word");

for ($i=0; $i<20; $i++) {
	
	array_rand($a);
	
	echo rand(1,9999)."\n";
}
?>
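
The workaround mentioned above can be sketched as a small helper. This is only an illustration: random_key() is a hypothetical name, not a PHP function, and it uses mt_rand() on the assumption (from an earlier comment in this report) that mt_rand() is not affected. It copies the keys on every call, so it trades memory for avoiding array_rand().

```php
<?php
// Hypothetical helper (not part of PHP): picks one key without calling array_rand().
function random_key(array $a)
{
	if (count($a) === 0) {
		return null; // nothing to pick from
	}
	$keys = array_keys($a); // works with string keys and gaps in numeric keys
	return $keys[mt_rand(0, count($keys) - 1)];
}

$a = array_fill(0, 100000, "word");

for ($i = 0; $i < 20; $i++) {
	random_key($a); // stands in for array_rand($a)
	echo rand(1, 9999) . "\n";
}
```

Swapping random_key($a) back to array_rand($a) gives the original test script, so the two can be compared on an affected build.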
 [2008-06-18 21:04 UTC] crrodriguez at suse dot de
What about merging the patch that circulated on @internals that made rand() an alias of mt_rand(), and being done with this?
 [2008-06-18 21:26 UTC] pajoye@php.net
> What about merging a patch that circulated in @internals that made
> rand() an alias to mt_rand() and be done with this ?

Because it may not fix the problem? (See the other reports from today and from two weeks ago.)


 [2008-07-02 11:47 UTC] jani@php.net
See also bug #45302
 [2009-10-30 22:15 UTC] scott046 at hotmail dot com
If anybody is interested, this code:

<?php

// Array sizes to test, each mapped to the result observed for that size.
$cases = array(
	20     => "apparently no problem",
	200    => "apparently no problem",
	2000   => "apparently no problem",
	10000  => "apparent problem: mild repetition",
	20000  => "apparent problem: repetition",
	30000  => "apparent problem: repetition",
	50000  => "apparent problem: repetition",
	100000 => "32767=2^15-1 repeating; ",
	200000 => "32767=2^15-1 repeating; ",
	300000 => "32767=2^15-1 repeating; ",
);

$separator = "";
foreach ($cases as $size => $label) {
	print($separator . $size . " element array; " . $label . "<br>\r\n");
	$separator = "<br>\r\n<br>\r\n";

	// Fill the array with the values 0 .. $size-1.
	$array1 = array();
	for ($counter1 = 0; $counter1 < $size; $counter1++) {
		$array1[] = $counter1;
	}

	// Print ten randomly drawn elements.
	for ($print_counter1 = 0; $print_counter1 < 10; $print_counter1++) {
		print($array1[array_rand($array1)] . "<br>\r\n");
	}
}

?>

produces this output:

20 element array; apparently no problem
16
5
11
9
17
7
15
2
8
9


200 element array; apparently no problem
43
25
147
127
127
2
109
14
67
165


2000 element array; apparently no problem
26
1513
1882
1721
590
917
1237
596
409
1170


10000 element array; apparent problem: mild repetition
2661
6633
8864
1157
2432
6681
6995
6633
8864
1157


20000 element array; apparent problem: repetition
2432
13677
15498
3590
13677
15498
3590
13677
15498
3590


30000 element array; apparent problem: repetition
13677
15498
3590
13677
15498
3590
13677
15498
3590
13677


50000 element array; apparent problem: repetition
19089
29176
3590
29176
3590
29176
3590
29176
3590
29176


100000 element array; 32767=2^15-1 repeating;
3590
32767
32767
32767
32767
32767
32767
32767
32767
32767


200000 element array; 32767=2^15-1 repeating;
32767
32767
32767
32767
32767
32767
32767
32767
32767
32767


300000 element array; 32767=2^15-1 repeating;
32767
32767
32767
32767
32767
32767
32767
32767
32767
32767

for me. I do not know the exact cause, but the randomization gets progressively worse on larger arrays.
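
To put a number on "progressively worse", one option is to count how many distinct keys come back from a fixed number of draws. The following is a rough sketch only; distinct_draws() is a hypothetical helper name, and the sizes are just a subset of the ones tested above.

```php
<?php
// Hypothetical helper: number of distinct keys seen in $draws calls to array_rand().
function distinct_draws(array $a, int $draws): int
{
	$seen = array();
	for ($i = 0; $i < $draws; $i++) {
		$seen[array_rand($a)] = true; // record each key once
	}
	return count($seen);
}

foreach (array(20, 2000, 20000, 100000) as $size) {
	echo $size, ": ", distinct_draws(range(0, $size - 1), 10), " distinct of 10\n";
}
```

On a healthy generator the large arrays should give close to 10 distinct keys; on an affected build the count would be expected to drop as the array grows.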
 [2014-02-15 15:49 UTC] timo dot fiersen at web dot de
Looks like my previous comment got lost... I was wondering if this is going to be fixed some day; it seems to have existed for ages already?

Or did this maybe just pop up again? I'm experiencing the exact same problem with 5.5.x: calling array_rand() kills all randomness.

PHP: 5.5.6 and 5.5.9 (both TS)
OS: Windows 7 x64
 [2014-05-08 14:19 UTC] levim@php.net
Bug https://bugs.php.net/bug.php?id=67233 is a duplicate of this one.
 [2015-07-30 11:32 UTC] cmb@php.net
-Status: Assigned +Status: Analyzed
 [2015-07-30 11:32 UTC] cmb@php.net
The problem is the way array_rand() works, in combination with the
limited random number range available on Windows. The function
loops over all elements[1], calculating a new random number for
each, and checks whether to draw the current element[2]. However,
on Windows PHP_RAND_MAX == 32767, so this condition is likely to
be false for large num_avail. In particular, when num_req == 1,
which is the default, the condition *can* only be true if either
randval == 0 or num_avail < PHP_RAND_MAX+1; the latter case
still requires randval to be rather small.

In practice, randval is always equal to zero for the OP's second
test script, so the random generator is always seeded to zero for
the next random operation.

On Linux, PHP_RAND_MAX == (2**31)-1, so this algorithm is less of
a problem, but there may still be issues for *very* large arrays.
If we can ignore these (so large an array won't easily fit into
memory), a solution would be to use php_mt_rand() instead of
php_rand() (and to seed the MT random number generator
automatically).

[1] <https://github.com/php/php-src/blob/php-7.0.0beta2/ext/standard/array.c#L4547-L4573>
[2] <https://github.com/php/php-src/blob/php-7.0.0beta2/ext/standard/array.c#L4554>
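
The effect of the small range can be illustrated with a short simulation. This is a sketch under an assumption: that the draw condition has the form randval / (PHP_RAND_MAX + 1.0) < num_req / num_avail, which matches the description above but is not quoted from array.c. SIM_RAND_MAX, would_draw() and max_passing_randval() are hypothetical names used only here.

```php
<?php
// Hypothetical stand-in for PHP_RAND_MAX on Windows, per the analysis above.
const SIM_RAND_MAX = 32767;

// Assumed form of the draw condition: randval / (RAND_MAX + 1.0) < num_req / num_avail.
function would_draw(int $randval, int $numReq, int $numAvail): bool
{
	return ($randval / (SIM_RAND_MAX + 1.0)) < ($numReq / $numAvail);
}

// Largest randval that still passes the condition, or -1 if none does.
function max_passing_randval(int $numReq, int $numAvail): int
{
	$max = -1;
	for ($r = 0; $r <= SIM_RAND_MAX; $r++) {
		if (!would_draw($r, $numReq, $numAvail)) {
			break; // the condition is monotone in randval, so stop at the first failure
		}
		$max = $r;
	}
	return $max;
}

foreach (array(20, 2000, 20000, 32768, 100000) as $numAvail) {
	echo $numAvail, " => ", max_passing_randval(1, $numAvail), "\n";
}
```

Under these assumptions the largest passing randval is 1638 for 20 remaining elements, 16 for 2000, 1 for 20000, and 0 for 32768 and 100000. Once num_avail reaches PHP_RAND_MAX+1, only randval == 0 can draw an element.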
 [2015-08-05 12:09 UTC] cmb@php.net
-Summary: Serious flaw in random related functions +Summary: Serious flaw in array_rand()
 [2015-08-05 12:09 UTC] cmb@php.net
>> What about merging a patch that circulated in @internals that
>> made rand() an alias to mt_rand() and be done with this ?
>
> Because it may not fix the problem? (see the other report today
> and two weeks ago).

For reference, these reports are bug #45302 (which is a duplicate
of this ticket) and bug #45184 (which is about the scaling issue
that affects rand() as well as mt_rand(); see also PR #1416[1]).

[1] <https://github.com/php/php-src/pull/1416>
 [2016-09-09 02:45 UTC] yohgaki@php.net
https://wiki.php.net/rfc/rng_fixes
Should be fixed by this.
 [2016-09-09 05:43 UTC] yohgaki@php.net
-Status: Analyzed +Status: Closed
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Wed Oct 09 10:01:27 2024 UTC