Bug #47643 array_diff() takes over 3000 times longer than php 5.2.4
Submitted: 2009-03-13 11:49 UTC Modified: 2010-11-01 18:18 UTC
Avg. Score:4.7 ± 0.6
Reproduced:28 of 29 (96.6%)
Same Version:22 (78.6%)
Same OS:17 (60.7%)
From: viper7 at viper-7 dot com Assigned: felipe (profile)
Status: Closed Package: Performance problem
PHP Version: 5.*, 6CVS (2009-04-13) OS: *
Private report: No CVE-ID: None

 [2009-03-13 11:49 UTC] viper7 at viper-7 dot com
This bug was reported in ##php on freenode, and after some thorough testing on multiple machines we determined it must be an engine bug.

array_diff on two large arrays of md5 hashes (600,000 elements each) takes approximately 4 seconds on a fast server in PHP 5.2.4 and below (confirmed with PHP 5.2.0), but over 4 hours (!) on PHP 5.2.6 and greater (confirmed with PHP 5.2.9 and PHP 5.3.0 beta2)

Reproduce code:
$i=0; $j=500000;
while($i < 600000) {
	$i++; $j++;
	$data1[] = md5($i);
	$data2[] = md5($j);
}

$time = microtime(true);

echo "Starting array_diff\n";
$data_diff1 = array_diff($data1, $data2);

$time = microtime(true) - $time;

echo 'array_diff() took ' . number_format($time, 3) . ' seconds and returned ' . count($data_diff1) . " entries\n";

Expected result:
Starting array_diff
array_diff() took 3.778 seconds and returned 500000 entries

Actual result:
Starting array_diff
array_diff() took 14826.278 seconds and returned 500000 entries


 [2009-03-24 21:19 UTC] cisa at cisa85 dot de
Like I described [1] I use this function to get the performance I need:

function array_diff_fast($data1, $data2) {
    $data1 = array_flip($data1);
    $data2 = array_flip($data2);

    foreach($data2 as $hash => $key) {
       if (isset($data1[$hash])) unset($data1[$hash]);
    }

    return array_flip($data1);
}

Thanks to Viper for his help.
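
The workaround above relies on array_flip(), so it assumes every value is a string or an integer, and it collapses duplicate values in $data1. A minimal sketch of a variant that keeps duplicates, under the same string/integer assumption, could look like this. The name array_diff_lookup is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper (not a PHP built-in): hash-lookup based diff.
// Assumes all values are strings or integers, as array_flip() requires.
function array_diff_lookup(array $data1, array $data2) {
    // Turn the values of $data2 into keys so membership is an isset() check.
    $lookup = array_flip($data2);

    $result = array();
    foreach ($data1 as $key => $value) {
        if (!isset($lookup[$value])) {
            // Keep the original key, as array_diff() does.
            $result[$key] = $value;
        }
    }
    return $result;
}

$a = array('x', 'y', 'x', 'z');
$b = array('y');

// Compare against the built-in on a small input.
var_dump(array_diff_lookup($a, $b) === array_diff($a, $b)); // bool(true)
```

Each lookup is a single hash access, so the cost grows roughly with the combined size of the two arrays rather than depending on a sort.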

 [2009-06-30 15:19 UTC] viper7 at viper-7 dot com
I've tracked down the change that broke things, this is it. but the exact reason is beyond me heh. Hopefully this helps.
 [2009-06-30 15:22 UTC]
Dmitry, could you have a look? I have no idea why this occurs.
 [2009-07-01 15:32 UTC]
The problem occurs because of the "bad" patch for bug #42838.

The diff algorithm sorts the arrays using qsort and then assumes that they are sorted correctly. But with a user comparison function this can't be guaranteed. Thus in ext/standard/tests/array/bug42838.phpt, key_compare_func() can't sort the array correctly because the expressions (0 < 'a') and (0 > 'a') are both false ('a' is interpreted as the number 0).

It should be fixed in some way.
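
The inconsistency described above can be sketched as follows. This is an illustrative comparator, not the code from bug42838.phpt. legacy_greater() is a hypothetical helper that makes the numeric interpretation explicit with (int) casts, so the result does not depend on how a given PHP version compares an integer with a non-numeric string.

```php
<?php
// Hypothetical helper: compares numerically, so 'a' is treated as 0,
// mirroring the interpretation described in the comment above.
function legacy_greater($a, $b) {
    return (int) $a > (int) $b;
}

// Illustrative user comparison function (an assumption, not the test's code).
function key_compare_func($a, $b) {
    if ($a === $b) {
        return 0;
    }
    return legacy_greater($a, $b) ? 1 : -1;
}

var_dump(key_compare_func(0, 'a')); // int(-1): 0 is reported as less than 'a'
var_dump(key_compare_func('a', 0)); // int(-1): 'a' is also reported as less than 0
```

Both calls report "less than", so no ordering of 0 and 'a' satisfies the comparator, and a sort driven by it cannot be relied on to produce a correctly sorted array.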
 [2009-07-09 20:38 UTC]
As Dmitry noted, this is a side effect of your fix.
 [2010-01-17 12:09 UTC] emiel dot bruijntjes at copernica dot com
This bug is now open for 10 months. Are you still working on this?
 [2010-02-17 20:53 UTC] maarten at talkin dot nl
Why don't you only reset ptr if (behavior & DIFF_ASSOC)?
 [2010-04-16 22:20 UTC] sylvain at jamendo dot com
I would also appreciate a patch; this issue made our servers crash after a PHP 5.3 upgrade :-/

 [2010-08-04 05:21 UTC] lonnyk at gmail dot com
I feel as though the actual bug here is the fix that caused this issue. If you revert the fix and typecast the variables passed into the custom compare function as (string), then this works fine. This is in line with the other, non-user-defined comparison functions, which compare as === and not ==.
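
The typecast suggested above might look like the sketch below. It assumes the user-defined comparison is passed to array_diff_uassoc(). string_key_compare is a hypothetical name, and the sample arrays are made up for illustration.

```php
<?php
// Hypothetical comparator: cast both keys to string before comparing,
// so the integer 0 and the string 'a' are never treated as equal.
function string_key_compare($a, $b) {
    return strcmp((string) $a, (string) $b);
}

$array1 = array(0 => 'x', 'a' => 'x');
$array2 = array('a' => 'x');

// Only the entry with key 'a' exists in both arrays, so key 0 survives.
$diff = array_diff_uassoc($array1, $array2, 'string_key_compare');
var_dump($diff === array(0 => 'x')); // bool(true)
```

Unlike a loose numeric comparison, strcmp() on the cast keys gives a consistent ordering, which is what a sort-based diff needs.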
 [2010-11-01 18:16 UTC]
Automatic comment from SVN on behalf of felipe
Log: - Fixed bug #47643 (array_diff() takes over 3000 times longer than php 5.2.4)
 [2010-11-01 18:18 UTC]
-Status: Assigned +Status: Closed
 [2010-11-01 18:18 UTC]
This bug has been fixed in SVN.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
Thank you for the report, and for helping us make PHP better.

 [2010-11-11 22:25 UTC]
Automatic comment from SVN on behalf of felipe
Log: - Fixed bug #47643 (array_diff() takes over 3000 times longer than php 5.2.4)
 [2011-02-23 14:56 UTC] jaromir dot dolecek at skype dot net
Looking at the fix, the same problem still seems possible with the DIFF_ASSOC option.
 [2014-11-28 14:39 UTC] samantha at adrichem dot nu
Could this fix be the reason why (since I can't find anything else in the changelog) array_diff() now (PHP 5.6) does string comparisons and no longer supports multidimensional arrays, whilst in PHP 5.3.23 it does? (The documentation says it doesn't, though.)
 [2014-11-28 14:47 UTC] samantha at adrichem dot nu
Never mind, it just didn't generate a notice array to string conversion, now it does