[2009-01-26 17:58 UTC] sgnutzmann at yahoo dot de
Description:
------------
I use the function array_diff() to compare two sorted string arrays with numerical keys (array sizes are 76,906 and 433,959; every element is a string shorter than 20 characters). With PHP 5.2.4 the function returns very fast (just a few seconds); with PHP 5.2.8 I killed PHP.exe after 30 minutes(!) without a result.

PHP.INI:
memory_limit = 1536M
extension=php_pdo.dll
extension=php_zip.dll
extension=php_pdo_odbc.dll

Reproduce code:
---------------
// $Sales and $Inv read previously from file system
$idSales = array();
foreach ( $Sales as $i => $data )
    $idSales[$i] = '#'.$data[2];
array_multisort ($idSales, $Sales);

$idInv = array();
foreach ( $Inv as $i => $data )
    $idInv[$i] = '#'.$data[1];
array_multisort ($idInv, $Inv);

echo "Start array_diff\n";
$unknown = array_diff ( $idSales, $idInv );
echo "End array_diff\n";

Expected result:
----------------
see description

Actual result:
--------------
no result in 30 minutes
Complete test script (size of generated test file 5,865 KB):

<?php
$handle = fopen('TestData.txt','rb');

// size of first array
$buffer = fgets($handle, 256);
$buffer = str_replace("\r",'',$buffer);
$buffer = str_replace("\n",'',$buffer);
$count = (int) $buffer;
echo 'Size of first array: '.$count."\r\n";

// elements of first array
$idSales = array();
for ( $i = 0; $i < $count; $i++ ) {
    $buffer = fgets($handle, 256);
    $buffer = str_replace("\r",'',$buffer);
    $buffer = str_replace("\n",'',$buffer);
    $idSales[] = $buffer;
} // for ( $i = 0; $i < $count; $i++ )

// size of second array
$buffer = fgets($handle, 256);
$buffer = str_replace("\r",'',$buffer);
$buffer = str_replace("\n",'',$buffer);
$count = (int) $buffer;
echo 'Size of second array: '.$count."\r\n";

// elements of second array
$idInv = array();
for ( $i = 0; $i < $count; $i++ ) {
    $buffer = fgets($handle, 256);
    $buffer = str_replace("\r",'',$buffer);
    $buffer = str_replace("\n",'',$buffer);
    $idInv[] = $buffer;
} // for ( $i = 0; $i < $count; $i++ )

fclose($handle);

echo "Start of array_diff\r\n";
$unknown = array_diff ( $idSales, $idInv );
echo 'Number of unknown identifier '.count($unknown)."\r\n";
?>

First lines of test file:
76906
#00/1109
#00/1162
#00/1163
#00/1335
#00/1337

Result, if I use PHP 5.2.4:
Size of first array: 76906
Size of second array: 433959
Start of array_diff
Number of unknown identifier 17826

No result from array_diff, if I use PHP 5.2.8 (without any extension)

Could reproduce. This code shows the time taken to array_diff two arrays for the builtin array_diff and for a PHP function fast_array_diff I wrote.

<?php
$a = $b = array();
for ($i = 0; $i < 10000; $i++) {
    $a[] = "s" . ($i * 102121 % 433061);
    $b[] = "s" . ($i * 102121 % 433003);
}

$start = microtime(true);
$res1 = array_diff($a, $b);
echo "Built-in array_diff duration: ".(microtime(true) - $start)."\n";

include('http://www.gissen.nl/files/fast_array_diff.php');
$start = microtime(true);
$res2 = fast_array_diff($a, $b);
echo "Fast_array_diff duration: ".(microtime(true) - $start)."\n";

sort($res1);
sort($res2);
assert($res1 == $res2);
?>

Output:
Built-in array_diff duration: 11.8710849285
Fast_array_diff duration: 0.254959106445
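
The source of fast_array_diff() is not included in the comment above, so its approach is unknown here. As a rough illustration of how a diff can avoid slow pairwise comparison, the sketch below builds a lookup table from the second array and tests each element of the first against it. The function name hash_array_diff() is hypothetical, and the sketch assumes both arrays hold only scalar values (as in this report), compared by their string form as array_diff() does.

```php
<?php
// Hypothetical helper for illustration only; this is NOT the
// fast_array_diff() referenced in the comment above.
// Assumes $a and $b contain scalar values (strings, ints, ...).
function hash_array_diff(array $a, array $b)
{
    // Build a lookup table keyed by the string form of each value in $b.
    $seen = array();
    foreach ($b as $value) {
        $seen[(string) $value] = true;
    }

    // Keep every element of $a whose string form is absent from $b,
    // preserving the original keys as array_diff() does.
    $result = array();
    foreach ($a as $key => $value) {
        if (!isset($seen[(string) $value])) {
            $result[$key] = $value;
        }
    }
    return $result;
}

// Small usage example with identifiers in the style of the test file.
$unknown = hash_array_diff(
    array('#00/1109', '#00/1162', '#00/1163'),
    array('#00/1162')
);
// $unknown keeps keys 0 and 2: '#00/1109' and '#00/1163'
```

Each element is looked up once in a hash table, so the work grows roughly with the combined size of the two arrays rather than with their product. The trade-off is the extra memory for the lookup table.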