php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Request #60836 Improve array_intersect_key performance
Submitted: 2012-01-22 00:34 UTC Modified: -
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:0 of 0 (0.0%)
From: oliver at hofff dot com Assigned:
Status: Open Package: Arrays related
PHP Version: Irrelevant OS:
Private report: No CVE-ID: None
 [2012-01-22 00:34 UTC] oliver at hofff dot com
Description:
------------
The trivial test script below runs longer than the extpected "instant" execution (lot longer).

This is because the algorithm walks the first argument and looks up the key in every other array supplied as an argument.

Instead it should walk the shortest array given, and look up the keys in every other array.

Maybe this issue or similar ones also apply to other array functions, which perform set operations, but I have not checked the code of them.

Of course the optimization could be done in userland, but that feels not right.

Test script:
---------------
$arr = array_fill(0, 1000000, '...');
$i = 1000;
while($i--) {
array_intersect_key($arr, array());
}


Patches

Pull Requests

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2012-01-27 03:52 UTC] carloschilazo at gmail dot com
we cant start walking the shortest array given, because we first need to know 
which elements we need to look in the other arrays, and those are the ones 
presesnt in the first array

" returns an array containing all the entries of array1 which have keys that are 
present in all the arguments. "
 [2012-03-25 17:27 UTC] phristen at yahoo dot com
carloschilazo, what you said doesn't make sense.
Intersecting A, B, and C will produce the same result as intersecting C, B, and A, or any other combination of the operands.

Unfortunately, a lot of PHP array functions are known to be extremely inefficient :(
 [2012-03-25 17:31 UTC] phristen at yahoo dot com
Actually sorry, it does make sense...
But it's easily solvable. Just run the shortest array against the first array (unless they happen to be the same) to get the values, then run the result against the rest of them.
 [2017-11-26 16:31 UTC] christoph at burschka dot de
Adding this optimization would make array_intersect_key() much more useful.

Right now, "look up a small selection of keys in a large dictionary of values, and return the looked-up entries" is only efficiently done with a loop, because array_intersect_key() uselessly loops over the dictionary.

If the function iterated over the shortest array (and then looked up each key in the source array), the only thing that would change is the implementation detail that the result no longer maintains the order of the source array.
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Mon Dec 30 14:01:28 2024 UTC