php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Request #70754 array_udiff should compare the same values only once
Submitted: 2015-10-21 01:40 UTC Modified: 2021-08-06 12:48 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:0 of 1 (0.0%)
From: php_bugs at multiwebinc dot com Assigned:
Status: Open Package: Performance problem
PHP Version: 7.0.0RC5 OS:
Private report: No CVE-ID: None
View Add Comment Developer Edit
Welcome! If you don't have a Git account, you can't do anything here.
You can add a comment by following this link or if you reported this bug, you can edit this bug over here.
(description)
Block user comment
Status: Assign to:
Package:
Bug Type:
Summary:
From: php_bugs at multiwebinc dot com
New email:
PHP Version: OS:

 

 [2015-10-21 01:40 UTC] php_bugs at multiwebinc dot com
Description:
------------
All versions of PHP seem to be inefficient at executing array_udiff. The values in $a in the callback function are all compared twice. If the callback function is expensive, this would affect performance.

Also, some comparisons are unnecessary.

Test script:
---------------
/* Note: The callback function just returns 0 every time, which means that all values in both arrays are considered equal. */

function callback($a, $b) {
    echo "\$a = $a; \$b= $b; ";
    echo "<br>\n";
    return 0;
}

$array1 = array("first_0", "first_1", "first_2", "first_3", "first_4" );
$array2 = array("second_0", "second_1", "second_2", "second_3", "second_4");
$result = array_udiff($array1, $array2, "callback");

Actual result:
--------------
$a = first_2; $b= first_1;
$a = first_4; $b= first_2;
$a = first_2; $b= first_0;
$a = first_3; $b= first_2;
$a = first_4; $b= first_3; // Unnecessary since we already know these are equal
$a = first_1; $b= first_0; // Unnecessary since we already know these are equal
$a = second_2; $b= second_1;
$a = second_4; $b= second_2;
$a = second_2; $b= second_0;
$a = second_3; $b= second_2;
$a = second_4; $b= second_3; // Unnecessary since we already know these are equal
$a = second_1; $b= second_0; // Unnecessary since we already know these are equal
$a = first_4; $b= second_4;
$a = first_4; $b= first_3; // Duplicate. Already tested above.
$a = first_3; $b= first_2; // Duplicate. Already tested above.
$a = first_2; $b= first_1; // Duplicate. Already tested above.
$a = first_1; $b= first_0; // Duplicate. Already tested above.

Patches

CPVsri (last revision 2015-10-21 09:48 UTC by mandersonaugusto at bol dot com dot br)

Add a Patch

Pull Requests

Add a Pull Request

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2021-08-06 12:48 UTC] cmb@php.net
-Summary: array_udiff compares the same values multiple times +Summary: array_udiff should compare the same values only once -Type: Bug +Type: Feature/Change Request
 [2021-08-06 12:48 UTC] cmb@php.net
First, the details are a bit different as of PHP 7.0.0[1] (small
arrays use insertion sort instead of quicksort), but generally,
yes, there may be "unnecessary" comparisons.  The point is, that
for efficiency reasons, array_udiff() sorts all arrays first, and
only afterwards calculates the actual diff.  None of the comparison
results are cached, so you suffer the additional function call
overhead for the usually cheap comparisons (for rather expensive
comparisons, you can cache the results yourself).  So there may be
room for improvement, but there is no bug per se.

[1] <https://3v4l.org/LBeQB>
 
PHP Copyright © 2001-2022 The PHP Group
All rights reserved.
Last updated: Thu Jul 07 14:03:34 2022 UTC