php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #77556 COUNT_RECURSIVE is suprisingly slow
Submitted: 2019-02-01 14:52 UTC Modified: 2019-02-01 14:58 UTC
From: ivan dot enderlin at hoa-project dot net Assigned:
Status: Not a bug Package: Performance problem
PHP Version: 7.3.1 OS: macOS
Private report: No CVE-ID: None
 [2019-02-01 14:52 UTC] ivan dot enderlin at hoa-project dot net
Description:
------------
I discovered `COUNT_RECURSIVE` and thought it would speed up my algorithm. But instead, it slowed it down. It initially used `array_sum` + `array_map`. See the test script.

This result is surprising because `array_map` creates a new array, so the memory allocations should be slower than just iterating and counting. It proofs to be true with the last test in the script (see “manual”) where it is faster than all the previous solutions.

Test script:
---------------
<?php

const LENGTH = 100000;

$a = array_fill(0, LENGTH, array_fill(0, LENGTH, 42));

echo 'expected:            '; var_dump(LENGTH * LENGTH);
echo "\n\n";

$_ = microtime(true);
echo 'count recursive:     '; var_dump(count($a, COUNT_RECURSIVE) - count($a));
echo 'time: ', number_format((microtime(true) - $_) / 1000, 6), "\n\n";

$_ = microtime(true);
echo 'sum + map:           '; var_dump(array_sum(array_map('count', $a)));
echo 'time: ', number_format((microtime(true) - $_) / 1000, 6), "\n\n";

$_ = microtime(true);
echo 'sum + map + closure: '; var_dump(array_sum(array_map(function (array $v) { return count($v); }, $a)));
echo 'time: ', number_format((microtime(true) - $_) / 1000, 6), "\n\n";

$_ = microtime(true);
echo 'manual:              ';
$i = 0;
foreach ($a as $aa) {
    $i += count($aa);
}
var_dump($i);
echo 'time: ', number_format((microtime(true) - $_) / 1000, 6), "\n\n";


/**
 * Output:
 *
 *     expected:            int(10000000000)
 *
 *
 *     count recursive:     int(10000000000)
 *     time: 0.012170
 *
 *     sum + map:           int(10000000000)
 *     time: 0.000005
 *
 *     sum + map + closure: int(10000000000)
 *     time: 0.000007
 *
 *     manual:              int(10000000000)
 *     time: 0.000002
 */


Expected result:
----------------
`COUNT_RECURSIVE` should be as fast as or faster than `array_sum` + `array_map`.

`COUNT_RECURSIVE` should be as fast as the “manual” test, or even faster because it's written in C directly.

Actual result:
--------------
`COUNT_RECURSIVE` is way too slow :-).

Patches

Pull Requests

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2019-02-01 14:58 UTC] nikic@php.net
-Status: Open +Status: Not a bug
 [2019-02-01 14:58 UTC] nikic@php.net
COUNT_RECURSIVE also needs to iterate over the inner arrays to check whether they contain arrays that also need to be recursively counted. You are using domain knowledge (that arrays are singly-nested) to avoid that extra recursion step, and can thus do this much more efficiently. It's basically 100000 vs 100000^2 operations.
 [2019-02-01 15:13 UTC] ivan dot enderlin at hoa-project dot net
Indeed.

```
function f(array $a): int {
    $i = count($a);

    foreach ($a as $aa) {
        if (is_array($aa)) {
            $i += f($aa);
        }
    }

    return $i;
}
```

is 10 times slower than `COUNT_RECURSIVE` and does exactly the same.

I guess having a type to represent homogeneous arrays would help a lot here.

Thanks for your answer.
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Sun Oct 06 17:01:27 2024 UTC