php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #77556 COUNT_RECURSIVE is suprisingly slow
Submitted: 2019-02-01 14:52 UTC Modified: 2019-02-01 14:58 UTC
From: ivan dot enderlin at hoa-project dot net Assigned:
Status: Not a bug Package: Performance problem
PHP Version: 7.3.1 OS: macOS
Private report: No CVE-ID: None
View Add Comment Developer Edit
Anyone can comment on a bug. Have a simpler test case? Does it work for you on a different platform? Let us know!
Just going to say 'Me too!'? Don't clutter the database with that please !
Your email address:
MUST BE VALID
Solve the problem:
43 - 42 = ?
Subscribe to this entry?

 
 [2019-02-01 14:52 UTC] ivan dot enderlin at hoa-project dot net
Description:
------------
I discovered `COUNT_RECURSIVE` and thought it would speed up my algorithm. But instead, it slowed it down. It initially used `array_sum` + `array_map`. See the test script.

This result is surprising because `array_map` creates a new array, so the memory allocations should be slower than just iterating and counting. It proofs to be true with the last test in the script (see “manual”) where it is faster than all the previous solutions.

Test script:
---------------
<?php

const LENGTH = 100000;

$a = array_fill(0, LENGTH, array_fill(0, LENGTH, 42));

echo 'expected:            '; var_dump(LENGTH * LENGTH);
echo "\n\n";

$_ = microtime(true);
echo 'count recursive:     '; var_dump(count($a, COUNT_RECURSIVE) - count($a));
echo 'time: ', number_format((microtime(true) - $_) / 1000, 6), "\n\n";

$_ = microtime(true);
echo 'sum + map:           '; var_dump(array_sum(array_map('count', $a)));
echo 'time: ', number_format((microtime(true) - $_) / 1000, 6), "\n\n";

$_ = microtime(true);
echo 'sum + map + closure: '; var_dump(array_sum(array_map(function (array $v) { return count($v); }, $a)));
echo 'time: ', number_format((microtime(true) - $_) / 1000, 6), "\n\n";

$_ = microtime(true);
echo 'manual:              ';
$i = 0;
foreach ($a as $aa) {
    $i += count($aa);
}
var_dump($i);
echo 'time: ', number_format((microtime(true) - $_) / 1000, 6), "\n\n";


/**
 * Output:
 *
 *     expected:            int(10000000000)
 *
 *
 *     count recursive:     int(10000000000)
 *     time: 0.012170
 *
 *     sum + map:           int(10000000000)
 *     time: 0.000005
 *
 *     sum + map + closure: int(10000000000)
 *     time: 0.000007
 *
 *     manual:              int(10000000000)
 *     time: 0.000002
 */


Expected result:
----------------
`COUNT_RECURSIVE` should be as fast as or faster than `array_sum` + `array_map`.

`COUNT_RECURSIVE` should be as fast as the “manual” test, or even faster because it's written in C directly.

Actual result:
--------------
`COUNT_RECURSIVE` is way too slow :-).

Patches

Add a Patch

Pull Requests

Add a Pull Request

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2019-02-01 14:58 UTC] nikic@php.net
-Status: Open +Status: Not a bug
 [2019-02-01 14:58 UTC] nikic@php.net
COUNT_RECURSIVE also needs to iterate over the inner arrays to check whether they contain arrays that also need to be recursively counted. You are using domain knowledge (that arrays are singly-nested) to avoid that extra recursion step, and can thus do this much more efficiently. It's basically 100000 vs 100000^2 operations.
 [2019-02-01 15:13 UTC] ivan dot enderlin at hoa-project dot net
Indeed.

```
function f(array $a): int {
    $i = count($a);

    foreach ($a as $aa) {
        if (is_array($aa)) {
            $i += f($aa);
        }
    }

    return $i;
}
```

is 10 times slower than `COUNT_RECURSIVE` and does exactly the same.

I guess having a type to represent homogeneous arrays would help a lot here.

Thanks for your answer.
 
PHP Copyright © 2001-2020 The PHP Group
All rights reserved.
Last updated: Sat Feb 29 04:01:28 2020 UTC