php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Request #52832 unserialize() performance
Submitted: 2010-09-14 02:46 UTC Modified: 2010-09-20 02:04 UTC
From: galaxy dot mipt at gmail dot com Assigned: kalle (profile)
Status: Closed Package: Performance problem
PHP Version: 5.3.3 OS: Linux
Private report: No CVE-ID: None
Welcome back! If you're the original bug submitter, here's where you can edit the bug or add additional notes.
If you forgot your password, you can retrieve your password here.
Password:
Status:
Package:
Bug Type:
Summary:
From: galaxy dot mipt at gmail dot com
New email:
PHP Version: OS:

 

 [2010-09-14 02:46 UTC] galaxy dot mipt at gmail dot com
Description:
------------
Performance of built-in unserializer degrades at unexpectedly high rate with the increase of unserialized data size (rather, with number of serialized items). Say, unserializing a plain array of ~1000000 integers might take somewhat 10 secs on average P4 machine, and the worst part is that the time raises quadratically (O(n^2)) with the array size, i.e. ~2000000-ish array would take 40 secs or so.

The main performance killer is var_hash linked list where every extracted variable is pushed. It is looked up sequentally from the very beginning up to, in fact, the very end during every push operation (var_push() in ext/standard/var_unserializer.c). It appears that looking from the end (or just storing last used element elsewhere) would save a lot of cycles.

In my tests doing so reduced the unserialize time from 7 secs to ~0.3 sec on 1000000-size array and size dependency apparently changed to something more like O(n*log(n))


Patches

php-5.3-unserialize-performance.patch (last revision 2010-09-14 16:32 UTC by galaxy dot mipt at gmail dot com)

Pull Requests

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2010-09-14 02:58 UTC] cataphract@php.net
> In my tests doing so reduced the unserialize time from 7 secs to ~0.3 sec on 1000000-size array and size dependency apparently changed to something more like O(n*log(n))

Could you submit a patch with that modification and a test script that exemplifies the speedup?
 [2010-09-14 18:36 UTC] galaxy dot mipt at gmail dot com
Added a patch against latest SVN version, did things in a way that required least code modification.

Here goes the test script:

<?php
ini_set('memory_limit', '512M');

$sizes = array(100000, 200000, 500000, 1000000);

foreach($sizes as $N) {

    $data = array();
    for($i=0; $i < $N; $i++) $data[] = mt_rand();

    $timeSerialize = 0;
    $timeUnserialize = 0;

    for($run=0; $run < 10; $run++) {

        $ts = microtime(1);
        $ser = serialize($data);
        $timeSerialize +=  microtime(1) - $ts;

        $ts =  microtime(1);
        $unser = unserialize($ser);
        $timeUnserialize +=  microtime(1) - $ts;

        if (count($data) != count($unser)) print "Error: array sizes mismatch\n";
        for($i=0; $i < $N; $i++)
            if (!isset($unser[$i]) || $data[$i] != $unser[$i])
                print "Error: array elements mismatch\n";

        unset($ser);
        unset($unser);
    }

    print "Size: $N\t\tSerialize: " . (floor(1000*$timeSerialize)) . "ms\t\tUnserialize: " . (floor(1000*$timeUnserialize)) . "ms\n\n";

}
?>


It's a bit memory consuming, so array sizes might need to be reduced depending on available hardware.

My test results:

Original PHP:
Size: 100000            Serialize: 483ms                Unserialize: 470ms

Size: 200000            Serialize: 1047ms               Unserialize: 1308ms

Size: 500000            Serialize: 2638ms               Unserialize: 14360ms

Size: 1000000           Serialize: 6319ms               Unserialize: 72744ms

Patched PHP:
Size: 100000            Serialize: 500ms                Unserialize: 357ms

Size: 200000            Serialize: 870ms                Unserialize: 703ms

Size: 500000            Serialize: 2212ms               Unserialize: 1315ms

Size: 1000000           Serialize: 4898ms               Unserialize: 2823ms
 [2010-09-15 04:46 UTC] kalle@php.net
-Status: Open +Status: Assigned -Assigned To: +Assigned To: kalle
 [2010-09-15 04:46 UTC] kalle@php.net
Hi we cannot merge this into 5.3, as it changes a structure (php_unserialize_data) thats exported to extensions in a type, breaking the ABI. But without a doubt it should go in trunk atleast.
 [2010-09-18 18:09 UTC] kalle@php.net
Implemented in trunk, thanks for your work.
 [2010-09-20 02:04 UTC] kalle@php.net
-Status: Assigned +Status: Closed
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Thu Nov 21 23:01:29 2024 UTC