Bug #37168 WDDX serializer inefficient with larger structures
Submitted: 2006-04-22 17:35 UTC Modified: 2007-08-10 19:36 UTC
From: dolecek at sky dot cz Assigned:
Status: Closed Package: WDDX related
PHP Version: 5.2.1-dev OS: Windows, NetBSD
Private report: No CVE-ID: None
 [2006-04-22 17:35 UTC] dolecek at sky dot cz
Description:
------------
We use WDDX for persistent storage on a project; the production system runs on MS Windows. The structure is typically a small array of several associative arrays. We recently noticed that bigger arrays take significantly more time to serialize than small ones. Doubling the size of the array resulted in about a 5-10 times increase in time, with an even bigger increase as the size grew.

The problem boils down to the smart string API, which WDDX uses to store the serialization results.

With default sizes, the initial smart string size is 76 bytes and the buffer grows only by 128+1 bytes. When the buffer grows beyond trivial sizes, smart_string_appendl() et al. start being dominated by the time spent reallocating the buffer, i.e. realloc() and the associated memory-to-memory copies and CPU cache thrashing.

I've tried two strategies to alleviate the problem.

1. increase the buffer grow size to 4192 bytes
2. enforce power-of-2 size of the buffer, and always
   at least double the size of buffer

Many malloc()/realloc() implementations optimize for power-of-2 sizes, and doubling also ensures that the total number of realloc() calls and the associated memory thrashing is minimized.

1) helps a lot, but the time increase is still not proportional to the array size increase.

2) fixes the problem; the time increase is almost exactly proportional to the size increase.
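
The difference between the two strategies can be illustrated with a small cost model. This is only a sketch under a pessimistic assumption: every realloc() moves the block and copies the whole old buffer, and the buffer is always filled completely before it grows. The names `growth_cost`, `simulate_fixed` and `simulate_doubling` are hypothetical and not part of PHP.

```c
#include <assert.h>
#include <stddef.h>

/* Cost of growing a buffer from its start size up to a target size. */
struct growth_cost {
    size_t reallocs;     /* number of realloc() calls */
    size_t bytes_copied; /* assumed worst case: each call copies the old buffer */
};

/* Strategy 1: grow by a fixed step each time the buffer is full. */
struct growth_cost simulate_fixed(size_t start, size_t step, size_t target)
{
    struct growth_cost cost = {0, 0};
    size_t cap = start;

    assert(start > 0 && step > 0);
    while (cap < target) {
        cost.bytes_copied += cap; /* old contents move to the new block */
        cap += step;
        cost.reallocs++;
    }
    return cost;
}

/* Strategy 2: double the capacity each time the buffer is full. */
struct growth_cost simulate_doubling(size_t start, size_t target)
{
    struct growth_cost cost = {0, 0};
    size_t cap = start;

    assert(start > 0);
    while (cap < target) {
        cost.bytes_copied += cap;
        cap += cap;
        cost.reallocs++;
    }
    return cost;
}
```

In this model, growing from 64 to 1024 bytes in steps of 64 takes 15 reallocations and copies 7680 bytes, while doubling takes 4 reallocations and copies 960 bytes. The gap widens as the target size grows.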

Note: the same problem has been observed for the standard serializer, i.e. serialize(). From a brief look at the source, the cause seems to be the same as for WDDX.

Patch for 1):

--- ext/wddx/wddx.c.orig        2006-04-22 19:27:19.000000000 +0200
+++ ext/wddx/wddx.c
@@ -22,6 +22,8 @@
 
 #if HAVE_WDDX
 
+#define SMART_STR_PREALLOC     4192
+
 #include "ext/xml/expat_compat.h"
 #include "php_wddx.h"
 #include "php_wddx_api.h"

Patch for 2):
--- ext/wddx/wddx.c.orig        2006-01-01 13:50:16.000000000 +0100
+++ ext/wddx/wddx.c
@@ -38,2 +44,21 @@

+#undef SMART_STR_DO_REALLOC
+#define SMART_STR_DO_REALLOC(d, what) \
+       (d)->c = SMART_STR_REALLOC((d)->c, (d)->a, (what))
+
+#undef smart_str_alloc4
+#define smart_str_alloc4(d, n, what, newlen) do {                                      \
+       if (!(d)->c) {                                                                                                  \
+               (d)->len = 0;                                                                                           \
+               (d)->a = SMART_STR_START_SIZE; \
+       } \
+\
+               newlen = (d)->len + (n);                                                                        \
+               if (newlen >= (d)->a || !(d)->c) {                                                                              \
+                       while((d)->a < newlen+1)                                                                \
+                                       (d)->a += (d)->a; \
+                       SMART_STR_DO_REALLOC(d, what);                                                  \
+               }                                                                                                                       \
+} while (0)
+
 #define WDDX_BUF_LEN                   256


Reproduce code:
---------------
<?php

$item = array();

for($i=0; $i < 20; $i++)
        $item['item'.$i] = 'content '.$i;

$sample = array(50, 100, 200, 400, 800, 1600, 3200, 6400);

foreach($sample as $cnt) {
        // build the sample array to be serialized
        $var = array();
        for($i=0; $i < $cnt; $i++) {
                $var[] = $item;
        }

        $st = microtime(true);
        wddx_serialize_value($var);
        echo "$cnt: ".(microtime(true)-$st)."\n";
}


Expected result:
----------------
Expected is a linear increase in time.

Result with patch 1):

50: 0.0034401416778564
100: 0.0071568489074707
200: 0.020088911056519
400: 0.10020208358765
800: 0.56227111816406
1600: 2.8473780155182
3200: 11.858402013779
6400: 48.511730909348

Result with patch 2):
50: 0.0030708312988281
100: 0.0057220458984375
200: 0.013122081756592
400: 0.03044605255127
800: 0.06494402885437
1600: 0.13331294059753
3200: 0.26938605308533
6400: 0.53914594650269



Actual result:
--------------
This is the initial (without patch) result:

50: 0.011401891708374
100: 0.052657127380371
200: 0.2442729473114
400: 2.2745268344879
800: 16.260624885559
1600: 86.947965145111

(3200 and 6400 skipped because the run took too long)
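
Since each sample doubles the array size, a quick way to read these tables is to look at the ratio between consecutive timings: a linear serializer should give ratios near 2. The following is a minimal sketch of that arithmetic; the helper `doubling_ratios` is hypothetical, and the two arrays simply copy the unpatched and patch 2) timings from this report.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fill ratios[i] with t[i + 1] / t[i] for consecutive timings, where each
 * sample has twice the input size of the previous one.
 * Returns the number of ratios written (n - 1, or 0 if n < 2).
 */
size_t doubling_ratios(const double *t, size_t n, double *ratios)
{
    size_t i;

    if (n < 2) {
        return 0;
    }
    for (i = 0; i + 1 < n; i++) {
        assert(t[i] > 0.0);
        ratios[i] = t[i + 1] / t[i];
    }
    return n - 1;
}

/* Timings in seconds from the report, for sizes 50, 100, 200, ... */
const double unpatched[] = {
    0.011401891708374, 0.052657127380371, 0.2442729473114,
    2.2745268344879, 16.260624885559, 86.947965145111
};
const double patch2[] = {
    0.0030708312988281, 0.0057220458984375, 0.013122081756592,
    0.03044605255127, 0.06494402885437, 0.13331294059753,
    0.26938605308533, 0.53914594650269
};
```

Applied to the numbers above, the unpatched ratios range from roughly 4.6 to 9.3, while the patch 2) ratios all stay between about 1.9 and 2.3.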

 [2006-04-22 17:38 UTC] dolecek at sky dot cz
The OS was set to NetBSD for this bug report because the tests were run on NetBSD.
 [2006-04-22 17:49 UTC] dolecek at sky dot cz
Oh, and patch 2) should also include the #define SMART_STR_PREALLOC     4192, so that the initial size is power-of-2 (the test was run with that part included).
 [2006-04-22 17:55 UTC] tony2001@php.net
The patch is definitely wrong, as this is just a hack overriding functions in this particular case only.
Fix the original functions instead (if there are any problems).

 [2006-04-22 18:42 UTC] dolecek at sky dot cz
It's a minimal-intrusion patch, designed to pinpoint and demonstrate the problem. Feel free to integrate it in whatever way is best for the PHP project.
 [2006-04-23 16:36 UTC] iliaa@php.net
Please try using this CVS snapshot:

  http://snaps.php.net/php5.1-latest.tar.gz
 
For Windows:
 
  http://snaps.php.net/win32/php5.1-win32-latest.zip

I've tried your sample code and I see only linear increase in 
the execution time. And certainly at no point did it take 
seconds to execute each block.
 [2006-04-23 21:29 UTC] dolecek at sky dot cz
I tried the snapshot php5.1-200604232030 - results:

NetBSD 3.99.15 (compiled with ./configure --enable-wddx --with-libxml-dir=/usr/pkg):
50: 0.010740995407104
100: 0.053707838058472
200: 0.23849892616272
400: 2.4622600078583
800: 18.556435823441
1600: 94.584519863129

(3200 and 6400 skipped)


Tried also Windows version on Windows XP (same computer):
50: 0.0027320384979248
100: 0.0066189765930176
200: 0.024183988571167
400: 0.096118927001953
800: 0.57651996612549
1600: 2.4123661518097
3200: 8.9470641613007
6400: 34.234342098236

So the PHP snapshot is even worse on NetBSD 3.99.15 than PHP 5.1.2. On Windows XP the growth is smoother, but the time increase is still not proportional to the size increase (2x size means ~4x time increase).
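
A ~4x increase per doubling is what a fixed growth increment would predict if every realloc() ends up copying the buffer. As a rough worked estimate (an assumption-laden model, not a measurement): let $k$ be the growth increment, $s$ the start size, $N$ the final buffer size, and $C(N)$ the total number of bytes copied.

```latex
\[
C_{\text{fixed}}(N) \approx \sum_{i=1}^{N/k} i\,k
  = \frac{k}{2}\cdot\frac{N}{k}\left(\frac{N}{k}+1\right)
  \approx \frac{N^{2}}{2k},
\qquad
\frac{C_{\text{fixed}}(2N)}{C_{\text{fixed}}(N)} \approx 4
\]
\[
C_{\text{double}}(N) = \sum_{j=0}^{m-1} s\,2^{j} = s\,(2^{m}-1) < 2N,
\qquad m = \left\lceil \log_{2}\frac{N}{s} \right\rceil
\]
```

With a fixed increment the copy cost is quadratic in the final size, so twice the data costs about four times as much. With doubling the total copy cost stays below $2N$, which is linear.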

On which OS did you run the test? Older BSD malloc() was power-of-2 internally and thus would not suffer from this usage pattern. The more modern version used on NetBSD has been adjusted to minimize total memory use, which causes the realloc() performance hit in this case. Perhaps modern GNU libc malloc()/realloc() handles this usage pattern better, and that is the reason you don't see the problem?

BTW, if patch 2) or a variant of it is adopted, it would be useful to use a suitable power-of-2 value for SMART_STR_START_SIZE (to get a multiple of the page size and optimal memory use of the last page), rather than changing SMART_STR_PREALLOC (which is unused in my patch).
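
As a standalone illustration of that idea, here is a minimal sketch of a growable buffer with a power-of-2 start size and doubling growth. It is not the PHP smart string implementation: `struct buf`, `buf_appendl` and `BUF_START_SIZE` are hypothetical names, the value 4096 is an assumed start size, and the field names only mirror the `c`, `len` and `a` members used in the patch. Overflow checks are omitted for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUF_START_SIZE 4096 /* hypothetical power-of-2 start size */

struct buf {
    char *c;    /* data, NUL-terminated after the first append */
    size_t len; /* bytes used, excluding the terminator */
    size_t a;   /* bytes allocated */
};

/*
 * Append n bytes from src, doubling the allocation until the data fits.
 * The buffer must start out as {NULL, 0, 0}.
 * Returns 0 on success, -1 if realloc() fails (buffer left unchanged).
 */
int buf_appendl(struct buf *d, const char *src, size_t n)
{
    size_t need = d->len + n + 1; /* +1 for the trailing NUL */
    size_t a = d->c ? d->a : BUF_START_SIZE;

    assert(d->c || d->len == 0);
    while (a < need) {
        a += a; /* capacity stays a power of 2 */
    }
    if (!d->c || a != d->a) {
        char *p = realloc(d->c, a);
        if (!p) {
            return -1;
        }
        d->c = p;
        d->a = a;
    }
    memcpy(d->c + d->len, src, n);
    d->len += n;
    d->c[d->len] = '\0';
    return 0;
}
```

Starting from `struct buf b = {NULL, 0, 0};`, the first append allocates 4096 bytes, and an append that pushes the required size past that jumps straight to 8192. The caller releases the memory with `free(b.c)`.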
 [2006-04-23 21:32 UTC] dolecek at sky dot cz
Also, it would be useful to know whether my patch improves performance on your system (even though the time increase is linear even without my patch on your system). Could you post results with and without the patch on your system, please?
 [2006-05-02 13:36 UTC] iliaa@php.net
I tried your patch on linux 2.6 and MacOSX 10.3.4 and in both 
cases had no visible difference (beyond the margin of error) 
in terms of speed and time taken to execute the code.
 [2006-05-02 19:33 UTC] dolecek at sky dot cz
Fine, so the difference is entirely dependent on the platform's realloc() implementation. Can anyone try the patch on MS Windows? I don't have a native compiler there. Thanks.
 [2006-05-23 22:07 UTC] dolecek at sky dot cz
Changing the OS to Windows, NetBSD.
 [2006-05-27 09:28 UTC] dolecek at sky dot cz
I figured out that I used a malloc() debug option when I ran the test on NetBSD (/etc/malloc.conf -> 'J'), which caused each realloc() call to explicitly malloc() a new piece of memory and free the old one. With that disabled, the WDDX serializer result is much better on NetBSD (this is without the patch, run with PHP 5.1.4) and the benchmark can actually complete:

wddx_serialize_value() (without /etc/malloc.conf -> J):
0050:  0.0027
0100:  0.0058
0200:  0.0152
0400:  0.0724
0800:  0.3820
1600:  1.6633
3200:  6.9623
6400: 28.6273

For comparison, this is the result with standard serialize():
0050:  0.0016
0100:  0.0032
0200:  0.0092
0400:  0.0275
0800:  0.1174
1600:  0.5613
3200:  2.3775
6400:  9.8372

So, the WDDX serializer still shows strong non-linear behaviour.

wddx_serialize_value() with patch (and without /etc/malloc.conf->J):
0050:  0.0026
0100:  0.0048
0200:  0.0104
0400:  0.0227
0800:  0.0476
1600:  0.0933
3200:  0.1914
6400:  0.3859

serialize() result after applying similar patch to ext/standard/var.c:
0050:  0.0015
0100:  0.0028
0200:  0.0061
0400:  0.0151
0800:  0.0320
1600:  0.0670
3200:  0.1374
6400:  0.2744
 [2006-11-13 21:40 UTC] iliaa@php.net
Please try using this CVS snapshot:

  http://snaps.php.net/php5.2-latest.tar.gz
 
For Windows:
 
  http://snaps.php.net/win32/php5.2-win32-latest.zip

There have been major changes in the memory allocator for 
Win32 which should help with the performance.
 [2006-11-15 22:14 UTC] dolecek at sky dot cz
The result with 5.2.1-dev is somewhat erratic and actually worse than before:

50: 0.0067908763885498
100: 0.013392925262451
200: 0.027522087097168
400: 0.055379152297974
800: 1.2351222038269
1600: 0.25271010398865
3200: 0.54318714141846
6400: 57.312628030777

I've run the test several times with the same results. It is especially strange that the '800' test is consistently so much slower than the 1600 test, and this repeats on every run.

It seems Windows is actually a lot worse with the new allocator than with what was there before, or perhaps the new allocator doesn't handle the realloc() calls generated by the smart string macros too well. It would be interesting to see how much difference my patch would make on Windows.
 [2007-08-10 19:36 UTC] dolecek at sky dot cz
The WDDX serializer in PHP 5.2.3 appears to show a ~linear time increase on Windows XP. Thanks!
 