php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #72736 Slow performance when fetching large dataset with mysqli / PDO
Submitted: 2016-08-02 14:55 UTC Modified: 2016-10-26 12:29 UTC
Votes:2
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:1 (100.0%)
From: petr dot hrabal at gmail dot com Assigned: dmitry (profile)
Status: Closed Package: *Database Functions
PHP Version: 7.0.9, 7.1.0-dev OS: Debian 8
Private report: No CVE-ID: None
 [2016-08-02 14:55 UTC] petr dot hrabal at gmail dot com
Description:
------------
I have large MySQL query (1.8M rows, 25 columns -  all string between 20 and 256 chars) and I need to make 2 dimensional array from it (memory table based on primary key).

Code works as expected, but $table creation takes a long time in PHP7.0.9 or 7.1.0-dev

same proble apears when PDO is used
more extensive info/diagnostics at 
http://stackoverflow.com/questions/38614982/mysqli-fetch-assoc-performance-php5-4-vs-php7-0

with tested dataset PHP5.x performs 10 times faster

Test script:
---------------
$vysledek = mysqli_query ( $conn,"SELECT * FROM `table` WHERE 1");

while($zaznam = mysqli_fetch_assoc ( $vysledek )){
  $table[$zaznam['prikey']] = $zaznam;
}



Patches

mm-01.diff (last revision 2016-10-24 08:53 UTC by dmitry@php.net)

Add a Patch

Pull Requests

Add a Pull Request

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2016-08-02 16:10 UTC] petr dot hrabal at gmail dot com
-PHP Version: 7.0.9 +PHP Version: 7.0.9, 7.1.0-dev
 [2016-08-02 16:10 UTC] petr dot hrabal at gmail dot com
Expected resaul is that performance should be at least comparable and not dramaticaly worse
 [2016-08-06 15:32 UTC] nikic@php.net
-Status: Open +Status: Verified
 [2016-08-06 15:32 UTC] nikic@php.net
Can confirm this is the case. Some further observations:

* In PHP 7 the time increases quadratically with the number of rows. In PHP 5 it increases linearly.
* PHP 7 goes back to reasonable performance with USE_ZEND_ALLOC=0
* callgrind with --cache-sim suggests that we're seeing a crazy number of LL data read misses in zend_mm_alloc_pages (this line: https://github.com/php/php-src/blob/master/Zend/zend_alloc.c#L856) This being callgrind, I don't know whether this is true

My wild guess: We're going through the list of chunks where the first N are all full until we hit one which isn't. At each chunk we experience an LL miss due to aligned address conflicts.
 [2016-08-06 15:54 UTC] nikic@php.net
-Assigned To: +Assigned To: dmitry
 [2016-08-06 15:54 UTC] nikic@php.net
It's not really necessary to bring mysql into this. Just allocating a bunch of strings without deallocating them in between is enough:

<?php

$n = 50000000;
$s = "aaaaaaaaaaaaaaaaaaaaa";
$a = [];

$t = microtime(true);
for ($i = 0; $i < $n; $i++) {
    $s++;
    $a[] = $s;
}
var_dump(microtime(true) - $t);

This gives me the following results with and without ZMM:

nikic@saturn:~/php-src-fast$ sapi/cli/php -d memory_limit=-1 t043.php 
float(30.064018011093)
nikic@saturn:~/php-src-fast$ USE_ZEND_ALLOC=0 sapi/cli/php -d memory_limit=-1 t043.php 
float(3.7853488922119)

Generally the ZMM is somewhat faster than the system allocator, not 20x slower.

Assigning to dmitry...
 [2016-10-21 18:31 UTC] jim dot hofer at gmail dot com
Just wanted to add this bug also affects Centos 6 & 7 and still present in PHP 7.0.12.

USE_ZEND_ALLOC=0 appears to be a good temporary alternative, but also leads to other weird behavior. Like this bug https://bugs.php.net/bug.php?id=73370
 [2016-10-21 23:21 UTC] nikic@php.net
Could have sworn I posted a more detailed analysis here... I no longer remember the specifics. The issue is that we're doing a naive linear search on the chunk list until we find a chunk with enough free pages. This ends up causing quadratic complexity. I checked how jemalloc handles this, they use an RB-tree to find an appropriate chunk. We should probably do something similar as well -- though it might already help if we just move completely full chunks into a separate list, there's really no point scanning them.
 [2016-10-24 08:53 UTC] dmitry@php.net
The following patch has been added/updated:

Patch Name: mm-01.diff
Revision:   1477299181
URL:        https://bugs.php.net/patch-display.php?bug=72736&patch=mm-01.diff&revision=1477299181
 [2016-10-24 08:56 UTC] dmitry@php.net
The attached patch should fix the problem in PHP-7.0 (without BC breaks).
 [2016-10-26 12:29 UTC] dmitry@php.net
-Status: Verified +Status: Closed
 [2016-10-26 12:29 UTC] dmitry@php.net
The fix for this bug has been committed.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.

 For Windows:

http://windows.php.net/snapshots/
 
Thank you for the report, and for helping us make PHP better.


 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Tue Mar 19 07:01:29 2024 UTC