php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #72736 Slow performance when fetching large dataset with mysqli / PDO
Submitted: 2016-08-02 14:55 UTC Modified: 2016-10-26 12:29 UTC
Votes:2
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:1 (100.0%)
From: petr dot hrabal at gmail dot com Assigned: dmitry (profile)
Status: Closed Package: *Database Functions
PHP Version: 7.0.9, 7.1.0-dev OS: Debian 8
Private report: No CVE-ID: None
Welcome back! If you're the original bug submitter, here's where you can edit the bug or add additional notes.
If you forgot your password, you can retrieve your password here.
Password:
Status:
Package:
Bug Type:
Summary:
From: petr dot hrabal at gmail dot com
New email:
PHP Version: OS:

 

 [2016-08-02 14:55 UTC] petr dot hrabal at gmail dot com
Description:
------------
I have large MySQL query (1.8M rows, 25 columns -  all string between 20 and 256 chars) and I need to make 2 dimensional array from it (memory table based on primary key).

Code works as expected, but $table creation takes a long time in PHP7.0.9 or 7.1.0-dev

same proble apears when PDO is used
more extensive info/diagnostics at 
http://stackoverflow.com/questions/38614982/mysqli-fetch-assoc-performance-php5-4-vs-php7-0

with tested dataset PHP5.x performs 10 times faster

Test script:
---------------
$vysledek = mysqli_query ( $conn,"SELECT * FROM `table` WHERE 1");

while($zaznam = mysqli_fetch_assoc ( $vysledek )){
  $table[$zaznam['prikey']] = $zaznam;
}



Patches

mm-01.diff (last revision 2016-10-24 08:53 UTC by dmitry@php.net)

Pull Requests

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2016-08-02 16:10 UTC] petr dot hrabal at gmail dot com
-PHP Version: 7.0.9 +PHP Version: 7.0.9, 7.1.0-dev
 [2016-08-02 16:10 UTC] petr dot hrabal at gmail dot com
Expected resaul is that performance should be at least comparable and not dramaticaly worse
 [2016-08-06 15:32 UTC] nikic@php.net
-Status: Open +Status: Verified
 [2016-08-06 15:32 UTC] nikic@php.net
Can confirm this is the case. Some further observations:

* In PHP 7 the time increases quadratically with the number of rows. In PHP 5 it increases linearly.
* PHP 7 goes back to reasonable performance with USE_ZEND_ALLOC=0
* callgrind with --cache-sim suggests that we're seeing a crazy number of LL data read misses in zend_mm_alloc_pages (this line: https://github.com/php/php-src/blob/master/Zend/zend_alloc.c#L856) This being callgrind, I don't know whether this is true

My wild guess: We're going through the list of chunks where the first N are all full until we hit one which isn't. At each chunk we experience an LL miss due to aligned address conflicts.
 [2016-08-06 15:54 UTC] nikic@php.net
-Assigned To: +Assigned To: dmitry
 [2016-08-06 15:54 UTC] nikic@php.net
It's not really necessary to bring mysql into this. Just allocating a bunch of strings without deallocating them in between is enough:

<?php

$n = 50000000;
$s = "aaaaaaaaaaaaaaaaaaaaa";
$a = [];

$t = microtime(true);
for ($i = 0; $i < $n; $i++) {
    $s++;
    $a[] = $s;
}
var_dump(microtime(true) - $t);

This gives me the following results with and without ZMM:

nikic@saturn:~/php-src-fast$ sapi/cli/php -d memory_limit=-1 t043.php 
float(30.064018011093)
nikic@saturn:~/php-src-fast$ USE_ZEND_ALLOC=0 sapi/cli/php -d memory_limit=-1 t043.php 
float(3.7853488922119)

Generally the ZMM is somewhat faster than the system allocator, not 20x slower.

Assigning to dmitry...
 [2016-10-21 18:31 UTC] jim dot hofer at gmail dot com
Just wanted to add this bug also affects Centos 6 & 7 and still present in PHP 7.0.12.

USE_ZEND_ALLOC=0 appears to be a good temporary alternative, but also leads to other weird behavior. Like this bug https://bugs.php.net/bug.php?id=73370
 [2016-10-21 23:21 UTC] nikic@php.net
Could have sworn I posted a more detailed analysis here... I no longer remember the specifics. The issue is that we're doing a naive linear search on the chunk list until we find a chunk with enough free pages. This ends up causing quadratic complexity. I checked how jemalloc handles this, they use an RB-tree to find an appropriate chunk. We should probably do something similar as well -- though it might already help if we just move completely full chunks into a separate list, there's really no point scanning them.
 [2016-10-24 08:53 UTC] dmitry@php.net
The following patch has been added/updated:

Patch Name: mm-01.diff
Revision:   1477299181
URL:        https://bugs.php.net/patch-display.php?bug=72736&patch=mm-01.diff&revision=1477299181
 [2016-10-24 08:56 UTC] dmitry@php.net
The attached patch should fix the problem in PHP-7.0 (without BC breaks).
 [2016-10-26 12:29 UTC] dmitry@php.net
-Status: Verified +Status: Closed
 [2016-10-26 12:29 UTC] dmitry@php.net
The fix for this bug has been committed.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.

 For Windows:

http://windows.php.net/snapshots/
 
Thank you for the report, and for helping us make PHP better.


 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Thu Nov 21 13:01:29 2024 UTC