|
php.net | support | documentation | report a bug | advanced search | search howto | statistics | random bug | login |
[2007-01-28 00:30 UTC] thuejk at gmail dot com
Description:
------------
I have some code which produces unacceptible performance in a specific situation. Making a completely trivial change improves performance a hundred-fold or more.
The problem is 100% reproducible.
The code looks something like:
/* pseudocode*/
function get_data_from_pgsql(){
...
$map = Array();
foreach ($rows as $row) {
$map = row[index];
}
return $map;
}
$data1 = get_data_from_pgsql
$data2 = get_data_from_pgsql
One run with 10000 rows of result took 8.77 seconds, which is clearly silly. Making the extremely trivial change of moving the code block
foreach ($rows as $row) {
$map = row[index];
}
out of the function get_data_from_pgsql(), the code suddently only takes 0.1 seconds to run (factor 90)! Having more rows in the result makes the factor difference larger; it seems to increase quadratically.
Not saving the return values in $data1 and $data2 also improves performance immensely.
PHP 5.1.5 did not have this problem (or at least it was much smaller).
Reproduce code:
---------------
http://thuejk.dk/test.php.txt
You need a running postgresql database with one table, which has at least 10000 entries.
change the line
$translate_outside_function = false;
at the top of the file to see the difference mentioned above.
PatchesPull RequestsHistoryAllCommentsChangesGit/SVN commits
|
|||||||||||||||||||||||||||||||||||||
Copyright © 2001-2025 The PHP GroupAll rights reserved. |
Last updated: Tue Oct 28 22:00:01 2025 UTC |
Ok, after a good deal of work I actually tracked down the problem to fragmentation in your memory management. This understanding enabled me to construct a simpler test case. Note that the problem is unrelated to database function, and this new testcase does not require any database. <?php //According to my calculations, setting $num=18000000 should make this script take ~one year. $num = 100000; $a = Array(); for ($i=0; $i<$num; $i++) { $a[$i] = Array(1); } /* The underlying memory now looks something like * row structure * row value * row structure * ... */ $b = Array(); for ($i=0; $i<$num; $i++) { $b[$i] = $a[$i][0]; } /* The b rows are allocated * * Though I haven't checked it in the code, I have reason to believe * that the values inserted into the b rows are pointers to the same * memory allocated for the values in the a rows */ unset($a); /* The a rows are unallocated, but the values are still references from $b, so the memory map looks something like * row value * free_block * row_value * free_block * ... * * repeated $num times. */ /* Now, for each memory allocation for a row, PHP runs through all * free blocks checking for a best fit. Since there are $num free * blocks, this takes time. This is done by the function * _zend_mm_alloc_int in PHP 5.2.0 Zend/zend_alloc.c : * * end = &heap->free_buckets[0]; * for (p = end->next_free_block; p != end; p = p->next_free_block) { * size_t s = ZEND_MM_FREE_BLOCK_SIZE(p); * if (s > true_size) { * if (s < best_size) { // better fit * best_fit = p; * best_size = s; * } * } else if (s == true_size) { * // Found "big" free block of exactly the same size * best_fit = p; * goto zend_mm_finished_searching_for_block; * } * } */ $c = Array(); for ($i=0; $i<$num; $i++) { $c[$i] = 1; } ?>The latest PHP snapshot does fix my example, and probably makes my production code work. This will probably fix the problem in far most cases. But it is possible to construct an example which still have the problematic behavior. One example is below. <?php $num = 100000; $a = Array(); for ($i=0; $i<$num; $i++) { $a[$i] = Array(1); } for ($i=0; $i<$num; $i++) { $b[$i] = $a[$i][0]; } unset($a); for ($i=0; $i<$num; $i++) { $b[$i] = "1234567890123456789012345678901234567890123456789012345678901234567\ 8901234567890123456789012345678901234567890123456789012345678901234567890123456\ 7890123456789012345678901234567890123456789012345678901234567890123456789012345\ 6789012345678901234567890123456789012345678901234567890123456789012345678901234\ 5678901234567890"; } ?>I think I am hitting this in practice, or something like it. I have a function call <? function lala() { [database access with lots of data] echo "returning " . time(); return; } lala(); echo "returned ".time(); ?> And I can see that for some reason, the time between "returning" and "returned" is 60 seconds! This only happens the first time this function is called, for some reason. Installing php 5.1.6 it returns instantaneously. I liked thet PHP 5.1 memory allocator better :(. The PHP 5.1 memory allocator was also 1/4 the size of the PHP 5.2 memory allocator.My version is 5.2.1 the lastest with mandake limited edition 2005 and fedora core 4 The probleme exist always with php 5.2.1 !!!!!!!!!!! The latest release ! Probleme looks like : function test(&$b){ //sql Query... //$a_result_mysql //No problem with performance with php 5.2.1 (6 secondes) foreach($a_result_mysql as $i=> $a_row) { $b[artist][item][$i] = $a_row; } //Very very very very very very very very very slow with php 5.2 (180 secondes) //normal with php < 5.2 (10 secondes) foreach($a_result_mysql as $i=> $a_row) { $b[artist][item][$i][name] = $a_row[name]; $b[artist][item][$i][firstname] = $a_row[firstname]; $b[artist][item][$i][address] = $a_row[address]; $b[artist][item][$i][zip] = $a_row[name]; $b[artist][item][$i][color] = $a_row[color]; } } test($b);Real example ($a_output is a reference with 3000 results from mysql) The probleme processing with memory is after these differents solutions Time for processing Solution 1 or 2 is not verry different. The real probleme is after Solution 1 or 2 for all other processing with memory ! $a_input[idlg] = $a_context[idlg]; node::select_node_no_finalsite($a_input,$a_output); foreach ($a_output as $i => $a_row) { /* //Solution 1 : KO because verry verry verry SLOW $a_data[a_formulaire][listenode2][$i][idn] = $a_row[idn]; $a_data[a_formulaire][listenode2][$i][traduction_idn] = $a_row[traduction_idn]; $a_data[a_formulaire][listenode2][$i][traduction_idcat]= $a_row[traduction_idcat]; $a_data[a_formulaire][listenode2][$i][idte_idn] = $a_row[idte_idn]; $a_data[a_formulaire][listenode2][$i][idte_idcat] = $a_row[idte_idcat]; $a_data[a_formulaire][listenode2][$i][idcat] = $a_row[idcat]; $a_data[a_formulaire][listenode2][$i][name] = $a_row[name]; $a_data[a_formulaire][listenode2][$i][idsite] = $a_row[idsite]; */ //Solution 2 : OK :-) 15 milliseconds $a_data[a_formulaire][listenode2] = $a_row; } //Times here depends choice at top lines !!! //Times is 6518 milliseconds if solution 1 KO :-( //Times is 16 milliseconds if solution is 2 OK $instance_time=new time; $instance_time->execute("test","start"); for($i=0;$i<50000;$i++){ $a .= "k"; } //print $a; $instance_time->execute("test","stop"); print $instance_time->show(); Verry stange behavior for only 50000 concats !!!!!!!!!!! It's not normal because no probleme before php < 5.2