Bug #48643 String functions memory issue (was:Memory not freed with FilterIterator)
Submitted: 2009-06-22 11:41 UTC Modified: 2009-06-24 08:54 UTC
From: zoe@php.net Assigned: dmitry (profile)
Status: Closed Package: Scripting Engine problem
PHP Version: 5.2CVS-2009-06-22 (CVS) OS: Linux (Ubuntu)
Private report: No CVE-ID: None
 [2009-06-22 11:41 UTC] zoe@php.net
Description:
------------
An excessive amount of memory is used when a class that extends FilterIterator filters for certain file types while scanning a source tree.



Reproduce code:
---------------
A small benchmark is provided in the tar file here: http://filebin.ca/okgvtt/memcheck.tar

The file contains three tests which all use code in Util.php.
To run the tests, edit the shell script memcheck to pass the top-level directory of a PHP source tree as input to each test, then execute the shell script.

Each benchmark produces a list of directories which contain .phpt files. Memory usage when the class PhptFilterIterator is used is 20 times higher than with the other two methods.
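
For readers without access to the tarball, a filter of this kind might look roughly like the sketch below. The class body and the helper phptDirectories() are assumptions made for illustration; they are not the code from Util.php, which may differ. The `#[\ReturnTypeWillChange]` line is read as an attribute on PHP 8 and as a plain comment on older versions.

```php
<?php
// Hypothetical reconstruction: the real PhptFilterIterator lives in Util.php
// inside the tarball and is not reproduced here.
class PhptFilterIterator extends FilterIterator
{
    // Keep only regular files whose name ends in ".phpt".
    #[\ReturnTypeWillChange]
    public function accept()
    {
        $file = $this->current();
        return $file->isFile() && substr($file->getFilename(), -5) === '.phpt';
    }
}

// Hypothetical helper: return the sorted list of directories under $root
// that contain at least one .phpt file.
function phptDirectories($root)
{
    $it = new PhptFilterIterator(
        new RecursiveIteratorIterator(
            new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
        )
    );

    $dirs = array();
    foreach ($it as $file) {
        // Use the directory as the array key so each one is recorded once.
        $dirs[$file->getPath()] = true;
    }

    $dirs = array_keys($dirs);
    sort($dirs);
    return $dirs;
}
```

Calling phptDirectories() on the top-level directory of a PHP source tree and printing memory_get_usage() afterwards would give a figure comparable in spirit to the benchmark, though not necessarily the same numbers.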

Expected result:
----------------
Memory usage should be similar for all three methods.

Actual result:
--------------
Memory usage is 20 times higher.

 [2009-06-22 12:32 UTC] zoe@php.net
By the way, I traced the memory consumption using xdebug:

xdebug.auto_trace = on;
xdebug.show_mem_delta = on;

The first few lines are here http://pastebin.ca/1469804

This makes it look more like a bug with the way the accept() method is invoked inside PhptFilterIterator.
 [2009-06-22 14:33 UTC] zoe@php.net
I tested with PHP 5.2 but do not see the same issue. I believe this is specific to PHP 5.3.
 [2009-06-23 12:13 UTC] robin_fernandes at uk dot ibm dot com
Below is a simplified testcase which I think exposes the same leak.
'/manyFiles' is a directory containing 10000 files.

<?php
$rii = new RecursiveIteratorIterator(
          new RecursiveDirectoryIterator('/manyFiles')
       );

function noop() {}

echo "noop call in loop (no leak)       : ";
foreach ($rii as $v) { noop($v); }
echo memory_get_usage() . PHP_EOL;

echo "strlen call in loop (leak on 53)  : ";
foreach ($rii as $v) { strlen($v); }
echo memory_get_usage() . PHP_EOL;
?>


Output on php52:
noop call in loop (no leak)       : 66176
strlen call in loop (leak on 53)  : 66176

Output on php53:
noop call in loop (no leak)       : 337448
strlen call in loop (leak on 53)  : 6028496
 [2009-06-23 13:02 UTC] zoe@php.net
Looking backwards through PHP builds, I have so far narrowed this down to a change that went into PHP 5.3 somewhere between 12th June 2008 and 1st July 2008.
 [2009-06-23 14:13 UTC] robin_fernandes at uk dot ibm dot com
Testcase below shows that the issue relates to an implicit cast to string on an SplFileInfo object when retrieved from a RecursiveDirectoryIterator during iteration.

<?php
$rdi = new RecursiveDirectoryIterator('/manyFiles');

echo "SplFileInfo explicit cast to string: ";
foreach ($rdi as $v) { strlen((string)$v); } //OK
echo memory_get_usage(true) . PHP_EOL;

echo "SplFileInfo implicit cast to string: ";
foreach ($rdi as $v) { strlen($v); } //Leaky
echo memory_get_usage(true) . PHP_EOL;
?>

php52:
SplFileInfo explicit cast to string: 262144
SplFileInfo implicit cast to string: 262144

php53:
SplFileInfo explicit cast to string: 524288
SplFileInfo implicit cast to string: 6291456
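
The two testcases above print cumulative memory_get_usage() values, so the second loop's figure includes whatever the first loop left behind. A minimal sketch of measuring each loop's growth separately is given below. PathLike, memoryGrowth(), explicitCast() and implicitCast() are hypothetical names introduced for illustration, and PathLike is only a stand-in for SplFileInfo: the report does not establish whether a user-defined class with __toString() shows the same growth on an affected build.

```php
<?php
// Hypothetical stand-in for SplFileInfo: any object with __toString().
class PathLike
{
    private $path;

    public function __construct($path)
    {
        $this->path = $path;
    }

    public function __toString()
    {
        return $this->path;
    }
}

// Call $body once per item and return how much memory_get_usage() grew.
function memoryGrowth($items, $body)
{
    $before = memory_get_usage();
    foreach ($items as $item) {
        call_user_func($body, $item);
    }
    return memory_get_usage() - $before;
}

// The cast happens in user code before strlen() sees the argument.
function explicitCast($v)
{
    return strlen((string) $v);
}

// The cast happens inside the engine's parameter parsing for strlen().
function implicitCast($v)
{
    return strlen($v);
}

$items = array();
for ($i = 0; $i < 10000; $i++) {
    $items[] = new PathLike("/manyFiles/file$i");
}

echo "explicit cast growth: " . memoryGrowth($items, 'explicitCast') . PHP_EOL;
echo "implicit cast growth: " . memoryGrowth($items, 'implicitCast') . PHP_EOL;
```

To test the original scenario, replace $items with a RecursiveDirectoryIterator over a large directory, as in the testcase above.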
 [2009-06-23 15:39 UTC] zoe@php.net
A checkout of PHP53 from the 25th June 2008 does *not* have the problem

A checkout of PHP53 from the 26th June 2008 *does* have the problem
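
Narrowing a regression to a single day like this is a bisection over checkout dates. A rough sketch of the search follows; firstBadDate() and the $isBad callback are hypothetical, and in practice the callback would check out and build PHP for the given date and run one of the testcases above.

```php
<?php
// Hypothetical helper: $dates is sorted ascending, the first entry is known
// to be good and the last entry is known to be bad. Returns the earliest
// date for which $isBad reports the problem.
function firstBadDate($dates, $isBad)
{
    $lo = 0;                    // invariant: $dates[$lo] is good
    $hi = count($dates) - 1;    // invariant: $dates[$hi] is bad

    while ($hi - $lo > 1) {
        $mid = (int) (($lo + $hi) / 2);
        if (call_user_func($isBad, $dates[$mid])) {
            $hi = $mid;         // problem already present: look earlier
        } else {
            $lo = $mid;         // still fine: look later
        }
    }

    return $dates[$hi];
}
```

Each call to $isBad costs a full checkout and build, so halving the range at every step keeps the number of builds small.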
 [2009-06-23 16:43 UTC] zoe@php.net
Here are the files that changed between those dates (I have taken out things that look irrelevant like test files)

RCS file: /repository/ZendEngine2/zend_builtin_functions.c,v
date: 2008/06/25 22:37:14;  author: felipe;  state: Exp;  lines: +2 -2
date: 2008/06/25 22:35:31;  author: felipe;  state: Exp;  lines: +1 -2

RCS file: /repository/php-src/ext/reflection/php_reflection.c,v
date: 2008/06/25 12:34:14;  author: dmitry;  state: Exp;  lines: +152 -2
date: 2008/06/25 12:33:46;  author: dmitry;  state: Exp;  lines: +128 -2

RCS file: /repository/php-src/ext/standard/formatted_print.c,v
date: 2008/06/25 10:16:52;  author: davidc;  state: Exp;  lines: +21 -25
date: 2008/06/25 08:56:42;  author: davidc;  state: Exp;  lines: +12 -4

RCS file: /repository/php-src/ext/standard/string.c,v
date: 2008/06/25 12:16:16;  author: ohill;  state: Exp;  lines: +251 -328

RCS file: /repository/php-src/main/main.c,v
date: 2008/06/25 12:18:51;  author: dmitry;  state: Exp;  lines: +3 -1
date: 2008/06/25 12:18:21;  author: dmitry;  state: Exp;  lines: +3 -1
date: 2008/06/25 12:18:36;  author: dmitry;  state: Exp;  lines: +3 -1

RCS file: /repository/php-src/main/php_ticks.c,v
date: 2008/06/25 12:18:51;  author: dmitry;  state: Exp;  lines: +6 -1
date: 2008/06/25 12:18:22;  author: dmitry;  state: Exp;  lines: +6 -1
date: 2008/06/25 12:18:36;  author: dmitry;  state: Exp;  lines: +6 -1

RCS file: /repository/php-src/main/php_ticks.h,v
date: 2008/06/25 12:18:51;  author: dmitry;  state: Exp;  lines: +2 -1
date: 2008/06/25 12:18:22;  author: dmitry;  state: Exp;  lines: +2 -1


The most likely candidates seem to be string.c and zend_builtin_functions.c, and it looks as though changes to parameter parsing cause the problem. At this point I think we have done as much as we can to narrow this down. I'm changing the package to Scripting Engine problem, as it clearly isn't an SPL issue.
 [2009-06-23 19:56 UTC] zoe@php.net
Hi dmitry - please would you look at this?
 [2009-06-23 21:35 UTC] johannes@php.net
The following patch fixes the issue. It might not be the nicest way, and some other code in there probably needs a similar fix.

Index: Zend/zend_API.c
===================================================================
RCS file: /repository/ZendEngine2/zend_API.c,v
retrieving revision 1.296.2.27.2.34.2.64
diff -u -p -r1.296.2.27.2.34.2.64 zend_API.c
--- Zend/zend_API.c     4 Jun 2009 18:20:42 -0000       1.296.2.27.2.34.2.64
+++ Zend/zend_API.c     23 Jun 2009 21:33:04 -0000
@@ -254,10 +254,13 @@ ZEND_API int zend_get_object_classname(c
 static int parse_arg_object_to_string(zval **arg, char **p, int *pl, int type TSRMLS_DC) /* {{{ */
 {
        if (Z_OBJ_HANDLER_PP(arg, cast_object)) {
+               zval tmp;
+               INIT_PZVAL(&tmp);
                SEPARATE_ZVAL_IF_NOT_REF(arg);
-               if (Z_OBJ_HANDLER_PP(arg, cast_object)(*arg, *arg, type TSRMLS_CC) == SUCCESS) {
-                       *pl = Z_STRLEN_PP(arg);
-                       *p = Z_STRVAL_PP(arg);
+               if (Z_OBJ_HANDLER_PP(arg, cast_object)(*arg, &tmp, type TSRMLS_CC) == SUCCESS) {
+                       *pl = Z_STRLEN(tmp);
+                       *p = Z_STRVAL(tmp);
+                       zval_dtor(&tmp);
                        return SUCCESS;
                }
        }

 [2009-06-24 08:54 UTC] dmitry@php.net
This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.


 