Doc Bug #35258 preg_replace() doesn't seem to release the memory used
Submitted: 2005-11-17 12:22 UTC Modified: 2006-02-07 18:31 UTC
From: mfischer@php.net Assigned:
Status: Closed Package: Documentation problem
PHP Version: 4.4.1 OS: Linux
Private report: No CVE-ID: None
 [2005-11-17 12:22 UTC] mfischer@php.net
Description:
------------
preg_replace() doesn't seem to release the memory used to do its operation.

Simple example using preg_replace():

php -r 'var_dump(memory_get_usage()); $input = str_pad("", 50000, "foo"); var_dump(memory_get_usage()); $input = preg_replace(";foo;", "", $input); var_dump(memory_get_usage());'

int(15216)
int(65320)
int(115352)


Simple example using ereg_replace():

php -r 'var_dump(memory_get_usage()); $input = str_pad("", 50000, "foo"); var_dump(memory_get_usage()); $input = ereg_replace("foo", "", $input); var_dump(memory_get_usage());'

int(15208)
int(65312)
int(15312)

Maybe this is just a documentation issue, but then it should be noted. The documentation says that preg_ is faster, which it is, and it mentions a regex cache somewhere, but it doesn't say that memory usage grows with the size of the input string.

It's an interesting behaviour which should be considered when memory is an issue.


History

 [2005-11-17 13:59 UTC] tony2001@php.net
Memory usage doesn't depend on the length of the data.
Try this code:
<?php
var_dump(memory_get_usage()); 
$input = str_pad("", 50000, "foo");
var_dump(memory_get_usage()); 
$input = preg_replace(";foo;", "", $input); 
unset($input); // <------------ !!
var_dump(memory_get_usage());
?>
 [2005-12-13 09:53 UTC] mfischer@php.net
Your answer is interesting. If I take your example and insert another var_dump() before the unset() command, like this:

<?php
        var_dump(memory_get_usage());
        $input = str_pad("", 50000, "foo");
        var_dump(memory_get_usage());
        $input = preg_replace(";foo;", "", $input);
        var_dump(memory_get_usage());
        unset($input); // <------------ !!
        var_dump(memory_get_usage());


I get these numbers:

$ php test.php
int(15496)
int(65600)
int(115632)
int(15624)

However, there is no benefit in unsetting the return value of preg_replace(); if I didn't need the result, I could remove the call altogether.

But maybe the example was bad. Suppose, for example, that I want to replace 'foo' with 'bar':

<?php
        var_dump(memory_get_usage());
        $input = str_pad("", 50000, "foo");
        var_dump(memory_get_usage());
        $input = preg_replace(";foo;", "bar", $input);
        var_dump(memory_get_usage());


$ php test.php
int(15208)
int(65312)
int(115352)

If I want to further work with the value I have no way of saving memory (I can't really unset the value when I need it).

I also don't quite understand the statement "memory usage doesn't depend on the length of the data". The numbers are certainly different when I use small strings.
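
A minimal sketch for checking how the figures change with input size, assuming PHP 5.4 or later (short array syntax). The lengths 500 and 5000 are illustrative choices, not from the report, and the actual numbers will depend on the PHP version and build:

```php
<?php
// Illustrative sizes: 50000 matches the report, the others are assumptions.
$lengths = [500, 5000, 50000];
$deltas = [];

foreach ($lengths as $length) {
    $before = memory_get_usage();
    $input = str_pad("", $length, "foo");
    $input = preg_replace(";foo;", "bar", $input);
    // Measured while $input still holds the result, as in the examples above.
    $deltas[$length] = memory_get_usage() - $before;
    unset($input); // release the string before the next round
}

foreach ($deltas as $length => $delta) {
    printf("length %6d: %+d bytes\n", $length, $delta);
}
```

If the reported difference grows with the length, the memory in use at that point does depend on the size of the data being held.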


 [2005-12-13 10:45 UTC] sniper@php.net
This is the same issue as all the other "PHP does not release memory" reports. The reserved memory is never released while the script runs. It's released during shutdown.
 [2005-12-13 11:45 UTC] mfischer@php.net
No doubt. However, the ereg* family of functions does not consume memory the way the preg* family does. So this actually seems to be a documentation problem; re-categorizing.
 [2005-12-13 20:01 UTC] nlopess@php.net
The pcre extension API (used by the preg_*() functions, the new filter extension, ...) caches compiled regexes, but since PHP xx (I don't remember the version.. :) ), that cache has a limit.
So memory usage won't grow indefinitely; it will "just" cache 4096 regexes (by default).

I don't think this should be documented. PHP users don't really want to know about internal stuff.
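
To illustrate the idea of a size-limited cache described above, here is a rough userland sketch, assuming PHP 7 or later. The class `BoundedPatternCache`, its method names, and the drop-the-oldest eviction policy are all hypothetical; this is not the pcre extension's code, and the extension may evict entries differently. Only the default limit of 4096 comes from the comment above:

```php
<?php
// Hypothetical illustration of a bounded cache keyed by pattern string.
final class BoundedPatternCache
{
    /** @var array<string, bool> pattern => marker, oldest entry first */
    private $entries = [];
    /** @var int maximum number of cached patterns */
    private $limit;

    public function __construct(int $limit = 4096)
    {
        $this->limit = $limit;
    }

    // Returns true on a cache hit; on a miss, stores the pattern and returns false.
    public function remember(string $pattern): bool
    {
        if (isset($this->entries[$pattern])) {
            return true;
        }
        if (count($this->entries) >= $this->limit) {
            // Assumed eviction policy: drop the oldest entry.
            reset($this->entries);
            unset($this->entries[key($this->entries)]);
        }
        // Stand-in for a compiled regex; a real cache would keep the compiled form.
        $this->entries[$pattern] = true;
        return false;
    }

    public function size(): int
    {
        return count($this->entries);
    }
}

$cache = new BoundedPatternCache(2);
$cache->remember(';foo;');
$cache->remember(';bar;');
$cache->remember(';baz;'); // the cache is full, so ';foo;' is dropped
printf("cached patterns: %d\n", $cache->size()); // prints "cached patterns: 2"
```

The point of the limit is that the cache's footprint is bounded by the number of distinct patterns, not by how often or on how much data they are used.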
 [2006-02-05 21:20 UTC] mfischer@php.net
I think PHP users definitely want to know about this. Why would they not? It is important to know that the preg_* functions cache the regular expression while the ereg* functions don't, because it means that preg_* consumes more memory than ereg* in the long term, which, as I have seen in real life, became an issue. When I ran into these spontaneous memory problems I consulted the preg_* manual page, but it wasn't mentioned anywhere, and I had to find it out myself, which took some time.

Why not save other people's time by documenting it?
 [2006-02-07 18:31 UTC] nlopess@php.net
This bug has been fixed in the documentation's XML sources. Since the
online and downloadable versions of the documentation need some time
to get updated, we would like to ask you to be a bit patient.

Thank you for the report, and for helping us make our documentation better.


 [2020-02-07 06:11 UTC] phpdocbot@php.net
Automatic comment on behalf of nlopess
Revision: http://git.php.net/?p=doc/en.git;a=commit;h=3534833478dcbc7aeaa8e7469c67255dd9ad90df
Log: fix #35258: document the compiled regex cache
 