Bug #11018 ereg_replace takes a preposterous amount of time w.r.t. PHP3
Submitted: 2001-05-22 09:45 UTC Modified: 2001-06-04 04:54 UTC
From: vigna at dsi dot unimi dot it Assigned:
Status: Closed Package: Performance problem
PHP Version: 4.0.4pl1 OS: Linux
Private report: No CVE-ID: None
 [2001-05-22 09:45 UTC] vigna at dsi dot unimi dot it
My publication page
(http://gongolo.usr.dsi.unimi.it/~vigna/papers) is generated
with PHP. Part of the generation process includes the
following regular expression substitutions:

$bib = ereg_replace("([0-9]+)--([0-9]+)", "\\1-\\2",$bib); 
$bib = ereg_replace("\{((['`\]|\w)+)\}", "\\1", $bib);

$bib is a variable containing about 20K of text generated by
a BibTeX style. Everything worked fine with PHP 3, but since
I installed PHP 4, the page takes >20 seconds to output.
Indeed, there were 10 other substitutions, and initially
the page simply wouldn't display because of a server timeout.
I changed ereg_replace to str_replace wherever possible, but
of course the two substitutions above need regular
expressions. They increase the page serving time by about
15s. Computation time was negligible with PHP 3.

The strange thing is that the page is _served_ slowly. One
would expect a long wait, and then the page to be
served all at once. Instead, there is a long wait, and
then the page is served slowly. Eliminating the two regular
expressions above solves the problem (but serves the wrong
page 8^).

 [2001-05-22 09:50 UTC] derick@php.net
I suggest using preg_replace and trying to split up the $bib variable into smaller parts (e.g. per line).
Can you see if this works better for you?

Derick
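
A minimal sketch of the suggestion above, assuming the first substitution from the report (the second is left out because its backslashes were lost in submission). The sample text in $bib is hypothetical and only stands in for the BibTeX-generated input:

```php
<?php
// Hypothetical sample standing in for the BibTeX-generated text in $bib.
$bib = "Pages 101--110.\nPages 7--9 and 12--15.\n";

// Split the text into lines, as suggested, and process each line separately.
$lines = explode("\n", $bib);
foreach ($lines as $i => $line) {
    // PCRE form of the first substitution: collapse "--" between two
    // numbers into a single "-". $1 and $2 are the captured numbers.
    $lines[$i] = preg_replace('/([0-9]+)--([0-9]+)/', '$1-$2', $line);
}
$bib = implode("\n", $lines);

echo $bib;
```

preg_replace also accepts the whole string at once, so the per-line split is optional; it is shown here only because it was part of the suggestion.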
 [2001-05-22 10:02 UTC] brianlmoon@php.net
What is that second regex doing?  It looks like there is a missing \ in there.

Just for testing, have you tried using preg_replace instead of ereg?  The preg functions are faster and more reliable.
 [2001-05-22 10:12 UTC] vigna at dsi dot unimi dot it
> What is that second regex doing?  It looks like there is 
missing \\ in there.

There are backslashes missing everywhere, apparently eaten
by the submission form.

I tried with preg_replace() and everything works fine now.

However, I would like to hear an official comment about
this: if this is the "normal" status for the ereg_* family,
I'll stop using it. Performance is simply unacceptable.

I am a bit uncomfortable about using Perl stuff. POSIX is a
standard, Perl is not. But if the word is "drop ereg", it's
OK anyway.
 [2001-06-04 04:54 UTC] sniper@php.net
This is the normal status and it most likely won't change 
in the future; the ereg_*() functions are slower than the preg_*() functions.

--Jani
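
For anyone who wants to check the preg_*() side on their own installation, here is a rough timing sketch. The input is hypothetical: about 20K of text built by repeating one made-up bibliography entry, not the reporter's actual data, and the measured time will vary by machine and PHP version:

```php
<?php
// Hypothetical input: roughly 20K of text built by repeating one sample entry.
$entry   = "Some Author. Some title. Journal, 12(3):101--110, 2001.\n";
$repeats = (int) ceil(20000 / strlen($entry));
$bib     = str_repeat($entry, $repeats);

// Time the PCRE version of the first substitution from the report.
$start   = microtime(true);
$out     = preg_replace('/([0-9]+)--([0-9]+)/', '$1-$2', $bib, -1, $count);
$elapsed = microtime(true) - $start;

// Report how many replacements were made and how long they took.
printf("%d replacements over %d bytes in %.4f s\n", $count, strlen($bib), $elapsed);
```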

 