Bug #32806 fgets() function is slow
Submitted: 2005-04-23 23:11 UTC Modified: 2005-11-11 01:00 UTC
Votes:16
Avg. Score:4.4 ± 0.9
Reproduced:12 of 12 (100.0%)
Same Version:3 (25.0%)
Same OS:9 (75.0%)
From: peoned at yahoo dot com Assigned: wez (profile)
Status: No Feedback Package: Performance problem
PHP Version: 5.0.4 OS: Linux
Private report: No CVE-ID: None
 [2005-04-23 23:11 UTC] peoned at yahoo dot com
Description:
------------
fgets() is too slow in PHP. It is a lot slower than in Perl or C, the languages I compared it to. I read a 20 MB file and wrote it back out, line by line, in PHP, Perl, and C. Here are my results:
C: 0.938s, 0.949s, 0.945s, 0.943s
Perl: 4.946s, 2.123s, 2.119s, 2.158s
PHP: 15.606s, 11.637s, 11.675s, 11.260s

I ran the tests on two computers, with fairly similar results. Another person from a forum, whom I asked about fgets(), measured approximately 6 seconds on Windows and 7 seconds on Linux for a 15 MB file.

Replacing fgets() with fread($fin, 1024) gives these results:
0.835s, 0.797s, 0.812s, 0.836s

So the problem is with fgets(). Perl is slower than C because C is compiled and Perl is interpreted, but there is no reason why PHP should be that much slower than Perl. And fgets() should be slower than fread(), but not by that much.
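The fgets() versus fread() comparison above could be reproduced with something like the following. This is only a sketch: the helper names copy_by_line() and copy_by_chunk(), the temporary files, and the generated sample data are assumptions for illustration, not the script or input file used for the timings in this report. Only the copy loops are timed, so PHP startup is excluded.

```php
<?php
// Hypothetical helpers for illustration; the names are not from the report.

// Copy $in to $out one line at a time with fgets(); returns loop time in seconds.
function copy_by_line(string $in, string $out): float
{
    $fin = fopen($in, "rb");
    $fout = fopen($out, "wb");
    $start = microtime(true);
    while (($line = fgets($fin)) !== false) {
        fwrite($fout, $line);
    }
    $elapsed = microtime(true) - $start;
    fclose($fin);
    fclose($fout);
    return $elapsed;
}

// Copy $in to $out in fixed-size chunks with fread(); returns loop time in seconds.
function copy_by_chunk(string $in, string $out, int $size = 1024): float
{
    $fin = fopen($in, "rb");
    $fout = fopen($out, "wb");
    $start = microtime(true);
    while (($chunk = fread($fin, $size)) !== false && $chunk !== "") {
        fwrite($fout, $chunk);
    }
    $elapsed = microtime(true) - $start;
    fclose($fin);
    fclose($fout);
    return $elapsed;
}

// Assumed sample data: many short lines written to a temporary file.
$src = tempnam(sys_get_temp_dir(), "src");
$dstLine = tempnam(sys_get_temp_dir(), "ln");
$dstChunk = tempnam(sys_get_temp_dir(), "ch");
file_put_contents($src, str_repeat("a short line of sample text\n", 200000));

printf("fgets: %.3fs\n", copy_by_line($src, $dstLine));
printf("fread: %.3fs\n", copy_by_chunk($src, $dstChunk));

// Both copies should be byte-identical to the source.
var_dump(md5_file($src) === md5_file($dstLine));
var_dump(md5_file($src) === md5_file($dstChunk));

array_map('unlink', [$src, $dstLine, $dstChunk]);
```

The absolute numbers will depend on the PHP version, the disk cache, and the line lengths, so the ratio between the two loops is the more useful figure.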
 

Reproduce code:
---------------
    parse_p("in.txt", "out.txt");


    function parse_p($in_file, $out_file)
    {
        $fin = fopen($in_file, "rb");
        $fout = fopen($out_file, "wb");

        while(!feof($fin))
        {
            $line = fgets($fin);
            fwrite($fout, $line);
        }
        fclose($fin);
        fclose($fout);
    }

Expected result:
----------------
I expect it to be comparable to Perl or C in speed.

Actual result:
--------------
It was much slower than Perl or C.

History

 [2005-04-23 23:59 UTC] sniper@php.net
Can you provide the Perl code you used?
Also, if you have magic_quotes_runtime ini option set to off, it'll be faster.

 [2005-04-24 00:13 UTC] peoned at yahoo dot com
Perl code:

#!/usr/local/bin/perl

open(IN, "<afe199406");
open(OUT, ">perl_out.txt");

while(<IN>)
{
    print OUT $_;
}

close(OUT);
close(IN);


C code:

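#define _POSIX_C_SOURCE 200809L /* needed for getline() */
#include <stdio.h>
#include <stdlib.h> /* malloc(), free() */
#include <string.h>

int main(void)
{

    size_t n = 5000;
    char *ptr;

    FILE *fp;
    FILE *fi;
    fp = fopen("int.txt", "rb");
    fi = fopen("out.txt", "wb");
    if (fp == NULL || fi == NULL)
        return 1;

    /* initial buffer; getline() grows it if a line is longer */
    ptr = (char *)malloc(5000);

    while(getline(&ptr,&n,fp) != -1)
    {
        fwrite(ptr, 1, strlen(ptr), fi);
    }

    free(ptr);

    fclose(fp);
    fclose(fi);
    return 0;
}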


magic_quotes_runtime was off
 [2005-04-24 18:25 UTC] iliaa@php.net
Performance is equivalent when you simplify your PHP script and stop timing PHP's startup costs. Also make sure that automatic detection of line endings is disabled.
 [2005-04-24 20:20 UTC] peoned at yahoo dot com
I don't agree that this is a bogus bug. You want to tell me that the startup cost is responsible for 8 seconds? Then you have a performance bug with your startup cost. Run it with fread($fin, 1024); where did the startup cost go? Simplifying the script to while($line = fgets($fin)){} doesn't help either. And automatic detection of line endings is disabled. Did you run some of your own tests?
 [2005-04-28 01:58 UTC] iliaa@php.net
Startup costs could be 8 seconds or more depending on the extensions you are loading. That said, PHP's fgets() is still slower than Perl's because its implementation does not wrap the C library fgets() or getline(), but uses custom code instead.
 [2005-04-28 04:37 UTC] wez@php.net
How long are the lines in your file?
 [2005-04-28 07:30 UTC] peoned at yahoo dot com
I did an `echo strlen($line).",";`. Here are the lengths of some of the first few lines:
42,7,18,11,71,13,8,7,42,7,20,11,73,70,68,63,13,8,7

It looks pretty much like this for the rest of the file. Lines between 1 and 100 chars in length.

Note: It is definitely not the startup cost, because I measured the time around just the while loop and got the same results.
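For anyone who wants to repeat that measurement, here is a rough sketch of timing only the read loop while also collecting the line lengths. The helper name profile_fgets() and the tiny sample file are assumptions made for illustration; this is not the script used above. hrtime() requires PHP 7.3 or later, and the bookkeeping inside the loop adds a little overhead of its own.

```php
<?php
// Hypothetical helper: read $path line by line and time only the loop,
// so PHP startup and extension loading are excluded from the result.
function profile_fgets(string $path): array
{
    $fin = fopen($path, "rb");
    $lines = 0;
    $bytes = 0;
    $minLen = PHP_INT_MAX;
    $maxLen = 0;

    $start = hrtime(true); // monotonic clock, in nanoseconds
    while (($line = fgets($fin)) !== false) {
        $len = strlen($line); // length includes the trailing newline
        $lines++;
        $bytes += $len;
        $minLen = min($minLen, $len);
        $maxLen = max($maxLen, $len);
    }
    $seconds = (hrtime(true) - $start) / 1e9;
    fclose($fin);

    return [
        'lines'   => $lines,
        'min_len' => $lines > 0 ? $minLen : 0,
        'max_len' => $maxLen,
        'avg_len' => $lines > 0 ? $bytes / $lines : 0.0,
        'seconds' => $seconds,
    ];
}

// Usage with an assumed sample file; substitute the real input path.
$path = tempnam(sys_get_temp_dir(), "prof");
file_put_contents($path, "short\na somewhat longer line\nx\n");
$stats = profile_fgets($path);
printf(
    "%d lines, length %d to %d (avg %.1f), loop took %.6fs\n",
    $stats['lines'],
    $stats['min_len'],
    $stats['max_len'],
    $stats['avg_len'],
    $stats['seconds']
);
unlink($path);
```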
 [2005-05-02 14:25 UTC] wez@php.net
Can you try your tests with the fwrite() line commented out?
 [2005-05-04 23:24 UTC] peoned at yahoo dot com
Without fwrite() it is significantly faster: about 5-6 seconds. But Perl without print OUT $_; takes around 1-2 seconds. So PHP is still slower because of fgets().
 [2005-11-03 22:45 UTC] sniper@php.net
Please try using this CVS snapshot:

  http://snaps.php.net/php5-latest.tar.gz
 
For Windows:
 
  http://snaps.php.net/win32/php5-win32-latest.zip


 [2005-11-11 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
 [2007-05-10 18:43 UTC] scottij at arbor dot net
I'm getting similar behavior with PHP 5.2.1.

I am reading about 5,000 lines over a TCP socket connected to another program on localhost (NOT over a network). Initially, it takes about 500 µs per line of text (the lines are < 20 characters long). Here is the code snippet and output:

while (/* some eof and timeout checks here */) {
    $rstart = microtime(TRUE);
    $str = fgets($this->dataSocket, 8096);
    $rstop = microtime(TRUE);
    $rdiff = $rstop - $rstart;
    print("read took $rdiff secs.<br>\n");
    print("str = $str<br>\n");
}

This yields, for example:

str = 16549|Item2250||3|2|
read took 0.00049185752868652 secs.
str = 16550|Item2251||3|2| 
read took 0.00049495697021484 secs.
str = 16551|Blob2252||3|2| 
read took 0.00049018859863281 secs.

I run that over a full dump of my text (5000 lines).

If I then close the socket, open a new one, and do the same operation, about halfway through the fgets() times start increasing dramatically:

str = 16645|Item2346||3|2| 
read took 0.0019731521606445 secs.
str = 16646|Item2347||3|2| 
read took 0.0019690990447998 secs.
str = 16647|Item2348||3|2| 
read took 0.0020229816436768 secs.

2 ms to read each short line??  Again, this is just over a local socket, nothing over the network.

This is reproducible every time on my system.

Thanks.
 