Bug #32806 fgets() function is slow
Submitted: 2005-04-23 23:11 UTC
Modified: 2005-11-11 01:00 UTC
Votes: 16
Avg. Score: 4.4 ± 0.9
Reproduced: 12 of 12 (100.0%)
Same Version: 3 (25.0%)
Same OS: 9 (75.0%)
From: peoned at yahoo dot com
Assigned: wez
Status: No Feedback
Package: Performance problem
PHP Version: 5.0.4
OS: Linux
Private report: No
CVE-ID: None
 [2005-04-23 23:11 UTC] peoned at yahoo dot com
Description:
------------
fgets() is too slow in PHP. It is much slower than the equivalent line-by-line read in Perl or C, the two languages I compared it to. I copied a 20 MB file line by line in PHP, Perl, and C. Here are my results (four runs each):
C: 0.938s, 0.949s, 0.945s, 0.943s
Perl: 4.946s, 2.123s, 2.119s, 2.158s
PHP: 15.606s, 11.637s, 11.675s, 11.260s

I ran the tests on 2 computers, with fairly similar results. Another person from a forum, whom I asked about fgets(), measured approximately 6 seconds on Windows and 7 seconds on Linux for a 15 MB file.

Replacing fgets() with fread($fin, 1024) gives these results:
0.835s, 0.797s, 0.812s, 0.836s

So the problem is with fgets(). Perl is slower than C because C is compiled and Perl is interpreted, but there is no reason why PHP should be that much slower than Perl. fgets() can be expected to be slower than fread(), but not by that much.
 

Reproduce code:
---------------
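    parse_p("in.txt", "out.txt");


    // Copy $in_file to $out_file one line at a time.
    function parse_p($in_file, $out_file)
    {
        $fin = fopen($in_file, "rb");
        $fout = fopen($out_file, "wb");

        // Loop on the fgets() result: with while(!feof($fin)) the final
        // call can return false, which would then be passed to fwrite().
        while (($line = fgets($fin)) !== false)
        {
            fwrite($fout, $line);
        }
        fclose($fin);
        fclose($fout);
    }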

Expected result:
----------------
I expect it to be comparable to Perl or C in speed

Actual result:
--------------
It was much slower than Perl or C
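
Editor's sketch: the fgets() versus fread($fin, 1024) comparison from the description could be reproduced roughly as follows. The helper names copy_by_line() and copy_by_chunk() are hypothetical and not part of the original report, and the type declarations assume PHP 7 or later rather than the 5.0.4 release the bug was filed against. Each helper times only its copy loop.

```php
<?php
// Hypothetical helpers for illustration; not taken from the original report.

/** Copy $in to $out one line at a time with fgets(); returns elapsed seconds. */
function copy_by_line(string $in, string $out): float
{
    $fin = fopen($in, 'rb');
    $fout = fopen($out, 'wb');
    $start = microtime(true);
    while (($line = fgets($fin)) !== false) {
        fwrite($fout, $line);
    }
    $elapsed = microtime(true) - $start;
    fclose($fin);
    fclose($fout);
    return $elapsed;
}

/** Copy $in to $out in fixed-size chunks with fread(); returns elapsed seconds. */
function copy_by_chunk(string $in, string $out, int $size = 1024): float
{
    $fin = fopen($in, 'rb');
    $fout = fopen($out, 'wb');
    $start = microtime(true);
    while (!feof($fin)) {
        $chunk = fread($fin, $size);
        if ($chunk === false) {
            break; // read error
        }
        fwrite($fout, $chunk);
    }
    $elapsed = microtime(true) - $start;
    fclose($fin);
    fclose($fout);
    return $elapsed;
}
```

Calling both helpers on the same input file and comparing the returned times gives the kind of numbers quoted in the description. The absolute values will depend on the PHP version and the machine.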

 [2005-04-23 23:59 UTC] sniper@php.net
Can you provide the perl code you used..?
Also, if you have magic_quotes_runtime ini option set to off, it'll be faster.

 [2005-04-24 00:13 UTC] peoned at yahoo dot com
Perl code:

#!/usr/local/bin/perl

open(IN, "<afe199406");
open(OUT, ">perl_out.txt");

while(<IN>)
{
    print OUT $_;
}

close(OUT);
close(IN);


C code:

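#define _GNU_SOURCE     /* exposes getline() on older glibc */
#include <stdio.h>
#include <stdlib.h>     /* malloc(), free() */
#include <sys/types.h>  /* ssize_t */

int main(void)
{

    size_t n = 5000;
    char *ptr;
    ssize_t len;

    FILE *fp;
    FILE *fi;
    fp = fopen("int.txt", "rb");
    fi = fopen("out.txt", "wb");
    if (fp == NULL || fi == NULL)
        return 1;

    /* getline() may realloc() this buffer and update n */
    ptr = (char *)malloc(n);

    /* getline() returns the line length, so strlen() is not needed */
    while ((len = getline(&ptr, &n, fp)) != -1)
    {
        fwrite(ptr, 1, (size_t)len, fi);
    }

    free(ptr);

    fclose(fp);
    fclose(fi);
    return 0;
}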


magic_quotes_runtime was off
 [2005-04-24 18:25 UTC] iliaa@php.net
Performance is equivalent when you simplify your PHP script and stop timing php's start-up costs. Also make sure that automatic detection of new lines is disabled.
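
Editor's sketch: one way to keep PHP's start-up cost out of the measurement is to time only the read loop from inside the script. The function names below are hypothetical, and it is an assumption that "automatic detection of new lines" refers to the auto_detect_line_endings ini setting. The code assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: times only the fgets() loop, so interpreter start-up
// and extension loading are excluded from the measurement.
function time_fgets_loop(string $path): array
{
    $fin = fopen($path, 'rb');
    $lines = 0;
    $start = microtime(true);
    while (fgets($fin) !== false) {
        $lines++;
    }
    $elapsed = microtime(true) - $start;
    fclose($fin);
    return ['lines' => $lines, 'seconds' => $elapsed];
}

// Assumption: the setting meant above is auto_detect_line_endings.
// ini_get() returns a string such as "0" or "1", or false if the setting
// is not registered in the running PHP version; all "off" cases map to false.
function line_ending_detection_enabled(): bool
{
    return (bool) ini_get('auto_detect_line_endings');
}
```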
 [2005-04-24 20:20 UTC] peoned at yahoo dot com
I don't agree that this is a bogus bug. Are you telling me that the start-up cost is responsible for 8 seconds? Then you have a performance bug in your start-up cost. Run it with fread($fin, 1024): where did the start-up cost go? Simplifying the script to while($line = fgets($fin)){} doesn't help either. And automatic detection of new lines is disabled. Did you run some tests of your own?
 [2005-04-28 01:58 UTC] iliaa@php.net
Startup costs could be 8 seconds or more depending on the extensions you are loading. That said, PHP's fgets() is still slower than Perl's because its implementation does not wrap the C library fgets() or getline(), but rather uses custom code.
 [2005-04-28 04:37 UTC] wez@php.net
How long are the lines in your file?
 [2005-04-28 07:30 UTC] peoned at yahoo dot com
I did an 
`echo strlen($line).",";` 
Here are the lengths for some of the first few lines:
42,7,18,11,71,13,8,7,42,7,20,11,73,70,68,63,13,8,7

It looks pretty much like this for the rest of the file. Lines between 1 and 100 chars in length.

Note: It isn't the start up cost for sure because I measured the time just around the while loop with the same results.
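
Editor's sketch: rather than echoing every strlen(), the line lengths could be summarised in one pass. The helper name line_length_stats() is hypothetical. As with the echo above, each length includes the trailing newline. The code assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: summarises line lengths in a file.
// Lengths include the trailing newline, matching strlen($line) on fgets() output.
function line_length_stats(string $path): array
{
    $fin = fopen($path, 'rb');
    $count = 0;
    $total = 0;
    $min = PHP_INT_MAX;
    $max = 0;
    while (($line = fgets($fin)) !== false) {
        $len = strlen($line);
        $count++;
        $total += $len;
        $min = min($min, $len);
        $max = max($max, $len);
    }
    fclose($fin);
    return [
        'count' => $count,
        'min'   => $count > 0 ? $min : 0,
        'max'   => $max,
        'mean'  => $count > 0 ? $total / $count : 0.0,
    ];
}
```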
 [2005-05-02 14:25 UTC] wez@php.net
Can you try your tests with the fwrite() line commented out?
 [2005-05-04 23:24 UTC] peoned at yahoo dot com
Without fwrite() it is significantly faster, taking about 5-6 seconds. But Perl without print OUT $_; takes around 1-2 seconds. So PHP is still slower because of fgets().
 [2005-11-03 22:45 UTC] sniper@php.net
Please try using this CVS snapshot:

  http://snaps.php.net/php5-latest.tar.gz
 
For Windows:
 
  http://snaps.php.net/win32/php5-win32-latest.zip


 [2005-11-11 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
 [2007-05-10 18:43 UTC] scottij at arbor dot net
I'm getting similar behavior with php 5.2.1.

I am reading about 5,000 lines over a tcp socket to another program on the localhost (NOT over a network).  Initially, it takes about 500 us per line of text (where the lines are < 20 characters long).  Here is the code snippet and output:

while (/* some eof and timeout checks here */) {
    $rstart = microtime(TRUE);
    $str = fgets($this->dataSocket, 8096);
    $rstop = microtime(TRUE);
    $rdiff = $rstop - $rstart;
    print("read took $rdiff secs.<br>\n");
    print("str = $str<br>\n");
}

This yields, for example:

str = 16549|Item2250||3|2|
read took 0.00049185752868652 secs.
str = 16550|Item2251||3|2| 
read took 0.00049495697021484 secs.
str = 16551|Blob2252||3|2| 
read took 0.00049018859863281 secs.

I run that over a full dump of my text (5000 lines).

If I then close the socket, open a new one, and do the same operation, about halfway through the fgets() times start increasing dramatically:

str = 16645|Item2346||3|2| 
read took 0.0019731521606445 secs.
str = 16646|Item2347||3|2| 
read took 0.0019690990447998 secs.
str = 16647|Item2348||3|2| 
read took 0.0020229816436768 secs.

2 ms to read each short line??  Again, this is just over a local socket, nothing over the network.

This is reproducible every time on my system.
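
Editor's sketch: to see where in the stream the per-line cost starts to rise, the individual fgets() timings could be averaged in fixed-size blocks instead of printed one by one. The function name fgets_block_averages() and the block size are hypothetical. The 8096 length is taken from the snippet above, and the eof and timeout checks elided there are not reproduced here.

```php
<?php
// Hypothetical helper: average fgets() latency per block of $blockSize lines.
// $stream is any readable stream resource (a socket, file, or php://memory).
function fgets_block_averages($stream, int $blockSize = 1000, int $maxLength = 8096): array
{
    $averages = [];
    $sum = 0.0;
    $n = 0;
    while (true) {
        $start = microtime(true);
        $line = fgets($stream, $maxLength);
        $elapsed = microtime(true) - $start;
        if ($line === false) {
            break; // EOF or read error; on a socket this can also follow a timeout
        }
        $sum += $elapsed;
        $n++;
        if ($n === $blockSize) {
            $averages[] = $sum / $n;
            $sum = 0.0;
            $n = 0;
        }
    }
    if ($n > 0) {
        $averages[] = $sum / $n; // final partial block
    }
    return $averages;
}
```

A sharp rise between consecutive block averages would show roughly where the slowdown described above begins.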

Thanks.
 
PHP Copyright © 2001-2022 The PHP Group
All rights reserved.
Last updated: Tue Oct 04 04:05:53 2022 UTC