php.net |  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Bug #81551 SplFileObject::fgets( does not advance line pointer after SplFileObject::seek()
Submitted: 2021-10-21 19:34 UTC Modified: 2021-10-26 09:38 UTC
Votes:6
Avg. Score:4.5 ± 0.8
Reproduced:5 of 5 (100.0%)
Same Version:5 (100.0%)
Same OS:4 (80.0%)
From: lucasfbustamante at gmail dot com Assigned: cmb (profile)
Status: Not a bug Package: SPL related
PHP Version: 8.0.12 OS: Unix
Private report: No CVE-ID: None
View Add Comment Developer Edit
Welcome! If you don't have a Git account, you can't do anything here.
You can add a comment by following this link or if you reported this bug, you can edit this bug over here.
(description)
Block user comment
Status: Assign to:
Package:
Bug Type:
Summary:
From: lucasfbustamante at gmail dot com
New email:
PHP Version: OS:

 

 [2021-10-21 19:34 UTC] lucasfbustamante at gmail dot com
Description:
------------
While fixing https://bugs.php.net/bug.php?id=62004, PHP 8.0.1 probably introduced a bug, where after using `SplFileObject::seek($line)`, the first subsequent call to `SplFileObject::fgets()` does not increase the line pointer.

Test script:
---------------
$file = new \SplTempFileObject();

for ($i = 0; $i < 100; $i++) {
    $file->fwrite("Foo\n");
}

$file->seek(50);

var_dump($file->key());
var_dump($file->fgets());
var_dump($file->key());
var_dump($file->fgets());
var_dump($file->key());

// https://3v4l.org/bX3E0

Expected result:
----------------
I expect that using `SplFileObject::fgets()` will consistently increment the line pointer by one.

Actual result:
--------------
The first call to `SplFileObject::fgets()` after calling `SplFileObject::seek($line)` will not increment the lint pointer. Subsequent calls will.

Patches

Add a Patch

Pull Requests

Add a Pull Request

History

AllCommentsChangesGit/SVN commitsRelated reports
 [2021-10-21 20:35 UTC] cmb@php.net
-Assigned To: +Assigned To: cmb
 [2021-10-21 20:37 UTC] requinix@php.net
-Status: Assigned +Status: Feedback -Package: Filesystem function related +Package: SPL related -Assigned To: cmb +Assigned To:
 [2021-10-21 20:37 UTC] requinix@php.net
Looks correct to me. Change the lines to get a better picture of what's happening.
https://3v4l.org/33lsI

key() returns the current line number, which should be the same at the "beginning" of the line (eg, after a seek or rewind) as it is at the "end" of the line (after an fgets). The line number increments according to iterator semantics, which say that the key/current doesn't change until a call to next() - or, in this case, if you try to continue reading beyond the current location using another call to fgets.

Before the bug fix:
- Rewinding returns key=0, fgets=line 0, key=0
- Seeking to 50 would return key=50, fgets=line 51, key=51

After the bug fix:
- Rewinding returns key=0, fgets=line 0, key=0
- Seeking to 50 returns key=50, fgets=line 50, key=50
 [2021-10-21 20:38 UTC] requinix@php.net
-Assigned To: +Assigned To: cmb
 [2021-10-21 21:45 UTC] lucasfbustamante at gmail dot com
I understand what's happening now, and the reasoning behind the change.

However, this makes it very difficult for distributed libraries to have a comprehensive PHP version constraint. If you use the SPLFileObject and depend on `seek()` and `key()`, you either have to pick PHP 8 or PHP 5~7, because they have the same API but operates differently.

Taking the WordPress environment for instance (I am a WordPress plugin developer, my distributed library is a WordPress plugin), PHP 8.0 is used by only 1.5% of the websites to this date: https://wordpress.org/about/stats/

I dislike the fact that PHP broke backwards compatibility for a function that exists and returns this same result since PHP 5.3, especially considering that this was introduced in a minor patch version bump, so PHP 8.0.0 will operate in a substantially different way than, say, PHP 8.0.1 or 8.1.
 [2021-10-21 22:21 UTC] lucasfbustamante at gmail dot com
I see the consistency this bugfix aims to provide between `rewind` (or starting from zero) and `seek`: https://3v4l.org/HBTeF

It still escapes me why "Line" is 0 at "Foo 1", but either way, it's consistent with the old implementation so it won't cause bugs in my application.

I can replace all my usage of `fgets()` with `current()` and `next()` instead, so that `key()` will remain consistent after `rewind()` or `seek()` with all PHP versions: https://3v4l.org/WYrig
 [2021-10-22 12:33 UTC] lucasfbustamante at gmail dot com
Is this expected? https://3v4l.org/s4ZKA

<?php

$file = new SplTempFileObject();

for ($i = 0; $i < 100; $i++) {
    $file->fwrite("Foo $i\n");
}

$file->seek(50);

echo json_encode(array(
    array('triggerNext' => $file->next(), 'line' => $file->key(), 'contents' => trim($file->current())),
    array('triggerNext' => $file->next(), 'line' => $file->key(), 'contents' => trim($file->current())),
    array('triggerNext' => $file->next(), 'line' => $file->key(), 'contents' => trim($file->current())),
), JSON_PRETTY_PRINT);
?>

Results:

PHP 8.0.1+
[
    {
        "line": 51,
        "contents": "Foo 50"
    },
    {
        "line": 52,
        "contents": "Foo 51"
    },
    {
        "line": 53,
        "contents": "Foo 52"
    }
]

PHP 5.1 - 8.0.0:

[
    {
        "line": 51,
        "contents": "Foo 51"
    },
    {
        "line": 52,
        "contents": "Foo 52"
    },
    {
        "line": 53,
        "contents": "Foo 53"
    }
]
 [2021-10-22 14:47 UTC] requinix@php.net
I don't believe that is correct.

It's a bit tricky because key/current/next/etc. are almost always called through iteration with foreach and never manually, and when calling them manually you're expected to follow a standard pattern:

for ($object->rewind(); $object->valid(); $object->next()) {
  $key = $object->key();
  $value = $object->current();
  // ...
}

or in other words,

1. rewind
2. valid
3. key/current
4. next
5. goto 2

However it should not matter if you call key/current before the first next because those are supposed to be read-only operations, yet here it does matter:
https://3v4l.org/BVMsV vs https://3v4l.org/s1G2E


Note that it is incorrect to call methods like next in the same statement as key/current because PHP does not guarantee that it will evaluate all operands in a statement in a particular order (though it virtually always does go left to right). That means with a line like

  array('triggerNext' => $file->next(), 'line' => $file->key(), 'contents' => trim($file->current())),

you can't actually be sure what methods will be called in what order, and having next before or after key/current makes a difference. The same problem exists with expressions like ($i++) + (++$i).
https://en.wikipedia.org/wiki/Sequence_point
 [2021-10-23 00:08 UTC] lucasfbustamante at gmail dot com
Indeed, the code snippets make much more sense in a foreach loop. I have re-written them in this format to make it clearer.

The whole picture starts to make sense. I will recap some of the obvious parts for someone from the outside that might be reading this.

If the iteration is starting from the beginning, all scenarios works consistently: https://3v4l.org/9BYb8

If the iteration is starting after a `seek`, PHP 8.0.1 fixed an issue where fgets() would start from the next line, but seems have introduced an inconsistent behavior if next() is called before current(): https://3v4l.org/UpN1G

In my personal opinion, I think it wasn't a good idea to fix the fgets bug, as it breaks many applications. I am here because I run a database importing script that uses seek() and fgets(), and I was getting a duplicated primary key error because 8.0.1 fetches one row above what 8.0.0 would.

I personally think it would have been best to document the behavior of seek + fgets returning the next line, and leave the behavior unchanged, considering it part of the behavior of SplFileObject, I am not sure if we are past the point of rolling back this fix.

Either way, if the behavior of next() then current() was not expected, I believe we should aim to make it consistent with the previous PHP versions, as to minimize impact on applications.
 [2021-10-23 13:09 UTC] rene at wp-staging dot com
Despite the fact that PHP 8.0.1 has been released already in January 2021 and is out for several months I highly encourage you to consider to roll back this change. This will have a huge negative effect on many projects if more hosting providers start to add PHP 8.X. Rolling back will affect only a minority. This is a modification which is not backward compatible at all and needs huge efforts for keeping projects cross compatible between 5.x/7.x and 8.0.1

On top of that, the doc has not been updated and still says that fgets() get the NEXT line while this is not true any longer: https://www.php.net/manual/en/splfileobject.fgets.php

Such a substantial change needs to be better documented. Such a change can break application nearly invisible. If it's not unit tested if input does not equals the expected output, users will not notice the failure in their applications. 
Chances are high that many affected projects even did not realized yet that their applications are not working properly anymore with >= PHP 8.0.1

If this will be rolled back, we would have time to work out a more consistent solution which covers every aspect of the SplFileObject and is either backward compatible or we at least can prepare better documentation with more samples that covers upgrading aspects to create cross compatible code.

Our team will gladly assist with that and we'd like to spent some of our resources on this.
 [2021-10-24 15:27 UTC] dharman@php.net
-Status: Feedback +Status: Not a bug
 [2021-10-24 15:27 UTC] dharman@php.net
The current behaviour looks correct to me. I have read all the comments but I don't see any bug that would need further fixing. We are not going to revert a bug fix just because it breaks existing applications that relied on the buggy behaviour. Relevant https://xkcd.com/1172/

If we consider a proper example of fgets it works correctly after the fix. I tried a few variations but I can't see any issues. https://3v4l.org/KPXcR

Considering your example, I believe the expectation is wrong. 

---------------
$file = new \SplTempFileObject();

for ($i = 0; $i < 100; $i++) {
    $file->fwrite("Foo\n");
}

// Puts the internal pointer at the start of line index 50 (0-based indexing)
$file->seek(50);

// Read the internal pointer value = 50
var_dump($file->key());
// Read the next available line (which is line at index 50)
var_dump($file->fgets());
// Read the internal pointer value = 50 (end of the line now)
var_dump($file->key());
// Read the next available line (which is line at index 51 as there's nothing more on line 50)
var_dump($file->fgets());
// Read the internal pointer value = 51 (end of the line)
var_dump($file->key());
---------------

The line pointer is incremented consistently since PHP 8. https://3v4l.org/m5Dhn
As such I am closing this bug report.
 [2021-10-25 02:34 UTC] lucasfbustamante at gmail dot com
I understand the reasoning behind the bug fix, and I accept marking this as "Not a bug".

I would like to make a plea, as a PHP user and as a community, that PHP as a language thinks about libraries, plugins and CMSs that have to support multiple PHP versions.

The way that a PHP developer writes code that needs to be compatible with a wide range of PHP versions is by avoid using language constructs that were introduced in newer versions. 

When different PHP versions returns different results given the same input, this makes it very hard or even impossible to keep compatibility with multiple PHP versions, which is often a requirement in the real world for a myriad of reasons, such as a website that uses libraries from different vendors, each with it's own range of accepted PHP versions.

You might ask "How I am going to fix bugs if I can't change the buggy behavior?"

My answer is that this is a valid question, and this should be considered case-by-case.

Talking specifically about the fgets bug fix, fgets had that behavior since PHP 5.1, released in 2005. The bugfix doesn't improve the code tremendously, it's almost a cosmetic change with dramatic effects. In that scenario, I think a warning in SplFileObject::fgets PHP manual documentation would have been more appropriate. I know we are past that point, but I'm just making a point for similar bug fixes in the future.
 [2021-10-26 09:38 UTC] cmb@php.net
> […] especially considering that this was introduced in a minor
> patch version bump, so PHP 8.0.0 will operate in a substantially
> different way than, say, PHP 8.0.1 or 8.1.

I have to admit that this is not proper semver, but at this point
in time PHP 8.0.0 is not supposed to be used by *anybody*.
However, reverting that commit now, would certainly cause issues.

And, for what it's worth, when working on this bug fix, I had
PHP-7.4 in mind, but since I noticed that I even had to adjust
existing test cases, I targeted PHP-8.0.

> I personally think it would have been best to document the
> behavior of seek + fgets returning the next line, […]

Maybe, but I'm not in favor of documenting inconsistent behavior,
and having it forever.  And some BC breaks (sometimes even hard to
work-around ones) are to be expected in major versions.

> If you use the SPLFileObject and depend on `seek()` and `key()`,
> […]

Consider to implement your own seek.  As is, ::seek() always
traverses from the beginning of the file, so the performance of
your own seek shouldn't be much worse, and depending on the use
case, could even be better.
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Tue Apr 23 17:01:31 2024 UTC