Request #77343 Method to reply to yielded data from within a foreach loop
Submitted: 2018-12-23 21:41 UTC Modified: 2018-12-24 12:50 UTC
Votes:1
Avg. Score:3.0 ± 0.0
Reproduced:0 of 1 (0.0%)
From: stephan dot soller at helionweb dot de Assigned:
Status: Closed Package: *General Issues
PHP Version: master-Git-2018-12-23 (Git) OS:
Private report: No CVE-ID: None
 [2018-12-23 21:41 UTC] stephan dot soller at helionweb dot de
Description:
------------
I'm not sure if this is the right place to post such a feature request or if I should post an RFC in the wiki. Feel free to send me in the right direction if this isn't the place.

Right now generators are good for emitting data (`yield $data`) or for receiving data (`$generator->send($data)` on the outside and `$data = yield` in the generator). In my current project I stumbled upon a use case where I wanted to do both at the same time: fetching mails via POP3, yielding them, and deleting each mail once the outside loop has processed it successfully.

I found the `Generator::send()` method in the documentation and coded something like this:

function fetch_mails(){
    // for each mail
        $action = yield $mail;
        if ($action == "delete")
            // delete mail
}

$mails = fetch_mails();
foreach($mails as $mail) {
    if ( !process_mail($mail) )
        $mails->send("delete");
}

The problem with this code is that foreach() and send() both advance the generator, so each time send() is called one mail is skipped. At first I thought it was a bug, but after reading the docs I realized that this is by design: you either emit data to the outside or you receive data from the outside. It looks like there is nothing in place to do both at the same time. Others seem to have similar problems (https://stackoverflow.com/a/49314418, https://markbakeruk.net/2016/10/08/php-generators-sending-gotchas/).
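The skipping can be reproduced without a mail server. The sketch below is a stand-in built on assumptions: the mails are plain strings passed in as an array, and "deleting" only records the mail in a list that the generator returns. None of these details come from the real POP3 code.

```php
<?php
// Stand-in for the POP3 fetcher: "mails" are strings, deletions are only recorded.
function fetch_mails(array $mails): Generator {
    $deleted = [];
    foreach ($mails as $mail) {
        $action = yield $mail;
        if ($action === "delete") {
            $deleted[] = $mail;
        }
    }
    return $deleted;
}

$mails = fetch_mails(["a", "b", "c", "d"]);
$seen = [];
foreach ($mails as $mail) {
    $seen[] = $mail;
    if ($mail === "a") {
        // send() resumes the generator, which runs on to the next yield ("b").
        // foreach then calls next() itself, so "b" never reaches the loop body.
        $mails->send("delete");
    }
}
// $seen is ["a", "c", "d"]; $mails->getReturn() is ["a"].
```

The mail "b" is yielded as the return value of send(), which the loop ignores, and is gone by the time foreach asks for the current element again.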

A solution is to not use foreach but a while loop. Also you need to emit and receive values in tandem:

function fetch_mails(){
    // for each mail
        yield $mail;
        $action = yield;
        if ($action == "delete")
            // delete mail
}

$mails = fetch_mails();
while($mails->valid()) {
    $mail = $mails->current();
    // Advance to the bare `yield`, then answer it (null keeps the mail).
    $mails->next();
    $mails->send(process_mail($mail) ? null : "delete");
}
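To make the two-step handshake concrete, here is a self-contained sketch of the same idea. It assumes string "mails" from an array and a generator that returns the list of deleted mails; both are illustrative stand-ins. Each mail takes two resumptions: `next()` moves from `yield $mail` to the bare `yield`, and `send()` answers that bare `yield` and runs on to the next mail.

```php
<?php
function fetch_mails(array $mails): Generator {
    $deleted = [];
    foreach ($mails as $mail) {
        yield $mail;          // step 1: emit the mail
        $action = yield;      // step 2: wait for the caller's decision
        if ($action === "delete") {
            $deleted[] = $mail;
        }
    }
    return $deleted;
}

$mails = fetch_mails(["a", "b", "c"]);
$seen = [];
while ($mails->valid()) {
    $mail = $mails->current();
    $seen[] = $mail;
    $mails->next();                                 // move to the bare `yield`
    $mails->send($mail === "b" ? "delete" : null);  // answer it; runs on to the next mail
}
// $seen is ["a", "b", "c"]; $mails->getReturn() is ["b"].
```

If either call is left out or the two are swapped, `current()` starts returning the `null` of the bare `yield`, which is why this protocol is easy to break.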

But it's rather difficult to figure out what's going on here and it's easy to break. So I dropped the generator and used it with a callback instead (with all the associated pros and cons).

What I really wanted is a way to provide the return value of the `yield` expression from within a foreach loop, without advancing the generator the way `send()` does. I looked at the PHP source code and added a small `Generator::reply()` method that does the same as `Generator::send()` but does not advance the generator. The attached patch is based on the current PHP git master. With it the following code works:

function fetch_mails(){
    // for each mail
        $action = yield $mail;
        if ($action == "delete")
            // delete mail
}

$mails = fetch_mails();
foreach($mails as $mail) {
    if ( !process_mail($mail) )
        $mails->reply("delete");
}
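Without the patch, similar behaviour can be approximated in userland. The following is only a sketch and not what the attached diff does: `ReplyableGenerator` is a hypothetical wrapper class that stores the reply and passes it to `send()` when foreach advances the iterator. The array-based `fetch_mails()` is again a stand-in.

```php
<?php
// Hypothetical userland wrapper; the name and API are illustrative only.
final class ReplyableGenerator implements Iterator
{
    private $gen;
    private $reply = null;

    public function __construct(Generator $gen) { $this->gen = $gen; }

    // Remember the value; it becomes the result of the pending `yield`.
    public function reply($value): void { $this->reply = $value; }

    public function rewind(): void { $this->gen->rewind(); }
    public function valid(): bool { return $this->gen->valid(); }
    #[\ReturnTypeWillChange]
    public function current() { return $this->gen->current(); }
    #[\ReturnTypeWillChange]
    public function key() { return $this->gen->key(); }

    // foreach advances here, so this is the only place the generator is resumed.
    public function next(): void {
        $reply = $this->reply;
        $this->reply = null;
        $this->gen->send($reply);
    }

    public function getReturn() { return $this->gen->getReturn(); }
}

function fetch_mails(array $mails): Generator {
    $deleted = [];
    foreach ($mails as $mail) {
        $action = yield $mail;
        if ($action === "delete") {
            $deleted[] = $mail;
        }
    }
    return $deleted;
}

$mails = new ReplyableGenerator(fetch_mails(["a", "b", "c"]));
$seen = [];
foreach ($mails as $mail) {
    $seen[] = $mail;
    if ($mail === "b") {
        $mails->reply("delete");
    }
}
// $seen is ["a", "b", "c"]; $mails->getReturn() is ["b"].
```

The wrapper costs an extra object and forwards every iterator call, which a native `reply()` would avoid.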

I usually wouldn't have taken the time to post this. But to me this seemed like a useful pattern. You can use generators to encapsulate complex flow control. But sometimes you have to steer the generator from the outside. My current use case is a rather simple one (just delete the mail or not) but this could be a useful feature for complex iterations (e.g. of graphs, trees, file formats, network protocols). Similar to the C function nftw() (new file tree walk) where you can skip siblings or subtrees.

There are also details I'm not sure about:

a) The name. `reply()` simply was the first idea that came to mind.
b) `reply()` simply sets the return value of the current `yield` expression in the generator, so it can be called multiple times during the same iteration, and each call overwrites the previously set value. This keeps the code simple but might give users the wrong idea: some may start to think that the generator gets advanced when `reply()` is called twice or more. Maybe it would be best to throw an exception when `reply()` is called more than once per iteration, but that would require resetting the `send_target` slot to a known value (e.g. UNDEF) when advancing the generator.
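The stricter variant from point b) can be sketched in userland as well. This is an assumption-based illustration, not the behaviour of the patch: `StrictReplyable` is a hypothetical wrapper, and a boolean flag plays the role that resetting `send_target` would play internally.

```php
<?php
// Hypothetical wrapper: at most one reply() per iteration.
final class StrictReplyable implements IteratorAggregate
{
    private $gen;
    private $reply = null;
    private $replied = false;

    public function __construct(Generator $gen) { $this->gen = $gen; }

    public function reply($value): void {
        if ($this->replied) {
            throw new LogicException("reply() already called for this iteration");
        }
        $this->reply = $value;
        $this->replied = true;
    }

    public function getIterator(): Traversable {
        while ($this->gen->valid()) {
            yield $this->gen->key() => $this->gen->current();
            // Reset the per-iteration state before resuming the inner generator.
            $reply = $this->reply;
            $this->reply = null;
            $this->replied = false;
            $this->gen->send($reply);
        }
    }
}

// Demo generator: collects whatever it is sent and returns it.
function letters(): Generator {
    $received = [];
    foreach (["a", "b"] as $letter) {
        $received[] = yield $letter;
    }
    return $received;
}

$inner = letters();
$items = new StrictReplyable($inner);
$errors = 0;
foreach ($items as $item) {
    $items->reply("first");
    try {
        $items->reply("second");   // rejected: second reply in the same iteration
    } catch (LogicException $e) {
        $errors++;
    }
}
// $errors is 2; $inner->getReturn() is ["first", "first"].
```

Throwing makes the misuse visible immediately, at the price of an extra flag that has to be cleared on every advance.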

I can write up an RFC in the wiki if this is the way it's done. But I think someone else should look at it and decide if it's useful or not. If deemed useful I can also implement it properly (I think), but I still have to read up on PHP's internal memory management.


Patches

generator_reply.diff (last revision 2018-12-23 21:44 UTC by stephan dot soller at helionweb dot de)

 [2018-12-23 22:56 UTC] requinix@php.net
You should post about this on the internals list: many more people will see it that way, and it creates a more suitable place to talk about it than here on this simple bug tracker.
 [2018-12-24 12:50 UTC] stephan dot soller at helionweb dot de
-Status: Open +Status: Closed
 [2018-12-24 12:50 UTC] stephan dot soller at helionweb dot de
Thanks for the quick answer, will do. I closed the request so it's off the list.
Last updated: Sun May 26 03:01:26 2019 UTC