PHP Generators – Sending “Gotchas”

If you’re reading this, you’re probably already aware of just how useful PHP’s Generators are for improving performance and/or reducing memory overheads while keeping your code clean and easy to read. We can use a simple foreach() in the main body of our code, as though iterating over an array, but without the overheads of needing to actually build an array. The Generator itself handles a lot of the boilerplate that we’d otherwise have to write, which also means that we create simpler, cleaner code.

Unlike their equivalents in some programming languages, PHP’s Generators allow you to send data into the Generator itself, not simply at initialisation (the arguments that we pass to the Generator function when we call it) but also between iterations. This has its own uses and, again, allows us to move code from our main blocks and methods into the Generator itself.

Less commonly recognised is that we can combine these two features, creating Generators that can be used to provide data for our main code block and methods, while allowing us to send data to the Generator that can actually modify its behaviour dependent on circumstance.

However, there are a few “gotchas” when we combine Generators that both return and accept data in this way, and it really helps to be aware of them when we’re developing; otherwise they can lead to bugs that are hard to track down.

So if we start with a simple “incrementor” Generator, something like:


function adjustableIncrementor($value = 1, $increment = 1) {
    for($i = 0; $i <= 6; ++$i) {
        yield $value;
        $value += $increment;
    }
}

we can call it like:


foreach(adjustableIncrementor(0, 250) as $key => $value) {
    echo sprintf('%d => %s', $key, number_format($value)), PHP_EOL;
}

(overriding the default arguments with the values that we want to use in this instance) and the resulting output would appear as:

0 => 0
1 => 250
2 => 500
3 => 750
4 => 1,000
5 => 1,250
6 => 1,500

Note that the Generator is automatically generating a key for us for each iteration of the loop.
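
To make the automatic keys a little more concrete, here is a minimal sketch of roughly what foreach does with a Generator, written out by hand using the Iterator methods that every Generator provides. The helper name walkManually() is my own invention for illustration; the rewind(), valid(), key(), current() and next() calls are the standard Generator methods.

```php
<?php
// Same Generator as in the example above.
function adjustableIncrementor($value = 1, $increment = 1) {
    for ($i = 0; $i <= 6; ++$i) {
        yield $value;
        $value += $increment;
    }
}

// Hypothetical helper: walks a Generator by hand, roughly as foreach would.
function walkManually(Generator $generator): array {
    $lines = [];
    // For a fresh Generator, rewind() runs it up to its first yield.
    $generator->rewind();
    while ($generator->valid()) {
        $lines[] = sprintf('%d => %s', $generator->key(), number_format($generator->current()));
        // Resume the Generator until its next yield (or until it finishes).
        $generator->next();
    }
    return $lines;
}

$lines = walkManually(adjustableIncrementor(0, 250));
echo implode(PHP_EOL, $lines), PHP_EOL;
```

This should print the same seven lines as the foreach version; the point is simply that every step through the loop is a call to next(), and the key goes up by one for each yield that doesn’t supply its own key.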

Of course, this is a pretty simplistic use of a Generator, and we could just as easily write that code without bothering with using a Generator; but bear with me, we’ll get there soon enough.


Now taking a quick look at a Generator that will accept data, the syntax would look something like:


function bitComposer($blockSize = 8) {
    $block = 0;
    for($bit = $blockSize - 1; $bit >= 0; --$bit) {
        $block += (yield) ? pow(2, $bit) : 0;
    }
    return $block;
}

and we call it by assigning the Generator object to a variable, and calling the send() method against that instance:


$bitData = [1, 1, 0, 1, 1, 0, 1, 1, 1]; // 439 decimal

$bitComposer = bitComposer(count($bitData));
foreach($bitData as $value) {
    $bitComposer->send($value);
}
echo $bitComposer->getReturn();

which gives us the expected output of 439.

Note that this particular example is taking advantage of the newer “return value” feature that was added to Generators in PHP 7 to return the final result; but the method of sending has been available in PHP Generators since they were first introduced in version 5.5.
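
One detail of send() that is easy to overlook: as well as passing a value in, it resumes the Generator and returns whatever the Generator yields next. The sketch below uses a made-up runningTotal() Generator (not part of the examples in this post) purely to show that return value.

```php
<?php
// Hypothetical Generator: yields the running total, and receives
// the next amount to add as the result of the yield expression.
function runningTotal() {
    $total = 0;
    while (true) {
        $amount = yield $total;
        $total += $amount;
    }
}

$totals = runningTotal();
$first  = $totals->current(); // runs to the first yield
$second = $totals->send(5);   // resumes, adds 5, returns the next yielded total
$third  = $totals->send(10);  // resumes, adds 10, returns the next yielded total

var_dump($first, $second, $third); // int(0), int(5), int(15)
```

In the bitComposer example the yield has no value of its own, so send() simply returns null there and we can safely ignore it.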

OK! So this isn’t a particularly useful example, simply sending the individual bit values from the data array into the Generator, where we aggregate those bits together and then return the final result as a decimal number. We could achieve the same more easily using bindec(); but I’m using it just to demonstrate the basic syntax for sending/yielding values into a Generator.
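
For comparison, a minimal sketch of the built-in route, joining the bits into a string and handing them to PHP’s bindec() function:

```php
<?php
$bitData = [1, 1, 0, 1, 1, 0, 1, 1, 1];

// bindec() expects a string of binary digits, so join the bits first.
$decimal = bindec(implode('', $bitData));

echo $decimal, PHP_EOL; // 439
```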


However, it becomes much more interesting when we combine the two together, using the send() method to change the behaviour of the Generator while actually iterating over it.
So let’s take our original “incrementor” Generator, and dynamically change the increment value that we use for each iteration, so that it’s no longer a constant increment.


function adjustableIncrementor($value = 1, $increment = 1) {
    for($i = 0; $i <= 6; ++$i) {
        $value += (yield $value);
    }
}

and we’ll call it using:


$incrementor = adjustableIncrementor(0, 0);

foreach($incrementor as $key => $value) {
    echo sprintf('%d => %s', $key, number_format($value)), PHP_EOL;
    $incrementor->send(++$value);
}

so within each iteration of the foreach loop we modify the increment that will be used for the next value. What we expect to see here is a set of returned values following a familiar binary sequence:

0 => 0
1 => 1
2 => 3
3 => 7
4 => 15
5 => 31
6 => 63

Effectively, each returned value should be 2^key - 1.

However, let’s take a look at the actual output:

0 => 0
2 => 1
4 => 3
6 => 7

WTF ?!? That’s not what we expected! Nothing like what we expected to see. And therein lies our big “gotcha”.

So what’s actually happening here? Let’s add some additional debugging outputs inside our Generator to find out:


function adjustableIncrementor($value = 1, $increment = 1) {
    for($i = 0; $i <= 6; ++$i) {
        $increment = (yield $value);
        var_dump($increment);
        $value += $increment;
    }
}

And if we run our loop again, we see something interesting:

0 => 0
int(1)
NULL
2 => 1
int(2)
NULL
4 => 3
int(4)
NULL
6 => 7
int(8)

Every iteration of our calling loop seems to yield 2 values into the Generator, not just the value that we’re sending, but an additional null value.

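The explanation lies in how send() and foreach interact. A call to send() doesn’t just pass a value into the Generator: it resumes the Generator, lets it run on to its next yield, and returns that yielded value to the caller. When the foreach loop then moves on to its next iteration, it makes its usual implicit call to the Generator’s next() method, which resumes the Generator a second time; nothing has been sent for that resume, so the yield expression evaluates to null. Each pass through our calling loop therefore triggers two iterations of the Generator’s internal loop: the value yielded in response to send() is discarded because we ignore send()’s return value, and the automatically generated key is incremented both times.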
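
One way to see where the “missing” values go is to capture the return value of send() inside the calling loop. This is only a diagnostic sketch; the $log array is my own addition, and the Generator is the same one used above without the debugging output.

```php
<?php
function adjustableIncrementor($value = 1, $increment = 1) {
    for ($i = 0; $i <= 6; ++$i) {
        $value += (yield $value);
    }
}

$incrementor = adjustableIncrementor(0, 0);
$log = [];

foreach ($incrementor as $key => $value) {
    // send() resumes the Generator and returns whatever it yields next.
    $returned = $incrementor->send($value + 1);
    $log[] = [$key, $value, $returned];
    // At the end of this iteration foreach calls next(), which resumes
    // the Generator a second time with null as the result of the yield.
}

foreach ($log as [$key, $value, $returned]) {
    echo sprintf('%d => %d, send() returned %s', $key, $value, var_export($returned, true)), PHP_EOL;
}
```

The values returned by send() here are 1, 3, 7 and finally NULL (the Generator has finished by then): the values with the odd-numbered keys 1, 3 and 5 were handed back to send() rather than to the foreach loop.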

Whatever the reasoning behind this, we need to take control of the key and manage it ourselves when we write a Generator that works with yielding both in and out of the Generator.


function adjustableIncrementor($value = 1, $increment = 1) {
    for($key = 0; $key <= 6; ++$key) {
        $increment = (yield $key => $value);
        if (is_null($increment)) {
            --$key;
        }
        $value += $increment;
    }
}

Now, finally, we get the result that we expected:

0 => 0
1 => 1
2 => 3
3 => 7
4 => 15
5 => 31
6 => 63

We also need to make allowance for the double loop potentially affecting the value that we are calculating. In my example here, we’re using addition to update each step value, and PHP treats the extra null as 0 in arithmetic, so the additional + 0 in the Generator loop won’t affect our result ($value + 0 simply results in $value, so no damage done). But if we were using multiplication to calculate a product, then the * 0 would reset $value to 0, and every value yielded back to our main calling code after the first would be 0, which is definitely not what we want.
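
A quick check of how null behaves in the two kinds of arithmetic, as a minimal sketch:

```php
<?php
// null is treated as 0 by the arithmetic operators.
$sum     = 5 + null; // harmless: 5 + 0
$product = 5 * null; // destructive: 5 * 0

var_dump($sum, $product); // int(5), int(0)
```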

A Generator that does multiply by the value sent into it each iteration:


function adjustableIncrementor($value = 1, $increment = 1) {
    for($key = 0; $key <= 6; ++$key) {
        $increment = (yield $value);
        $value *= $increment;
    }
}

and called using:


$incrementor = adjustableIncrementor(1, 2);
foreach($incrementor as $key => $value) {
    echo sprintf('%d => %s', $key, number_format($value)), PHP_EOL;
    $incrementor->send(++$value);
}

results in:

0 => 1
2 => 0
4 => 0
6 => 0

where not only are the keys incorrect, but the values are also unexpected.

But awareness of the additional null sent into the Generator on every iteration of the calling loop allows us to rewrite it as:


function adjustableIncrementor($value = 1, $increment = 1) {
    for($key = 0; $key <= 6; ++$key) {
        $increment = (yield $key => $value);
        if (is_null($increment)) {
            --$key;
        } else {
            $value *= $increment;
        }
    }
}

and now when we run the calling code, we get the correct results returned:

0 => 1
1 => 2
2 => 6
3 => 42
4 => 1,806
5 => 3,263,442
6 => 10,650,056,950,806
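
If those numbers look arbitrary, they can be cross-checked without any Generator at all: the calling loop sends ++$value as the multiplier, so each value is the previous one multiplied by itself plus one. A minimal sketch of that check, assuming a 64-bit PHP build so that the last value still fits in an integer:

```php
<?php
// Assumes 64-bit integers; on a 32-bit build the later values would become floats.
$expected = [1];
for ($key = 1; $key <= 6; ++$key) {
    $previous = $expected[$key - 1];
    // The calling loop sends ++$value, i.e. $previous + 1, as the multiplier.
    $expected[$key] = $previous * ($previous + 1);
}

foreach ($expected as $key => $value) {
    echo sprintf('%d => %s', $key, number_format($value)), PHP_EOL;
}
```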

Of course, all this adds more complexity to our Generator code; but it does allow us to create powerful and useful Generators, maintaining the complexity there, while keeping the calling code clean and simple to use.

We can actually take advantage of the non-blocking nature of this, and only send a value into the Generator when we want to actually change its behaviour, not necessarily on every iteration of our calling loop.


function flag($size) {
    $i = $key = $step = 1;
    do {
        $ascending = (yield $key => $i);
        if ($ascending === false) {
            $i += $step;
            $step = -1;
            $key--;
        }
        $i += $step;
        $key++;
    } while($i > 0);
}

$size = 5;
$flag = flag($size);
foreach ($flag as $key => $value) {
    echo sprintf('%2d', $key), ' => ', str_repeat('*', $value), PHP_EOL;
    if ($key == $size) {
        $flag->send(false);
    }
}

will increment every iteration until we tell it to decrement, and then it will continue to decrement every iteration until the end condition of the do loop in the Generator is met; so we only need to execute the send() once. Our output from the flag Generator looks like:

 1 => *
 2 => **
 3 => ***
 4 => ****
 5 => *****
 6 => ****
 7 => ***
 8 => **
 9 => *

These over-simplistic examples don’t achieve a great deal of business value themselves; but they do demonstrate some of the power of Generators when combining “yielding from” and “sending to” together. And hopefully now, you’re forewarned of pitfalls that could otherwise be hard to debug.

4 Responses to PHP Generators – Sending “Gotchas”

  2. Many thanks really handy. Will certainly share website with my pals

  3. Bob Weinand says:

    The alternative solution if you mean to send in data is just checking the generator yourself for validity:

    while ($gen->valid()) {
        $value = $gen->current();
        # Use $value
        $gen->send(++$value);
    }

    I really wouldn’t hack my way around foreach – it just isn’t intended for the case of sending in values (and this doesn’t even work if null ever has special meaning).

    Thus, if you have to send in data, just use the primitives current() and valid(); they’re explicit and not exhibiting any hidden gotchas.
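
A runnable sketch of the loop described in this comment, applied to the simple version of the incrementor that doesn’t manage its own keys. The $seen array and the use of key() are additions for illustration only.

```php
<?php
// The simple version of the Generator: no key management, no null check.
function adjustableIncrementor($value = 1) {
    for ($i = 0; $i <= 6; ++$i) {
        $value += (yield $value);
    }
}

$gen = adjustableIncrementor(0);
$seen = [];

// Drive the Generator explicitly: send() is the only call that advances it.
while ($gen->valid()) {
    $key = $gen->key();
    $value = $gen->current();
    $seen[$key] = $value;
    echo sprintf('%d => %s', $key, number_format($value)), PHP_EOL;
    $gen->send($value + 1);
}
```

Because nothing calls next() here, the Generator is resumed once per loop, and the keys and values come out as 0 => 0 through 6 => 63 without any special handling inside the Generator.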

    • Mark Baker says:

      Using the native Generator valid(), current(), next(), etc. methods does reduce the simplicity of the foreach loop in the calling code, though, and foreach is something that most PHP devs are familiar with for working with existing arrays or SPL Iterable objects. And yes, if NULL ever has special meaning, then you do need to be explicit.

      I was more interested in why PHP introduced the implicit sent null, and it took me a while to realise that it was to ensure non-blocking of the Generator; but if anybody else does get “stung” by this behaviour, then hopefully they’ll be able to understand why from this post.
