Bug #69725 foreach do not iterate all elements of an array in recursive function
Submitted: 2015-05-29 06:36 UTC
Modified: 2015-05-30 04:25 UTC
From: dp dot maxime at gmail dot com
Assigned:
Status: Not a bug
Package: Scripting Engine problem
PHP Version: 5.6.9
OS: Ubuntu 15.04
Private report: No
CVE-ID: None
 [2015-05-29 06:36 UTC] dp dot maxime at gmail dot com
Description:
------------
The foreach construct does not iterate over all elements of an array in recursive functions.

I am using PHP 5.6.9 from ppa:ondrej/php5-5.6 on Ubuntu 15.04.
The version 5.6.4 from the standard Ubuntu 15.04 distribution is also affected.

Test script:
---------------
<?php
function R(&$X, $level) {
    print "$level X: " . json_encode($X) . "\n";
    $A = $X;
    foreach($A as &$value) {
        $value = $value - 1;
    }
    print "$level A: " . json_encode($A) . "\n";
    $B = $X;
    foreach($B as &$value) {
        $value = $value + 1;
    }
    print "$level B: " . json_encode($B) . "\n";
  
    if ($level < 3) {
        R($B, $level + 1);
    }
}
$X = array(1, 2, 3);
R($X, 0);


Expected result:
----------------
0 X: [1,2,3]
0 A: [0,1,2]
0 B: [2,3,4]
1 X: [2,3,4]
1 A: [1,2,3]
1 B: [3,4,5]
2 X: [3,4,5]
2 A: [2,3,4]
2 B: [4,5,6]
3 X: [4,5,6]
3 A: [3,4,5]
3 B: [5,6,7]


Actual result:
--------------
0 X: [1,2,3]
0 A: [0,1,2]
0 B: [2,3,4]
1 X: [2,3,4]
1 A: [1,2,3]
1 B: [3,4,4]
2 X: [3,4,4]
2 A: [2,3,3]
2 B: [4,5,4]
3 X: [4,5,4]
3 A: [3,4,3]
3 B: [5,6,4]


 [2015-05-29 07:01 UTC] requinix@php.net
-Status: Open +Status: Not a bug
 [2015-05-29 07:01 UTC] requinix@php.net
References. The explanation for why you get exactly those results is a bit complicated, I'll explain if you need it, but the short version is that $value is a reference to the last element in $A so when you set $value=$value+1 in the second foreach you'll also update $A which in turn also affects $X and $B in the recursive calls.

unset($value) after the first loop to destroy the variable and the reference, then do it after the second loop too so you learn to do that out of habit.
 [2015-05-29 14:19 UTC] dp dot maxime at gmail dot com
Sorry I don't think your assessment is correct.

Take a look at level 0, i.e. the first call of the recursive function: everything works as expected there, which does not match your explanation. The problem arises only when you go deeper into the recursion.

If I replace &$value with &$val in the second foreach, the problem persists. I do confirm that unsetting the variables used in the foreach constructs gives the expected output, but that is a workaround rather than a fix.

Perhaps you missed that both $A and $B are created from $X with the plain assignment operator, not with assignment by reference. According to the PHP manual (http://php.net/manual/en/language.operators.assignment.php):

"Note that the assignment copies the original variable to the new one (assignment by value), so changes to one will not affect the other."

Regarding array assignment the manual states it explicitly (http://php.net/manual/en/language.types.array.php):
"Array assignment always involves value copying."

The manual does not specify that this behaviour changes in recursive functions.

Thus changes to $A must not affect $B and $X.
 [2015-05-30 04:25 UTC] requinix@php.net
Alright, the longer explanation. Walkthrough, if you will.

But first, a correction. When I read through the code I saw two problems:
1. A foreach loop using references, and a second foreach loop using the same "value" variable as the first
2. An array containing references being passed to a function which then updates the array and/or copies of the array
Both of those actually stem from the fact that the "value" variable ($value) remains a reference after each loop ends, which is why what I said before still applies: unset($value) after each loop to fix the problem.

Either of those alone can result in unusual behavior. However what didn't quite click in my head was that the second loop used references too.
That means problem #1 isn't actually happening. Well, not that particular way. Normally the $value in the second loop would update values in the first loop, however (a) $A isn't being used after that second foreach ends so you wouldn't actually see a problem with it, but more importantly (b) when the second loop uses a reference for $value the previous reference relationship is discarded, and modifying $value will not update values in $A.

However problem #2 is still there, so that's the real cause of the unexpected output. As such the minimum fix is unset($value) after B's loop, but you should still do it after A's to form the habit.
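To make the fix concrete, here is a sketch of the test script with unset($value) after both loops. The $log parameter is an addition for illustration only (it collects the lines so they can be inspected as well as printed); everything else follows the original script.

```php
<?php
// Same recursion as the test script, plus unset($value) after each
// by-reference loop. $log is a hypothetical helper parameter, not part
// of the original report.
function R(&$X, $level, &$log) {
    $log[] = "$level X: " . json_encode($X);
    $A = $X;
    foreach ($A as &$value) {
        $value = $value - 1;
    }
    unset($value); // habit: $A's last element is no longer shared with $value
    $log[] = "$level A: " . json_encode($A);
    $B = $X;
    foreach ($B as &$value) {
        $value = $value + 1;
    }
    unset($value); // the one that matters here: $B's last element is no longer shared
    $log[] = "$level B: " . json_encode($B);

    if ($level < 3) {
        R($B, $level + 1, $log);
    }
}

$X = array(1, 2, 3);
$log = array();
R($X, 0, $log);
print implode("\n", $log) . "\n";
```

With both unset() calls in place the output should match the "Expected result" section of the report, ending in "3 B: [5,6,7]".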


Three important points about references to keep in mind:
- They deal more with variables (or array items) than with values. Both (or more) variables refer to the same data, so altering the data will affect both variables. Like how classes work as of PHP 5.
- Normal assignment of arrays does not literally copy values but copies "variables". For references, what you get a copy of is the reference, not of the referenced value at the time of the assignment. See http://3v4l.org/5TqMB for a demonstration. If the documentation seems to contradict that, please feel free to submit a patch to it describing that behavior in a way that makes sense to you. (Start with the Edit link in the top-right corner.)
- When you assign by-ref you make both parts involved be references. In that demonstration, though it makes $ref be the reference, $array1[2] is also changed into a reference. This is because references are bi-directional: $ref=&$array1[2] but also $array1[2]=&$ref. Unsetting one of those, or setting either to be a reference of something else, will destroy the other half of the reference. (If there are more than two places the reference is used, all but one must be unset for that to happen.)
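A small sketch of those three points, independent of the linked demonstration (whose exact contents are not reproduced here). The variable names $array3, $array4 and $tmp are made up for this example.

```php
<?php
$array1 = array(1, 2, 3);
$ref = &$array1[2];      // $ref and $array1[2] now refer to the same value

$array2 = $array1;       // copies the array; slot 2 is copied as a reference
$array2[2] = 99;         // so this write goes through the shared reference

var_dump($array1[2]);    // int(99)
var_dump($ref);          // int(99)

unset($ref);             // three users drop to two: the slots are still linked
$array2[2] = 100;
var_dump($array1[2]);    // int(100)

// When only one user of the reference is left, the slot is copied
// like an ordinary value again.
$array3 = array(1, 2, 3);
$tmp = &$array3[2];
unset($tmp);
$array4 = $array3;
$array4[2] = 99;
var_dump($array3[2]);    // int(3)
```

The first half shows why "array assignment copies values" is not the whole story once an element has become a reference; the second half shows why unset() restores the behavior the manual describes.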


Now to the code. I'm using & to indicate that a value is actually a reference.

R's $X is a reference that starts with [1,2,3]. $X is never modified directly, nor is the $X outside the function. That means passing it as a reference won't actually make a difference; if you had assigned $A or $B by-ref then you would be indirectly modifying $X too and using a reference would matter.
$A gets a copy of $X, nothing surprising there. Within the first foreach you decrement each value by 1 using a reference. When the loop ends, $value will continue to be a reference to the last element in the array. $A=[0,1,&2].
$B gets a copy of $X, and like before there's nothing surprising. The second foreach will begin setting $value=$B[0], $value=$B[1], and $value=$B[2], but $value will be a brand-new reference and so there will be no impact on $A. (Without the reference it would update $A[2] each time.) Once again, afterwards $value will be a reference to the last element in the array. $B=[2,3,&4]. Note that $A[2] stopped being a reference when this loop started because $value, the "other half", changed to be a reference to values in $B, thus "destroying" the reference in $A[2].
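The difference between a by-value and a by-reference second loop can be checked in isolation. This is only a sketch: the two function names are hypothetical, and the arrays are simplified stand-ins for $A and $B at level 0.

```php
<?php
// Second loop iterates BY VALUE: each iteration assigns to $value,
// and $value still aliases $A[2] from the first loop.
function secondLoopByValue() {
    $A = array(1, 2, 3);
    foreach ($A as &$value) {
        $value = $value - 1;      // $A becomes [0,1,2]
    }
    $B = array(1, 2, 3);
    foreach ($B as $value) {
        // nothing to do: the assignment to $value is the side effect
    }
    return json_encode($A);
}

// Second loop iterates BY REFERENCE: $value is rebound to $B's elements,
// so the old link to $A[2] is dropped before anything is written.
function secondLoopByReference() {
    $A = array(1, 2, 3);
    foreach ($A as &$value) {
        $value = $value - 1;      // $A becomes [0,1,2]
    }
    $B = array(1, 2, 3);
    foreach ($B as &$value) {
        $value = $value + 1;
    }
    return json_encode($A);
}

print secondLoopByValue() . "\n";      // [0,1,3]
print secondLoopByReference() . "\n";  // [0,1,2]
```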

Then the recursive call. The new $X is a reference to $B, however (1) $X is still not being modified and (2) you don't check $B after the call. If you assigned $A or $B by-ref, thus creating a sort of "series of references" from the first $X to the current $A/B, and did another print after the call to R then you'd see $B had changed. So again, the reference won't make a difference here.

However there is something that does make a difference: $A and $B will be copies of $X, and $X was a copy of the previous $B, and $B[2] is still a reference. That means the current $X[2] and $A[2] and $B[2] will also be references - the same one, in fact. Modifying $A[2] will affect $B[2] and $X[2], and so on.
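That leak can be reproduced without the full recursion. In this sketch decrementCopy() is a hypothetical stand-in for one level of R(): it copies the array it is given and modifies only the copy, yet the caller's array changes as long as the caller's loop variable is still alive.

```php
<?php
function decrementCopy(&$X) {
    $A = $X;                      // copy of the array, reference slots included
    foreach ($A as &$item) {
        $item = $item - 1;
    }
    unset($item);
    return json_encode($A);
}

$B = array(2, 3, 4);
foreach ($B as &$value) {
    $value = $value + 0;          // no-op write; iterating by reference is what matters
}
// $value is deliberately not unset, so $B[2] is still a shared reference.

print decrementCopy($B) . "\n";   // [1,2,3]
$leaked = json_encode($B);
print $leaked . "\n";             // [2,3,3]  <- last element changed through the copy

unset($value);                    // $B[2] is no longer shared
print decrementCopy($B) . "\n";   // [1,2,2]
$clean = json_encode($B);
print $clean . "\n";              // [2,3,3]  <- unchanged this time
```

Only the last element leaks, because it is the only one that $value still refers to when the copy is made.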

Now the first loop. Beforehand, $A=[2,3,&4]. The loop will act as it should and the result is $A=[1,2,&3], with $value again a reference to $A[2].

The second loop is where things get interesting. Beforehand, $B would have been [2,3,&4] just like $A was, except since $A[2] and $X[2] are the same reference, when you updated $A[2]=3 in the previous loop you also updated $X[2]=3. Thus you start with $B=[2,3,&3]. The loop does what you expect and you get $B=[3,4,&4], however $B[2] is the same reference as $X[2] and $A[2] so now $A=[1,2,&4].

Then the third call. $X=[3,4,&4]. $A starts at [3,4,&4] and gets updated to [2,3,&3] and $X[2]=3. Then $B starts at [3,4,&3], gets updated to [4,5,&4], and $X[2]=4.
Then the fourth call. $X=[4,5,&4], $A goes from [4,5,&4] to [3,4,&3] with $X[2]=3, and $B goes from [4,5,&3] to [5,6,&4] with $X[2]=$A[2]=4.

Code with more variable dumps (lots of output to sift through): http://3v4l.org/BPPTD
The behavior is more noticeable with a wider spread of values: http://3v4l.org/oQ3Cu

(Note that recent PHP 7 seems to have a bug with checking $A and $B from within their respective loops. I'm going to look into that.)

Complicated, huh? Anything I can clarify?
 