|
php.net | support | documentation | report a bug | advanced search | search howto | statistics | random bug | login |
[2004-09-06 05:54 UTC] fletch at pobox dot com
Description:
------------
foreach with a reference seems to corrupt the last element in an array
Reproduce code:
---------------
<?php
$array = array(1,2,3);
foreach( $array as &$item ) { }
print_r( $array );
foreach( $array as $item ) { }
print_r( $array );
?>
Expected result:
----------------
Array
(
[0] => 1
[1] => 2
[2] => 3
)
Array
(
[0] => 1
[1] => 2
[2] => 3
)
Actual result:
--------------
Array
(
[0] => 1
[1] => 2
[2] => 3
)
Array
(
[0] => 1
[1] => 2
[2] => 2
)
PatchesPull RequestsHistoryAllCommentsChangesGit/SVN commits
|
|||||||||||||||||||||||||||||||||||||
Copyright © 2001-2025 The PHP GroupAll rights reserved. |
Last updated: Mon Oct 27 22:00:01 2025 UTC |
I believe the problem I'm gonna describe here has the same roots as the one fletch has described ... Try this: <? class ArrayWrapper { protected $slots; public function __construct($slots = Array()) { $this->slots =& $slots; } public function & getSlot($idx) { if(IsSet($this->slots[$idx])) return $this->slots[$idx]; return null; } public function getLength() { return Count($this->slots); } } // fill the array and create object { $slots = Array( 'zero', 'one', 'two', 'three', 'four' ); $aw = new ArrayWrapper($slots); // } // output A var_dump($aw); // iteration 1 for($idx = 0; $idx < $aw->getLength(); $idx++) $aw->getSlot($idx); // output B; everything is OK var_dump($aw); // iteration 2 for($idx = 0; $idx < $aw->getLength(); $idx++) if($aw->getSlot($idx)) { } // output C; elements have been changed to references var_dump($aw); ?> As you can see in output "C" the second iteration altered elements of the array - the elements have been turned into references. (Or did I get the ampersand-sign wrong?) The problem is that I loose control over the REAL objects and the elements get changed unreasonably later in the script. I kind of understand what's the diference between getSlot() and getSlot() used within an if-loop (I guess the scope created by if-loop makes the difference), but I'm not sure this difference entitles PHP to alter my array. Can anyone explain this to me? Is it a bug or a regular behaviour?There is no corruption. It is easy to explain this behaviour. Take the following: $arr = array(1 => array(1, 2), 2 => array(1, 2), 3 => array(1, 2)); foreach($arr as &$value) { } foreach(array(1,2,3,4,5) as $key => $value) { } echo $test[3]; After the first foreach() loop what you have in $value is a reference to the last element in $arr (here: to array(1,2)). Now, when the next foreach loop assigns a value to $value, it assigns this value to where $value points, that is: to the last position of $arr. Now in the second foreach() the last item of $arr becomes first 1, then 2, etc and in the end, what you get as output by this program, is: 5 Not the expected: Array Now this is indeed very confusing if you don't know what's going on. Solution: unset $v before a new foreach() loop (maybe this could be done by php by default?)To elaborate of why I strongly feel this is a bug and not a 'feature': $a = array('a', 'b', 'c', 'd'); foreach ($a as &$v) { } var_dump($a); One would expect that every element of $a is a string. Well it was, up until I did that foreach-with-reference. That changed the last element into string reference: array(4) { [0]=> string(1) "a" [1]=> string(1) "b" [2]=> string(1) "c" [3]=> &string(1) "d" } The PHP guys can claim that this is correct behavior all they want, but it is fundamentally wrong from a design perspective that an array changes, without doing anything to its elements.daniel, unsetting it before might also cause a wrong result. Simple example: <?php function search_element($array, &$var) { foreach ($array as &$var) { if ($condition) { return true; } } } $array=array(1,2,3,4,5,6); $var = null; search_element($array, $var); ?> This is simplified and probably no good architecture but such designs might make sense in some situations and breaking that adds a larger inconsistency than the current "surprising but consistent" behavior. martijn, Of course there is a reference at the end, absolutely expected and consistent with the language.No matter how you paint this, it isn't a bug. What you are really asking for is some sort of block-level scoping. As such, it should be proposed as a feature request, but it would be a major change in the language. The simplest way to illustrate that is to just unroll the loops. Take this example: $a = array(1,2); foreach($a as &$b) { } foreach($a as $b) { } print_r($a); The problem here is that people find it inconsistent that the 2nd loop changes $a. But if we unroll the loops and write the exactly equivalent code without the foreach construct we get: $a = array(1,2); // First loop, $b is a reference to each element in $a $b = &$a[0]; $b = &$a[1]; // Second loop, $b is assigned the value of each element in $a $b = $a[0]; $b = $a[1]; Those two pieces of code are identical in every way. The thing that confuses people is that $b is still a reference $a[1] going into the second loop. Since PHP doesn't have block-level scoping, there is nothing in the language that would permit $b to be unset between the two loops without introducing a major inconsistency. In fact there is plenty of code that relies on this fact which would break if we made such an arbitrary change. I suppose what you are asking for is syntax along the lines of: $a = array(1,2); { local $b = &$a[0]; $b = &$a[1]; } { local $b = $a[0]; $b = $a[1]; } Where $b is locally scoped in each of those blocks and it might look like this in a foreach case: $a = array(1,2); foreach($a as local &$b) { } foreach($a as local $b) { } Without such a scoping syntax change, something as simple as: forach(array(1,2,3) as $b) { } echo $b; where the code fully expects $b to be 3 would break.What do you mean you con't care about the explanation? Ok, simple question then. Do you expect this to output 3? foreach(array(1,2,3) as $b) { } echo $b; If you do, then you don't want us to fix this "bug" because fixing it would mean $b is not 3 here.Rasmus, Thanks for looking at this. I found the problem. I would still call it a bug, but I will describe it and you can decide. You are the man after all. You were right, I was passing a variable by reference in the few lines of code in front of my example above: foreach($this->data['external_'.$type] as &$item){ if(!empty($item['package'])){ $packages[] = $item; $library_names[] = $item['library_name']; unset($item); } } /* Code in example above goes here */ BUT, where I see this as a bug was: $packages (the array that was getting changed) was a new array created from the data of each $item. $packages was never being referenced, though the variable $item it was created from was. So, it should be a copy of the data and not THE data right? I fixed it by simply not trying to pass by reference and changing unset($item) to unset($this->data['external_'.$type]). Looking at it, that was the way to do it from the beginning. I see that, but I'm not sure why $packages gets changed down the road (it was correct immediately after this, but it and all copies of it change inside the next foreach). Any thoughts?rasmus@php.net asked "Ok, simple question then. Do you expect this to output 3?" foreach(array(1,2,3) as $b) { } echo $b; I would much prefer it not to output 3. Personally I think it would make a lot more sense and be a lot safer to have the array element references scoped to the foreach block - so effectively being unset after the block has run. Having the last element of the array floating around outside of the block is very dangerous in my view and can lead to silent errors. As someone else mentioned, I hate to think how much incorrect data there is out there because of the last array element being accidentally changed outside of the block. derick@php.net rather flippantly said: "no, we can't unset it by default, as people might use this for some weird reason." I can think of plenty of non-weird reasons why people might want this behaviour. But if it was unset by default, it's a simple matter to assign the reference to a variable defined outside of the block thereby making it available outside the foreach. In other words, like this: $c = NULL; foreach(array(1,2,3) as $b) { $c = $b; } unset($b);// simulates block-scoping of $b echo $c; This is not a bug, but I believe it's dangerous behaviour of PHP as it would seem quite logical to assume that the element references are scoped to the foreach block only - witness the many comments in this thread to that effect. So my vote would be to change this behaviour to block-scoping in a future version.I appreciate the explanation that Rasmus provides -- thank you! One small but troublesome detail: The first foreach changes the array by making $a[1] a reference variable while $a[0] remains a normal variable. $a = array(1,2); foreach($a as &$e){} var_dump($a,$e); // $a[1] == &int 2 $e == 2 foreach($a as $e){} $a[1] == &int 1 $e == 1 var_dump($a,$e); // $a[1] now points to last value of $a which is $a[0] How about adding a switch so that users who don't want or understand this behavior can turn it off? Then it would be up in front of the documentation and would be less liable to be overlooked by users who fail to scroll down to the colored box. Even if PHP were to have lexical scope (how hard would that be to implement and why can't PHP evolve that way?), that doesn't change the fact that the first loop doing seemingly nothing, does change the array.I understand this functionality, and I do agree that it is not a bug. It seems at the core of PHP that this is what would happen, but it does seem very unintuitive to me having used a variety of other languages. The result is not expected and has caused several very hard to find bugs for me. Would it be possible to have PHP generate a E_NOTICE when using the same $var in both a foreach and afterwards when in a higher scope? EG: foreach($args as &$a){} $a = 'hello'; // this would generate an E_NOTICE Then maybe have the option to turn off (or on by default) the E_NOTICE warnings in the ini settings?I don't think this is going to go anywhere - seems to have reached a stalemate. So I have just retrained my head to automatically create foreach loops thus: foreach($array as $item){ }unset($item); If you need access to the last $item outside the loop, then just do it somewhere before the unset($item). Seems to me this thread is being accessed periodically by developers scratching their heads after discovering similar oddities happening with their foreach loops. My advice would be to do something similar to the above and just live with it.I still say this is a bug, so here's all the code you need to re-produce it: <pre> <?php $packages = array(); $library_names = array(); $ext_js = array( array( 'name' => 'myname1', 'attrib' => 'something1', 'package' => true, 'library_name' => 'package1' ), array( 'name' => 'myname2', 'attrib' => 'something2', 'package' => true, 'library_name' => 'package1' ), array( 'name' => 'myname3', 'attrib' => 'something3', 'package' => false ), array( 'name' => 'myname4', 'attrib' => 'something4', 'package' => false ) ); foreach($ext_js as &$item){ if(!empty($item['package'])){ $packages[] = $item; // Not using &, so should be a copy, not a reference corrent? $library_names[] = $item['library_name']; unset($item); } } if(!empty($packages)){ /*A*/ print_r($ext_js); foreach($packages as $item){ /*B*/ print_r($ext_js); } } ?> </pre> Look at the output on the last item. Instead of unset removing the item from the array $ext_js (which is what I thought it would do), it corrupts the array. The array is fine though unless you go in to another foreach using another $item. Changing the variable name on the second foreach to something OTHER than $item (I used $itemm) fixes it. Bug.I understand that my explanation above isn't 100% accurate (unset really doesn't have anything to do with it), but that doesn't change that the expected behavior is not working. Rasmus said: "Ok, simple question then. Do you expect this to output 3? foreach(array(1,2,3) as $b) { } echo $b; If you do, then you don't want us to fix this "bug" because fixing it would mean $b is not 3 here." I say in the example above, would you expect a print_r of the same array to return 1,2,2? My issue was that the content of the entire array. Honestly I've been programming since I was eight years old. That was 1985. If this was confusing the hell out of me then something's wrong. Here's an even simpler example: <pre> <?php $clean = array(1,2,3,4); foreach($clean as &$item){ // Nothing } echo "A:\n"; /*A*/ print_r($clean); echo "B:\n"; foreach($clean as $item){ /*B*/ print_r($clean); } $clean = array(1,2,3,4); foreach($clean as &$item){ // Nothing } echo "C:\n"; /*C*/ print_r($clean); echo "D:\n"; foreach($clean as $not_item){ /*C*/ print_r($clean); } ?> </pre> A and B output the same array differently with no modification (not expected). C and D are the same (expected). The only change was not re-using the name $item. How can we make it so that using &$item in one foreach and then using the same variable name ($item) in a different foreach does not change the original array?OK, I went over this some more. <pre> <?php // Fresh array $clean = array(1,2,3,4); foreach($clean as &$item){ // Nothing is modified in the array, but $item now exists } /*############################################################################## * $item persists outside of foreach and is now $clean[3] * See the warning on http://php.net/manual/en/control-structures.foreach.php * print_r($item); // would return 4 you you uncommented this. * unset($item); // This would remove the pointer to $clean[3]. Expected. *############################################################################*/ echo "A:\n"; /*A*/ print_r($clean); // $clean is still unmodified echo "B:\n"; foreach($clean as $item){ /*############################################################################## * Using AS $item SETS $item TO the current $item value (a.k.a. $clean[0], etc.) * Essentially foreach($clean as $item) is short hand for something like: * $x=0;while($x < count($clean)){$item=$clean[$x]; ### your code ### $x++;} * The problem I had was that I did not expect foreach to be able to set on call *############################################################################*/ /*B*/ print_r($clean); } ?> </pre> So creating the variable is documented, and it isn't a bug. The ability to set the value could be made clearer though.I have done this intentionally in some very lazy prototypes. foreach($data as $k => &v) { if (meets_complex_precondition($k)) { break; } } // do some stuff with $v potentially modifying it later unset($v); Comes in handy on occasion when working recursively by reference in deep arrays. Neat trick like 'array_map(null, $arr1, $arr2)'.I have the same problem with php 5.6 on OpenServer. Before I never meet it; I don't use reference. But foreach do the same with my array. So: foreach ( $items as $item ) { //..echo single $item } Last element of $items is disappeared;