Bug #76863 Creating array reference grants direct access to the property
Submitted: 2018-09-11 19:14 UTC
Modified: 2018-09-11 22:43 UTC
From: tshumbeo at mailhouse dot biz
Assigned:
Status: Not a bug
Package: Scripting Engine problem
PHP Version: 7.2.9
OS: Linux and Windows
Private report: No
CVE-ID: None
 [2018-09-11 19:14 UTC] tshumbeo at mailhouse dot biz
Description:
------------
Tested on PHP 5.6.37 and 7.2.9, Windows and Linux, various configurations.

See the test script. When (1) is present (assignment by reference) and (2) is missing (no unsetting), it appears that K->$a[0] itself becomes a reference (not $unusedVariable, which would be expected). When K->a() - a typical read accessor method - returns the array, modifying the array's member affects the K->$a.

If (2) is present, then the $a[0] reference is removed and it is no longer possible to change K->$a by changing the members of the array returned by K->a().

(1) | (2)   | Result
----|-------|-------
 &  | unset | no bug
    | unset | no bug
    |       | no bug
 &  |       | bug!

A practical consequence: I am unable to create a reference to the internal array's members (for use in other places) without opening doors to direct modification of the property which can happen inadvertently (this is how I stumbled upon it):

  function clean(array $ar) {
    foreach ($ar as &$ref) $ref = ...;
    return $ar;
  }

  clean($k->a());

Do note that clean() doesn't accept $ar by reference; it simply uses it as a copy for returning in the result. However, due to the bug it in fact directly modifies $k's internal $a array!


Test script:
---------------
class K {
  private $a = ['a'];

  function a() {
    return $this->a;
  }

  function f($f) {
    // (1) 
    $unusedVariable = &$this->a[0];
    // (2)
    //unset($unusedVariable);
    $f();
  }
}

$k = new K;

$k->f(function () use ($k) { 
  foreach ($k->a() as &$ref) { $ref = 123; }
});

// Expected result: string 'a'.
// Actual result: int 123.
var_dump($k->a());

Expected result:
----------------
Creating a reference *to* an array member (1) should *not* turn that member into a reference (to itself?).

Actual result:
--------------
Creating a reference turns that member into a sticky reference that isn't removed by array cloning functions like array_values(), array_merge(), unset(), etc. and can be removed only by unsetting (2) the referencing variable (even if it's out of scope the reference remains).
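
A minimal sketch of the behaviour described above, reduced to plain arrays without the class. The variable names are illustrative and not part of the report; the sketch assumes the PHP 7.x semantics the report was filed against.

```php
<?php
// Hypothetical reduction of the test script; names are not from the report.
$original = ['a'];
$alias = &$original[0];   // (1) slot 0 is now shared between $original and $alias

$copy = $original;        // the array is copied by value...
$copy[0] = 123;           // ...but slot 0 of the copy is still the shared slot

var_dump($original[0]);   // int(123)
var_dump($alias);         // int(123)
```

The array itself is copied, but the shared slot is carried into the copy as long as a second name such as $alias is still bound to it.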

History

 [2018-09-11 19:21 UTC] spam2 at rhsoft dot net
> Creating a reference *to* an array member (1) should *not* 
> turn that member into a reference

that assumption is simply wrong

the golden rule in PHP is not to use references at all until you really know what you are doing, are aware of all side-effects and are really sure that, given the copy-on-write nature of PHP, you gain anything

below is a sample of the side-effects of references which i show everyone who thinks references are a good thing - don't confuse PHP references with C pointers

----
BAD:

<?php
 $array1 = [1, 2, 3];
 $array2 = [5, 6, 7];
 foreach($array1 as $key=>&$item)
 {
 }
 foreach($array2 as $key=>$item)
 {
  $item = "MY GOD $key";
 }
 print_r($array1);
 print_r($array2);
 $item = 'GO AWAY';
 print_r($array1);
?>
-------
CLEAN BUT USELESS:

<?php
 $array1 = [1, 2, 3];
 $array2 = [5, 6, 7];
 foreach($array1 as $key=>&$item)
 {
 }
 unset($item); /** FIX */
 foreach($array2 as $key=>$item)
 {
  $item = "MY GOD $key";
 }
 print_r($array1);
 print_r($array2);
 $item = 'GO AWAY';
 print_r($array1);
?>
 [2018-09-11 19:53 UTC] tshumbeo at mailhouse dot biz
>  foreach($array1 as $key=>&$item)
>  foreach($array2 as $key=>$item)
This is solved simply by sticking to a naming convention for all reference variables. Example: always name them $ref and you will never have problems with references from different scopes. 

Anyway, I am not new to PHP or references and my case is very practical, providing a simple solution to a certain problem. I'd like us to focus on it rather than dismissing references as a useless language construct (does it mean bugs don't have to be fixed?).

  class ArgWrapper {
    private $args = ['a', 'b', 'c'];
  
    function args() {
      return $this->args;
    }

    function call($func) {
      // Obviously this would be a for loop.
      $args = [&$this->args[0], &$this->args[1], &$this->args[2]];
      call_user_func_array($func, $args);
    }
  }

The above class makes it possible to read arguments (e.g. for some event/filter system) via accessor method(s) - args(). And for convenience it provides access to arguments for its event handlers as function arguments which can be taken by reference, similar to array_walk(), etc.

  (new ArgWrapper)
    ->call(function (&$ref) {
      $ref = 123;
    });
  
How to achieve the same without using references and without using special methods to get/set individual arguments (which is possible but not as convenient)? Architectural implications here are not the point, only the practical case.
 [2018-09-11 21:06 UTC] spam2 at rhsoft dot net
> This is solved simply by sticking to a naming convention for all reference variables

no, this is solved by not using references at all except for "multi return values of functions", because most of the time they are not doing what you think they are doing and you gain nothing but a headache
 [2018-09-11 22:42 UTC] a at b dot c dot de
"it appears that K->$a[0] itself becomes a reference (not $unusedVariable, which would be expected)."

References are completely symmetric:

$b = 'foo'; $a = &$b;

and 

$a = 'foo'; $b =&$a;

have exactly the same result: Two variables, $a and $b, bound to the same content. It doesn't matter which one was made first, and there isn't any limit on how many references can be made to the same content. Alterations to the content made by assigning to any one of those references is always reflected when accessing it via any of the others.
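
The symmetry can be checked with a short sketch. The variables $c, $d and $e are added here for illustration and do not appear in the comment above.

```php
<?php
// Order 1: $b holds the value first, then $a is bound to it.
$b = 'foo';
$a = &$b;
$a = 'set via a';      // writing through $a changes $b as well

// Order 2: the other way round, with illustrative names $c and $d.
$c = 'foo';
$d = &$c;
$c = 'set via c';      // writing through $c changes $d as well

// A third name bound to the same content behaves identically.
$e = &$d;
$e = 'set via e';      // $c, $d and $e all see this value

var_dump($a === $b);               // bool(true)
var_dump($c === $d && $d === $e);  // bool(true)
```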

If you make a reference to (a reference to a reference to...) a private property of an object, then yes: modifying its value will change the value of the private property. And vice versa. That's what references do. And if you don't unset them they will keep doing it.

It's like hardlinks in a filesystem. Every file has at least one hardlink because that's how the file is accessed; but a file can have multiple hardlinks. Alterations to the file via one hardlink will be reflected when accessed via any of the others.

Unsetting references is important; the manual spends a full page making that point.
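
A small sketch of what unsetting changes, using plain arrays rather than the class from the report. The names are illustrative, and the outcome matches the "& | unset | no bug" row of the reporter's table.

```php
<?php
// Hypothetical names; not taken from the report.
$original = ['a'];
$alias = &$original[0];   // slot 0 is shared by two names
unset($alias);            // only $original[0] is bound to the content now

$copy = $original;        // by-value copy
$copy[0] = 123;           // writes to the copy only

var_dump($original[0]);   // string(1) "a"
var_dump($copy[0]);       // int(123)
```

Once the last extra name is unset, copies of the array no longer share the slot.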
 [2018-09-11 22:43 UTC] requinix@php.net
-Status: Open +Status: Not a bug
 [2018-09-11 22:43 UTC] requinix@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

> Creating a reference *to*
That is the fundamental misunderstanding here: references do not go "to" other variables. A reference means that two or more variable names refer to the same underlying data in PHP's internals. If you're familiar with the concepts, they're hardlinks, not symlinks.
http://php.net/manual/en/language.references.php
 