Bug #39384 PHP assumes that an object will not be used after serialize() is called on it
Submitted: 2006-11-04 20:59 UTC Modified: 2006-11-08 14:59 UTC
From: cw264701 at ohiou dot edu Assigned:
Status: Not a bug Package: Class/Object related
PHP Version: 5.2.0 OS: Ubuntu Linux
Private report: No CVE-ID: None
 [2006-11-04 20:59 UTC] cw264701 at ohiou dot edu
Description:
------------
PHP assumes that I will not use an object after serializing it.  This shouldn't cause problems if my object's class does not define a __sleep() function, but if it does, and that __sleep() function modifies the object, then I can't reliably use that object until it is recreated using unserialize().

There is no mention of this in the documentation for the serialize() function, or anywhere else that I saw.  More importantly, if PHP expects me to *not* use an object after calling serialize() on it, then PHP should produce an error message if I *do* try to use that object before unserialization.

This is one of several problems (not all necessarily "bugs", but shaky designs) that I've come across recently, which greatly reduce the ability of PHP applications to take advantage of *transparency*.  I.e., I should not have to care how a class is implemented (for instance, whether or not it uses the magic __sleep() function) to make use of it.

I recently adopted the ezpdo (http://ezpdo.net/) ORM tool.  It has probably hurt my productivity more than it has helped because it makes use of such leaky abstractions.  Some of these may be the fault of that tool, but many flaws like this seem to be more general PHP problems.  (Sorry for the rant, but I think issues like this are pretty important, and the reason I very often become frustrated with PHP.)

Reproduce code:
---------------
<?php

class MultiplicationTable {

  public $size;
  public $table;

  public function MultiplicationTable( $size ) {
    $this->size = $size;
    for( $a = 1; $a <= $size; ++$a ) {
      for( $b = 1; $b <= $size; ++$b ) {
        $this->table[$a][$b] = $a * $b;
      }
    }
  }

  public function __sleep() {
    $this->table = null;
    return( array("size") );
  }

  public function __wakeup() {
    $this->MultiplicationTable($this->size);
  }
}

$mt = new MultiplicationTable(4);
echo $mt->size . ", " . $mt->table[4][4] . "\n";
$serialized_mt = serialize($mt);
echo $mt->size . ", " . $mt->table[4][4] . "\n";
$unserialized_mt = unserialize($serialized_mt);
echo $unserialized_mt->size . ", " . $unserialized_mt->table[4][4] . "\n";

?>

Expected result:
----------------
Well, ideally the object would still "work" after creating a serialize()'d version of it, but I think making that work would require significant changes to PHP's whole serialization model (or perhaps you could just have __wakeup() be called right after serialization; perhaps only if the object is accessed again).  But, the more realistic solution would probably result in some kind of error message when I try to access my $mt object after calling serialize() on it.

Actual result:
--------------
4, 16
4,
4, 16

 [2006-11-06 12:27 UTC] s dot s at terra dot com dot br
The object is still behaving as expected. If you change the table attribute, then when you read it again you will certainly get the last value assigned, which is null in this case.

Try removing the line "$this->table = null;" and run the test again.

You don't need to set the attribute to null to serialize the object the way you are doing. The magic method __sleep is used to persist only the attributes you want, not the entire object ;)

From the docs:
-----------------
serialize() checks if your class has a function with the magic name __sleep. If so, that function is executed prior to any serialization. It **can** clean up the object and is supposed to return an array with the names of all variables of that object that **should** be serialized.
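
A minimal sketch of that suggestion, assuming a variant of the reporter's class in which __sleep() only names the properties to persist and does not touch the live object. The private build() helper and the use of __construct() are assumptions made for this illustration; they are not in the original report.

```php
<?php

// Variant of the reporter's class (illustrative only): __sleep() has no side effects.
class MultiplicationTable {

  public $size;
  public $table = array();

  public function __construct( $size ) {
    $this->size = $size;
    $this->build();
  }

  // Hypothetical helper shared by the constructor and __wakeup().
  private function build() {
    $this->table = array();
    for( $a = 1; $a <= $this->size; ++$a ) {
      for( $b = 1; $b <= $this->size; ++$b ) {
        $this->table[$a][$b] = $a * $b;
      }
    }
  }

  public function __sleep() {
    // Only list what should be stored; the live object is left alone.
    return( array("size") );
  }

  public function __wakeup() {
    // Rebuild the derived data that was not stored.
    $this->build();
  }
}

$mt = new MultiplicationTable(4);
$serialized_mt = serialize($mt);
// The original object is still fully usable after serialize().
echo $mt->size . ", " . $mt->table[4][4] . "\n";
$unserialized_mt = unserialize($serialized_mt);
echo $unserialized_mt->size . ", " . $unserialized_mt->table[4][4] . "\n";
```

With this variant both lines should print "4, 16", and the serialized string contains only the "size" property.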
 [2006-11-06 15:27 UTC] cw264701 at ohiou dot edu
Yes, I understand that I need not set the "table" attribute to null.  This was a bad example; sorry.

My complaint isn't about a specific case of using serialize() and the magic __sleep() function.  I am complaining that the whole concept is flawed.  The PHP documentation is encouraging users to define this __sleep() function and, thus, modify their objects before they are serialized.  This is kind of silly, because, when we serialize() an object, that exact object is not being serialized, *itself*, but a *serialized representation* of that object is being formed (and stored as a string).  The intent of the __sleep() function is good: to allow some control over what is actually stored for a serialized version of an object.  The problem is, it can have *side effects*.

This really becomes a problem when you are working with a class that you didn't write.  For example, I am using ezpdo.  I carelessly (which should be okay in this case) serialize()'d one of my ezpdo-mapped objects before I was finished using it; things blew up.  It shouldn't be "wrong" for me to do something like this:
  $_SESSION['purchase'] = serialize($myPurchaseObject);
  $smarty->assign('purchase', $myPurchaseObject);
There, I attempt to store away my object in the session, just before I pass it off to my template engine for view rendering.  Perhaps this technique would be considered bad practice (to a PHP guru/developer), but that shouldn't leave me in the dark with some broken code.  When I use a class, I want to program to its *interface*, not an implementation; I shouldn't have to care whether or not it happens to define a __sleep() function, and therefore cannot be used after it has been passed to a call to serialize().

I understand the value of __sleep(), but I think the whole serialization library/interface needs re-thinking.  I suggest one of three solutions:
 - When an object, O, is serialize()'d, PHP could create an exact copy of that object (a "shallow copy", I believe), let's say C.  PHP would then call __sleep() for the new object, C, and serialize that instance of the class.  This technique seems slightly risky, though, because we have to depend on the class' __sleep() function to not directly modify any of its referenced objects, but rather rely on those references to use their own __sleep() functions to do any cleaning up; the class shouldn't directly modify its references, but who knows...
 - Call an object's __wakeup() method after a serialized representation has been formed but before returning from the serialize() function.  This method might be wasteful in many cases (where the object was never again used after serialization), but better safe than sorry.  Perhaps the __wakeup() function could be called on the object only if an attempt is made to "use it" after the call to serialize().
 - The quick-n-dirty solution would be to, simply, cough up an error message if an object is referenced *after* a call to serialize() has been made on it - *regardless of whether the object has an associated __sleep() method*.
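
The first suggestion above can be approximated in user code today. The following is only a sketch: serialize_copy() is a hypothetical helper, not a PHP function, and CachedTable is a made-up class whose __sleep() has a side effect like the one in the report.

```php
<?php

// Made-up class for illustration; its __sleep() modifies the object.
class CachedTable {

  public $size;
  public $table = array();

  public function __construct( $size ) {
    $this->size = $size;
    $this->fill();
  }

  public function fill() {
    for( $a = 1; $a <= $this->size; ++$a ) {
      for( $b = 1; $b <= $this->size; ++$b ) {
        $this->table[$a][$b] = $a * $b;
      }
    }
  }

  public function __sleep() {
    $this->table = null;   // side effect on whichever instance is serialized
    return( array("size") );
  }

  public function __wakeup() {
    $this->fill();
  }
}

// Hypothetical helper: serialize a shallow copy so that __sleep()
// runs on the copy rather than on the caller's object.
function serialize_copy( $object ) {
  return serialize( clone $object );
}

$mt = new CachedTable(4);
$serialized = serialize_copy($mt);
// $mt was not passed to serialize() directly, so its table is intact.
echo $mt->size . ", " . $mt->table[4][4] . "\n";
$restored = unserialize($serialized);
echo $restored->size . ", " . $restored->table[4][4] . "\n";
```

Because clone makes a shallow copy, arrays such as $table are copied but nested objects are still shared with the original, so a __sleep() that modifies a referenced object would still leak through, which is the risk mentioned in that suggestion.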
 [2006-11-07 15:52 UTC] s dot s at terra dot com dot br
OK. You are right about the documentation. Reading it again, I got your point: it does encourage unsetting values of the object when the __sleep() method is called.

From documentation:
"*It can clean up the object* and is supposed to return an array with the names of all variables of that object that should be serialized."

Maybe this bug is related more to the documentation than to the serialization or object architecture.

About changing the serialization paradigm: some problems that can get you into trouble when serializing objects as "C" are resources and references to other objects.

Suppose your object holds a connection to a database. When you serialize it, this connection *should* be represented in some way or stored as a "C pointer". But when you unserialize, in another request to your httpd server, this resource *should* be restored, and if it is stored as a "C pointer" I think it will cause a segfault when you try to access this connection again, because of an attempted read on a memory segment that now holds other information, not the database connection reference, or that is owned by another process.

The same goes for object references, where you have other problems such as recursion and/or cyclic references.

I don't really know how to store the reference "state", and I think storing it can lead to problems too. Think of a file opened in read or write mode. If the parser is able to store its "state" and restore it, the prerequisite is that this file *should* exist and perhaps be unmodified, because of the internal file seek pointer. In the database case you need to take special care with the database user and password, or it can lead to a security issue if you try to save something like the PEAR DSN to represent the database connection.
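
For the file case, one common way around storing the handle's "state" is to store only what is needed to reopen it. This is a rough sketch under that assumption; the LogFile class and its members are made up for illustration, and it ignores the seek position and the case where the file no longer exists.

```php
<?php

// Made-up class for illustration: the handle is never serialized,
// only the path needed to reopen it.
class LogFile {

  public $path;
  public $handle;

  public function __construct( $path ) {
    $this->path = $path;
    $this->open();
  }

  private function open() {
    // Append mode, so reopening does not depend on a saved seek position.
    $this->handle = fopen($this->path, "a");
  }

  public function __sleep() {
    // Leave the resource out; the live object keeps its open handle.
    return( array("path") );
  }

  public function __wakeup() {
    // Acquire a fresh handle in the process that unserializes.
    $this->open();
  }
}

$path = tempnam(sys_get_temp_dir(), "log");
$log = new LogFile($path);
$copy = unserialize(serialize($log));
fwrite($copy->handle, "reopened\n");
fclose($copy->handle);
fclose($log->handle);
$contents = file_get_contents($path);
echo $contents;
unlink($path);
```

The same idea would apply to a database connection, with the caveat already raised above that whatever is stored to reconnect (user, password, DSN) ends up in the serialized string.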
 [2006-11-08 14:41 UTC] tony2001@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php


 [2006-11-08 14:59 UTC] cw264701 at ohiou dot edu
Okay, it's not a bug; it's a design flaw.  These kinds of issues are exactly the kind of crap that makes me frustrated with PHP and the reason I am going to try my damndest to steer clear of PHP in the future.

If you guys want to keep around the guru PHP developers, you need to give more thought to subtle but important problems like this one.
 