|  support |  documentation |  report a bug |  advanced search |  search howto |  statistics |  random bug |  login
Request #60941 Procedural serialization violating encapsulation, insecure and/or unclear
Submitted: 2012-01-31 23:27 UTC Modified: 2014-06-09 05:35 UTC
Avg. Score:3.0 ± 0.0
Reproduced:0 of 0 (0.0%)
From: chealer at gmail dot com Assigned:
Status: Open Package: *General Issues
PHP Version: Irrelevant OS:
Private report: No CVE-ID: None
Have you experienced this issue?
Rate the importance of this bug to you:

 [2012-01-31 23:27 UTC] chealer at gmail dot com
The procedural serialization offered via the built-in unserialize() and serialize() built-in functions has architectural issues that violate encapsulation and can cause vulnerabilities in certain cases. This wide report documents these and problems in unserialize()'s documentation, but remedies may not involve only documentation changes.

First, unserialize()'s documentation is very unclear:

unserialize — Creates a PHP value from a stored representation
mixed unserialize ( string $str )

unserialize() takes a single serialized variable and converts it back into a PHP value.


    The serialized string.

    If the variable being unserialized is an object, after successfully reconstructing the object PHP will automatically attempt to call the __wakeup() member function (if it exists).


The converted value is returned, and can be a boolean, integer, float, string, array or object.

In case the passed string is not unserializeable, FALSE is returned and E_NOTICE is issued.

None of this explains the format expected. The description simply assumes the string passed is "a single serialized variable". This does not even explain that the format is generated by serialize(), which is not referred to until the See also section.

This assumption hides the fact that the string given may not represent a valid instance of a class. unserialize() may create an object that a corresponding call to serialize() could never have serialized, since the deserialization simply imports data members bypassing class methods without validation.

Equally grave, the type of the value created by unserialize() is embedded in its input string. There is no other way to specify the expected type, which means that calling unserialize() on entirely untrusted data can lead to the creation of a value of any defined type, including all classes. Therefore, even though the description says unserialize() "creates a value", calling it can result in user code execution via the __wakeup() magic method, and indirectly via __destruct().

The combination of these issues makes unserialize() an attack vector, which - as shown in slides 32-45 from - is not simply theoretical.

This is not a request to document the serialization format. Although this format should probably be changed, the magical approach is fine as long as it is properly documented.

Serialization is mainly used with arrays and objects. For objects, PHP 5.1 offers a modern alternative in Serializable. However, I do not see a real equivalent for native arrays at first sight. In my uninformed opinion, the current procedural mechanism should probably be replaced/eliminated. If there are no plans for this already, I would recommend having a project discussion.

In the short term:
The examples of serialize() and unserialize() make no sense on their own. They should be grouped on the Object Serialization page:
That page already has one example. It should be the place to explain how serialize() and unserialize() can be used safely. unserialize()'s documentation should have warnings. As things stand, even a reader looking for security issues would likely not realize the problems. Currently, unserialize()'s documentation page doesn't directly link to Object Serialization.

unserialize() takes a single serialized variable and converts it back into a PHP value.
should read:
unserialize() takes a string representing a serialized value and converts it back into a PHP value.

The serialized string.
should read:
The string representing a serialized value.


Add a Patch

Pull Requests

Pull requests:

Add a Pull Request


AllCommentsChangesGit/SVN commitsRelated reports
 [2014-02-12 17:21 UTC]
I don't think that this bug should be kept as private, as this attack vector is pretty well-known.
I think it would be nice having a discussion about potentially changing this in the next major version, so feel free to bring this up on the mailing list.

The unserialize documentation also has a big red warning now:
"Do not pass untrusted user input to unserialize(). Unserialization can result in code being loaded and executed due to object instantiation and autoloading, and a malicious user may be able to exploit this. Use a safe, standard data interchange format such as JSON (via json_decode() and json_encode()) if you need to pass serialized data to the user."

But maybe we could be more explicit about how can such an exploit happen (__wakeup, etc.).
 [2014-02-12 21:26 UTC]
I've made RFC to mitigate it some time ago:
but it was not met with much enthusiasm. Maybe some time in the future it will receive more interest.
 [2014-02-17 05:17 UTC] chealer at gmail dot com
The warning is a huge improvement. I agree it should be more explicit on how unserialize() can be exploited. Perhaps more importantly, it should be more prominent than it is at the end of the page. I would see it above the changelog, maybe in the Parameters section when str is described.

I don't think I intentionally made this ticket private. In any case, I agree it should be public.

Thank you very much for the RFC, stas. It does allow safe unserialization of arrays, which completes the safe alternatives to unserialize() needed. This solves the problem for programmers who are aware of unserialize()'s risk, but not for others. A safe fix would simply require the second argument to default to false. There's a choice to make between backwards compatibility and security. I suggest an allow_unrestricted_unserialize directive to let administrators make that choice.

Otherwise, the question which remains is whether supporting an array as second argument is worth the extra complexity. In other words, how often is unserialize() called in a context where it may legitimately receive different classes (which are not subclasses of a common ancestor)? I haven't looked at enough use cases to answer that question.
 [2014-06-09 05:35 UTC]
-Type: Security +Type: Feature/Change Request
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Fri Apr 12 21:01:30 2024 UTC