Bug #52458 spl_object_hash() has strange behaviors with SimpleXML
Submitted: 2010-07-27 16:01 UTC Modified: 2010-07-28 14:18 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:1 (100.0%)
From: ivan dot enderlin at hoa-project dot net Assigned:
Status: Not a bug Package: SimpleXML related
PHP Version: 5.3SVN-2010-07-27 (SVN) OS:
Private report: No CVE-ID: None
 [2010-07-27 16:01 UTC] ivan dot enderlin at hoa-project dot net
Description:
------------
Hey :-),

spl_object_hash() has strange behaviors with SimpleXML.
As you know, SimpleXMLElement uses properties to access its collections of children, and array access to reach a specific child within a collection.
So, to reach an element, we write $sxe->element[0], for example. Unfortunately, according to spl_object_hash(), this appears to return a “new object” every time. Please see the code below.

Notice that $sxe->p sometimes has the same hash as $sxe->p[0]. And why does $sxe->p[0] have a different hash most of the time?

Test script:
---------------
<?php

$xml = '<?xml version="1.0" encoding="utf-8"?>' . "\n\n" .
       '<page>' . "\n" .
       '  <p>Foobar</p>' . "\n" .
       '</page>';


$sxe = simplexml_load_string($xml);

function f ( $e ) {

    return spl_object_hash($e);
}

var_dump(f($sxe->p));
var_dump(f($sxe->p));
var_dump(f($sxe->p[0]));
var_dump(f($sxe->p[0]));
var_dump(f($sxe->p[0]));
var_dump(f($sxe->p[0]));
var_dump(f($sxe->p[0]));
var_dump(f($sxe->p[0]));

echo "\n" . 'Light!' . "\n\n";

function g ( $e ) {

    return substr(f($e), 14, 2);
}

var_dump(g($sxe->p));
var_dump(g($sxe->p));
var_dump(g($sxe->p[0]));
var_dump(g($sxe->p[0]));
var_dump(g($sxe->p[0]));
var_dump(g($sxe->p[0]));
var_dump(g($sxe->p[0]));
var_dump(g($sxe->p[0]));

Expected result:
----------------
string(32) "000000005cddd92f000000002401191e"
string(32) "000000005cddd92f000000002401191e"
string(32) "000000005cddd929000000002401191e"
string(32) "000000005cddd929000000002401191e"
string(32) "000000005cddd929000000002401191e"
string(32) "000000005cddd929000000002401191e"
string(32) "000000005cddd929000000002401191e"
string(32) "000000005cddd929000000002401191e"

Light!

string(2) "2f"
string(2) "2f"
string(2) "29"
string(2) "29"
string(2) "29"
string(2) "29"
string(2) "29"
string(2) "29"

Actual result:
--------------
string(32) "000000005cddd92f000000002401191e"
string(32) "000000005cddd92f000000002401191e"
string(32) "000000005cddd929000000002401191e"
string(32) "000000005cddd92e000000002401191e"
string(32) "000000005cddd92f000000002401191e"
string(32) "000000005cddd929000000002401191e"
string(32) "000000005cddd92e000000002401191e"
string(32) "000000005cddd92f000000002401191e"

Light!

string(2) "2f" // p
string(2) "2f" // p, same hash, phew
string(2) "29" // p[0], ok
string(2) "2e" // p[0], huh?
string(2) "2f" // p[0], has the same hash as p… why?
string(2) "29" // p[0], like the first p[0]
string(2) "2e" // p[0], we have a loop here
string(2) "2f" // p[0], definitely, we have a loop.

 [2010-07-27 20:28 UTC] runpac314 at gmail dot com
The hash is computed in part from the object handle that the zval points to.

In the test script above, the zvals are destroyed when f() and g() return, because
nothing else references them.
When a SimpleXMLElement child is accessed through a property or dimension read
(-> or []), a new zval of type IS_OBJECT is created and given a new object handle.
These handles are recycled as they become available in the object store.

I do not think it's a bug. It's just how SimpleXML gives the user access to the
internal libxml nodes.

The code below produces two distinct hashes, one for each variable ($a and $b).
They are different objects even though the underlying libxml node pointer is the same.

$a = $sxe->p;
$b = $sxe->p[0];
var_dump(f($a));
var_dump(f($a));
var_dump(f($b));
var_dump(f($b));
var_dump(f($b));
var_dump(f($b));
var_dump(f($b));
var_dump(f($b));
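
For readers who want to run the workaround on its own, here is a minimal self-contained sketch of the same idea. It assumes the XML from the original test script; the variable names $hashesA and $hashesB are only illustrative. Because $a and $b keep their objects alive, each handle stays allocated and the hash stays stable for as long as the variable exists.

```php
<?php
// Sketch: holding a reference to each SimpleXMLElement keeps its object
// handle allocated, so spl_object_hash() stays stable while the variable lives.
$xml = '<?xml version="1.0" encoding="utf-8"?>' . "\n" .
       '<page>' . "\n" .
       '  <p>Foobar</p>' . "\n" .
       '</page>';

$sxe = simplexml_load_string($xml);

$a = $sxe->p;     // kept alive by $a
$b = $sxe->p[0];  // a second, distinct object wrapping the same libxml node

// Illustrative names: collect a few hashes per variable.
$hashesA = array(spl_object_hash($a), spl_object_hash($a));
$hashesB = array(spl_object_hash($b), spl_object_hash($b), spl_object_hash($b));

var_dump($hashesA[0] === $hashesA[1]);                                // bool(true)
var_dump($hashesB[0] === $hashesB[1] && $hashesB[1] === $hashesB[2]); // bool(true)
var_dump($hashesA[0] !== $hashesB[0]);                                // bool(true)
```

The last line shows the other half of the explanation: two live wrappers never share a hash, even when they wrap the same node.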
 [2010-07-28 11:22 UTC] ivan dot enderlin at hoa-project dot net
That's what I thought, actually. But is it normal for SimpleXML to create a new zval on every node access? I don't think so.

Would it be hard to fix? Would it cause a memory problem? The XML tree is already built in memory by libxml, so if SimpleXML stored the zvals, I don't see any problem.

Your fix is smart, and I will use it as a workaround for now. I hope to hear more about this.
 [2010-07-28 14:18 UTC] johannes@php.net
-Status: Open
+Status: Bogus
 [2010-07-28 14:18 UTC] johannes@php.net
As runpac314 said, this is expected, and we can't "fix" this in a good way.
 [2010-07-28 15:25 UTC] ivan dot enderlin at hoa-project dot net
Why can't you fix this in a good way? What does that mean :-)?
 [2010-07-28 17:36 UTC] ivan dot enderlin at hoa-project dot net
OK, I understand the problem well now. So, how can I identify an XML node precisely? Thanks.
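
The thread does not answer this question. One possible approach, offered only as a sketch and not as an answer from the PHP developers, is to compare the underlying DOM nodes rather than the SimpleXML wrapper objects. It assumes the DOM extension is available alongside SimpleXML; same_xml_node() is a hypothetical helper name, and the two-paragraph XML is made up for the example.

```php
<?php
// Sketch: identify nodes by the underlying DOM node, not by the wrapper object.
// same_xml_node() is a hypothetical helper, not part of PHP.
function same_xml_node(SimpleXMLElement $x, SimpleXMLElement $y)
{
    $dx = dom_import_simplexml($x);
    $dy = dom_import_simplexml($y);

    // DOMNode::isSameNode() compares the nodes themselves.
    return $dx->isSameNode($dy);
}

// Made-up document with two <p> children.
$sxe = simplexml_load_string('<page><p>Foobar</p><p>Baz</p></page>');

var_dump(same_xml_node($sxe->p[0], $sxe->p[0])); // bool(true): same node
var_dump(same_xml_node($sxe->p[0], $sxe->p[1])); // bool(false): different nodes

// DOMNode::getNodePath() gives an XPath-style location that can be used
// as a printable key for a node within one unchanged document.
echo dom_import_simplexml($sxe->p[1])->getNodePath(), "\n";
```

A node path only identifies a node while the tree is unchanged; inserting or removing siblings can change it.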