Bug #79701 getElementById does not correctly work with duplicate definitions
Submitted: 2020-06-15 02:17 UTC Modified: 2020-06-15 08:23 UTC
From: beberlei@php.net Assigned:
Status: Verified Package: DOM XML related
PHP Version: 7.2.31 OS:
Private report: No CVE-ID: None
 [2020-06-15 02:17 UTC] beberlei@php.net
Description:
------------
A DOM element ID is supposed to be unique, but sometimes the world is messy. The DOM specification says:

> The getElementById(elementId) method, when invoked, must return the first element,
> in tree order, within this’s descendants, whose ID is elementId, and null if there
> is no such element otherwise. 

PHP uses libxml's ID map functionality, which does not allow duplicates. The return value of "xmlAddID" is not checked for errors, so elements with duplicate IDs do not raise an error.

However, if you remove the first element with a given ID, or re-order the elements, the specification's requirement of returning the first element in tree order is no longer met.

Test script:
---------------
<?php

$dom = new DOMDocument();
$root = $dom->createElement('html');
$dom->appendChild($root);

$el1 = $dom->createElement('p1');
$el1->setAttribute('id', 'foo');
$el1->setIdAttribute('id', true);

$root->appendChild($el1);

$el2 = $dom->createElement('p2');
$el2->setAttribute('id', 'foo');
$el2->setIdAttribute('id', true);

$root->appendChild($el2);
unset($el1, $el2);

$root->removeChild($dom->getElementById('foo'));

var_dump($dom->getElementById('foo'));


Expected result:
----------------
Returns Element <p2 id="foo" />

Actual result:
--------------
NULL
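
The tree-order lookup quoted from the DOM specification above can be sketched in userland as a pre-order walk. This is only an illustration of the expected behavior, not how ext/dom or libxml implement it; the function name firstElementById and the choice to compare a plain attribute (default 'id') are assumptions made for this sketch.

```php
<?php

// Hypothetical helper (not part of ext/dom): returns the first element in
// tree order below $node whose attribute $attr equals $id, or null.
function firstElementById(DOMNode $node, string $id, string $attr = 'id'): ?DOMElement
{
    for ($child = $node->firstChild; $child !== null; $child = $child->nextSibling) {
        if (!$child instanceof DOMElement) {
            continue; // skip text nodes, comments, etc.
        }
        // Pre-order: test the element itself before its descendants.
        if ($child->hasAttribute($attr) && $child->getAttribute($attr) === $id) {
            return $child;
        }
        $found = firstElementById($child, $id, $attr);
        if ($found !== null) {
            return $found;
        }
    }
    return null;
}

$dom = new DOMDocument();
$dom->loadXML('<html><p1 id="foo"/><p2 id="foo"/></html>');

// Remove the first match, then look the ID up again.
$dom->documentElement->removeChild(firstElementById($dom, 'foo'));
$second = firstElementById($dom, 'foo');
echo $second === null ? 'null' : $second->nodeName, PHP_EOL; // prints "p2"
```

Because the walk inspects the live tree on every call, it does not depend on the ID map, at the cost of a full traversal per lookup.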

 [2020-06-15 08:23 UTC] cmb@php.net
-Status: Open
+Status: Verified
 [2020-06-15 08:23 UTC] cmb@php.net
Confirmed: <https://3v4l.org/Ols6m>
 [2024-05-21 08:46 UTC] dennis6101990leon at outlook dot com
It seems you’ve encountered a known issue when dealing with non-unique IDs within a DOM in PHP. The getElementById method is meant to return the first element with the specified ID. However, libxml's ID map does not allow duplicates, so once the first element is removed the lookup returns NULL instead of the next element with that ID.

Here’s a workaround for this issue: instead of relying on getElementById, you can use DOMXPath to retrieve all elements whose id attribute has the same value and then handle them accordingly. Here’s an example using DOMXPath::query:

<?php

$dom = new DOMDocument();
$root = $dom->createElement('html');
$dom->appendChild($root);

$el1 = $dom->createElement('p');
$el1->setAttribute('id', 'foo');
$root->appendChild($el1);

$el2 = $dom->createElement('p');
$el2->setAttribute('id', 'foo');
$root->appendChild($el2);

// Use DOMXPath to get all elements whose id attribute is 'foo'
$xpath = new DOMXPath($dom);
$results = $xpath->query("//*[@id='foo']");

// Remove the first element with the ID 'foo'
$root->removeChild($results->item(0));

// Now, let's check if the second element with ID 'foo' is still accessible
var_dump($results->item(1));

?>

This script dumps the second <p> element with the ID 'foo', as expected. Remember that having multiple elements with the same ID is not valid HTML, so it’s best to avoid such situations when possible.
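
A variation on the workaround above is to re-run the XPath query after each mutation and take only the first match, which mimics the "first element in tree order" lookup. The helper name queryFirstById is hypothetical, and the sketch assumes the ID contains no double quote, since XPath 1.0 string literals cannot escape one.

```php
<?php

// Hypothetical helper: first element in document order whose "id" attribute
// equals $id, or null. Assumes $id contains no double quote.
function queryFirstById(DOMDocument $dom, string $id): ?DOMElement
{
    if (strpos($id, '"') !== false) {
        throw new InvalidArgumentException('ID must not contain a double quote');
    }
    $xpath = new DOMXPath($dom);
    // The parentheses make [1] select the first match in the whole document.
    $node = $xpath->query('(//*[@id="' . $id . '"])[1]')->item(0);
    return $node instanceof DOMElement ? $node : null;
}

$dom = new DOMDocument();
$root = $dom->appendChild($dom->createElement('html'));
foreach (['p1', 'p2'] as $name) {
    $el = $dom->createElement($name);
    $el->setAttribute('id', 'foo');
    $root->appendChild($el);
}

// Remove the first match, then query again against the modified tree.
$root->removeChild(queryFirstById($dom, 'foo'));
$remaining = queryFirstById($dom, 'foo');
echo $remaining === null ? 'null' : $remaining->nodeName, PHP_EOL; // prints "p2"
```

Querying afresh each time avoids holding on to a node list that was built before the tree changed.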
 
Last updated: Wed May 22 02:01:33 2024 UTC