Bug #79701 getElementById does not correctly work with duplicate definitions
Submitted: 2020-06-15 02:17 UTC Modified: 2020-06-15 08:23 UTC
From: beberlei@php.net Assigned:
Status: Verified Package: DOM XML related
PHP Version: 7.2.31 OS:
Private report: No CVE-ID: None
 [2020-06-15 02:17 UTC] beberlei@php.net
Description:
------------
A DOM element ID is supposed to be unique, but sometimes the world is messy. The DOM specification says:

> The getElementById(elementId) method, when invoked, must return the first element,
> in tree order, within this’s descendants, whose ID is elementId, and null if there
> is no such element otherwise. 

PHP uses libxml's ID map functionality, which does not allow duplicates. The return value of "xmlAddID" is not checked for errors, so registering an element with a duplicate ID fails silently and causes no visible problem at first.

However, if you remove the first element with a given ID, or re-order the elements, the specification's requirement of returning the first element in tree order is no longer met.
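
To make "first element in tree order" concrete, here is a minimal sketch of a lookup that walks the tree in pre-order instead of consulting the ID map. The function name firstElementById is hypothetical and not part of ext/dom, and the sketch assumes the ID attribute is literally named "id" rather than checking for ID-typed attributes.

```php
<?php

/**
 * Hypothetical helper (not part of ext/dom): return the first element, in
 * tree order, below $node whose "id" attribute equals $id, or null.
 * Assumption: the ID attribute is literally named "id".
 */
function firstElementById(DOMNode $node, string $id): ?DOMElement
{
    foreach ($node->childNodes as $child) {
        if (!$child instanceof DOMElement) {
            continue; // skip text nodes, comments, etc.
        }
        // Pre-order: test the element itself before its descendants.
        if ($child->hasAttribute('id') && $child->getAttribute('id') === $id) {
            return $child;
        }
        $found = firstElementById($child, $id);
        if ($found !== null) {
            return $found;
        }
    }
    return null;
}

// Same shape as the report's scenario: two elements sharing one ID.
$dom = new DOMDocument();
$root = $dom->appendChild($dom->createElement('html'));
foreach (['p1', 'p2'] as $name) {
    $el = $dom->createElement($name);
    $el->setAttribute('id', 'foo');
    $root->appendChild($el);
}

$root->removeChild(firstElementById($dom, 'foo')); // removes <p1>
echo firstElementById($dom, 'foo')->nodeName, "\n"; // prints "p2"
```

The walk costs time proportional to the size of the document on every call, which is presumably why an ID map is used in the first place, so this is only an illustration of the expected semantics, not a proposed fix.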

Test script:
---------------
<?php

$dom = new DOMDocument();
$root = $dom->createElement('html');
$dom->appendChild($root);

$el1 = $dom->createElement('p1');
$el1->setAttribute('id', 'foo');
$el1->setIdAttribute('id', true);

$root->appendChild($el1);

$el2 = $dom->createElement('p2');
$el2->setAttribute('id', 'foo');
$el2->setIdAttribute('id', true);

$root->appendChild($el2);
unset($el1, $el2);

$root->removeChild($dom->getElementById('foo'));

var_dump($dom->getElementById('foo'));


Expected result:
----------------
Returns Element <p2 id="foo" />

Actual result:
--------------
NULL

History

 [2020-06-15 08:23 UTC] cmb@php.net
-Status: Open +Status: Verified
 [2020-06-15 08:23 UTC] cmb@php.net
Confirmed: <https://3v4l.org/Ols6m>
 [2024-05-21 08:46 UTC] dennis6101990leon at outlook dot com
It seems you’ve encountered a known issue when dealing with non-unique IDs within a DOM in PHP. The getElementById method is indeed supposed to return the first element with the specified ID. However, when there are duplicates and the first element is removed, the lookup can fail because the ID map holds only one entry per ID and is not updated to point at the remaining element.

Here’s a workaround for this issue: instead of relying on getElementById, you can use other methods, such as getElementsByTagName or an XPath query, to retrieve all elements with the same ID and then handle them accordingly. Here’s an example using DOMXPath:

<?php

$dom = new DOMDocument();
$root = $dom->createElement('html');
$dom->appendChild($root);

$el1 = $dom->createElement('p');
$el1->setAttribute('id', 'foo');
$root->appendChild($el1);

$el2 = $dom->createElement('p');
$el2->setAttribute('id', 'foo');
$root->appendChild($el2);

// Use an XPath query to get all elements with the same ID, in document order
$xpath = new DOMXPath($dom);
$results = $xpath->query("//*[@id='foo']");

// Remove the first element with the ID 'foo'
$root->removeChild($results->item(0));

// Query again: the remaining element with ID 'foo' is now the first match
$results = $xpath->query("//*[@id='foo']");
var_dump($results->item(0));

?>

This script dumps the second <p> element with the ID ‘foo’, as expected. Remember that multiple elements with the same ID are not valid HTML, so it’s best to avoid such situations when possible.
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Tue May 21 14:01:33 2024 UTC