Bug #78221 DOMNode::normalize() doesn't remove empty text nodes
Submitted: 2019-06-27 21:15 UTC Modified: -
From: cananian at wikimedia dot org Assigned:
Status: Closed Package: DOM XML related
PHP Version: 7.3.6 OS:
Private report: No CVE-ID: None
 [2019-06-27 21:15 UTC] cananian at wikimedia dot org
Description:
------------
Empty text nodes are supposed to be removed by DOMNode::normalize():
* PHP documentation: "Remove empty text nodes" https://www.php.net/manual/en/domnode.normalize.php
* DOM level 2 spec: "There are neither adjacent Text nodes nor empty Text nodes." https://www.w3.org/TR/DOM-Level-2-Core/core.html#ID-normalize
* Latest WHATWG DOM spec: "If length is zero, then remove node and continue with the next exclusive Text node, if any." https://dom.spec.whatwg.org/#dom-node-normalize

The PHP implementation appears to combine adjacent Text nodes, but does not remove zero-length text nodes.

Test script:
---------------
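<?php
// loadHTML() is an instance method; calling it statically is deprecated
// in PHP 7 and throws an Error in PHP 8.
$doc = new DOMDocument();
$doc->loadHTML('<p id=x>foo</p>');
$p = $doc->getElementById('x');
// Empty the only child, leaving a zero-length text node behind.
$p->childNodes[0]->textContent = '';
$p->normalize();
# This should print 0, but on affected versions it prints 1.
var_dump($p->childNodes->length);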


Expected result:
----------------
int(0)

Actual result:
--------------
int(1)
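
Possible workaround (a sketch, not part of the original report): on affected versions, empty text nodes can be removed in userland before calling normalize(). The helper name removeEmptyTextNodes() is hypothetical and is not part of ext/dom; it only relies on DOMNode::removeChild() and the DOMCharacterData length property.

```php
<?php
// Hypothetical userland helper: strips zero-length text nodes from a subtree.
function removeEmptyTextNodes(DOMNode $node): void
{
    // Copy the live node list first, because it shifts while nodes are removed.
    foreach (iterator_to_array($node->childNodes) as $child) {
        if ($child->nodeType === XML_TEXT_NODE && $child->length === 0) {
            $node->removeChild($child);
        } elseif ($child->hasChildNodes()) {
            removeEmptyTextNodes($child);
        }
    }
}

$doc = new DOMDocument();
$doc->loadHTML('<p id=x>foo</p>');
$p = $doc->getElementById('x');
$p->childNodes[0]->textContent = '';

removeEmptyTextNodes($p); // drop empty text nodes ourselves
$p->normalize();          // still merges adjacent text nodes

var_dump($p->childNodes->length);
```

With the helper in place the result should not depend on whether the installed PHP version already contains the fix.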

History

 [2019-09-22 22:51 UTC] beberlei@php.net
Confirmed; the code for this case is missing from dom_normalize in php_dom.c.
 [2020-03-11 12:09 UTC] cmb@php.net
The following pull request has been associated:

Patch Name: Fix #78221: DOMNode::normalize() doesn't remove empty text nodes
On GitHub:  https://github.com/php/php-src/pull/5254
Patch:      https://github.com/php/php-src/pull/5254.patch
 [2020-04-07 11:10 UTC] cmb@php.net
Automatic comment on behalf of cmbecker69@gmx.de
Revision: http://git.php.net/?p=php-src.git;a=commit;h=efec22b7bedfb1eae2df72b84cf5ad229e0bdc1e
Log: Fix #78221: DOMNode::normalize() doesn't remove empty text nodes
 [2020-04-07 11:10 UTC] cmb@php.net
-Status: Open +Status: Closed
 [2020-06-05 15:03 UTC] divinity76 at gmail dot com
I'm not entirely sure this should have been fixed in a minor release. It might break some HTML DOM traversal code in the wild, such as ->nextSibling->nextSibling->blah, and would therefore be a breaking change, wouldn't it?
 [2020-06-05 15:08 UTC] divinity76 at gmail dot com
s/minor release/patch release
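
To illustrate the concern above, here is a minimal sketch of traversal code that does not depend on whether normalize() leaves an empty text node between two elements. The helper name nextElement() and the element names are made up for the example; this is one possible defensive pattern, not something prescribed by the fix.

```php
<?php
// Hypothetical helper: skip any non-element siblings (including empty text
// nodes) instead of hard-coding ->nextSibling->nextSibling.
function nextElement(DOMNode $node): ?DOMElement
{
    for ($n = $node->nextSibling; $n !== null; $n = $n->nextSibling) {
        if ($n instanceof DOMElement) {
            return $n;
        }
    }
    return null;
}

$doc = new DOMDocument();
$div = $doc->appendChild($doc->createElement('div'));
$a = $div->appendChild($doc->createElement('a'));
$div->appendChild($doc->createTextNode('')); // empty text node between <a> and <b>
$b = $div->appendChild($doc->createElement('b'));

$div->normalize();

// Before the fix, $a->nextSibling is the empty text node; after the fix it
// is <b>. nextElement() finds <b> in both cases.
$found = nextElement($a);
var_dump($found->isSameNode($b));
```

Code written as $a->nextSibling->nextSibling would reach <b> only while the empty text node survives normalize(), which is the breakage this comment is worried about.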
 