Bug #78221 DOMNode::normalize() doesn't remove empty text nodes
Submitted: 2019-06-27 21:15 UTC
Modified: -
From: cananian at wikimedia dot org
Assigned:
Status: Open
Package: DOM XML related
PHP Version: 7.3.6
OS:
Private report: No
CVE-ID: None

 [2019-06-27 21:15 UTC] cananian at wikimedia dot org
Description:
------------
Empty text nodes are supposed to be removed by DOMNode::normalize():
* PHP documentation: "Remove empty text nodes" https://www.php.net/manual/en/domnode.normalize.php
* DOM level 2 spec: "There are neither adjacent Text nodes nor empty Text nodes." https://www.w3.org/TR/DOM-Level-2-Core/core.html#ID-normalize
* Latest WHATWG DOM spec: "If length is zero, then remove node and continue with the next exclusive Text node, if any." https://dom.spec.whatwg.org/#dom-node-normalize

The PHP implementation appears to combine adjacent Text nodes, but does not remove zero-length text nodes.

Test script:
---------------
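<?php
# loadHTML() is an instance method; calling it statically is deprecated
# in PHP 7 and throws an Error in PHP 8.
$doc = new DOMDocument();
$doc->loadHTML('<p id=x>foo</p>');
$p = $doc->getElementById('x');
$p->childNodes[0]->textContent = '';
$p->normalize();
# This should print 0.  But it prints 1.
var_dump($p->childNodes->length);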


Expected result:
----------------
int(0)

Actual result:
--------------
int(1)
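
Possible workaround:
--------------------
Until normalize() handles this case itself, a userland cleanup pass after the call may be enough. The sketch below is only an illustration: removeEmptyTextNodes() is a hypothetical helper name, not part of ext/dom, and it assumes that only plain Text nodes (not CDATA sections) should be dropped, following the WHATWG wording about exclusive Text nodes.

```php
<?php
/**
 * Hypothetical helper (not part of ext/dom): recursively remove
 * zero-length Text nodes below $node. Intended to be called right
 * after DOMNode::normalize().
 */
function removeEmptyTextNodes(DOMNode $node): void
{
    if (!$node->hasChildNodes()) {
        return;
    }
    // childNodes is a live list, so copy it before removing anything.
    foreach (iterator_to_array($node->childNodes) as $child) {
        if ($child->nodeType === XML_TEXT_NODE && $child->textContent === '') {
            $node->removeChild($child);
        } else {
            removeEmptyTextNodes($child);
        }
    }
}

// Same steps as the test script, plus the cleanup pass.
$doc = new DOMDocument();
$doc->loadHTML('<p id=x>foo</p>');
$p = $doc->getElementById('x');
$p->childNodes[0]->textContent = '';
$p->normalize();
removeEmptyTextNodes($p);
var_dump($p->childNodes->length);
```

The helper leaves the merging of adjacent Text nodes to normalize() and only adds the missing removal step, so it should be harmless on builds where normalize() already removes empty nodes.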

 [2019-09-22 22:51 UTC] beberlei@php.net
Confirmed. The code for this case is missing from dom_normalize in php_dom.c.
 