Bug #76712 Assignment of empty string creates extraneous text node
Submitted: 2018-08-06 14:22 UTC
Modified: 2018-08-25 12:25 UTC
From: frank dot mohaupt at brainworx dot audio
Assigned: cmb
Status: Closed
Package: SimpleXML related
PHP Version: 7.0.31
OS: Linux d45011cc91b7 4.9.93-boot2d
Private report: No
CVE-ID: None
 [2018-08-06 14:22 UTC] frank dot mohaupt at brainworx dot audio
Description:
------------
If a node "foo" is added with an empty value, SimpleXMLElement::asXML() serializes it as "<foo/>".

If the value of an existing node "foo" is overwritten with an empty string ($sxe->foo = ''), it is serialized as "<foo></foo>" instead of "<foo/>".



Test script:
---------------
<?php
$arrResults = [];

$sxe = new SimpleXMLElement('<foo></foo>');
$sxe->addChild('bar');
$arrResults['empty node'] = $sxe->asXML();

$sxe = new SimpleXMLElement('<foo></foo>');
$sxe->addChild('bar', '');
$arrResults['empty string'] = $sxe->asXML();

$sxe = new SimpleXMLElement('<foo></foo>');
$sxe->addChild('bar');
$sxe->bar = '';
$arrResults['overwritten empty string'] = $sxe->asXML();

foreach($arrResults as $key => $strResult){
   echo $key . ': ' . $strResult;
}
// output
// empty node: <?xml version="1.0"?>\n<foo><bar/></foo>\n
// empty string: <?xml version="1.0"?>\n<foo><bar/></foo>\n
// overwritten empty string: <?xml version="1.0"?>\n<foo><bar></bar></foo>\n


Expected result:
----------------
I expect to get the same result for all three cases.


Patches

workaround (last revision 2018-08-06 22:08 UTC by cmb@php.net)

Pull Requests

History

 [2018-08-06 22:08 UTC] cmb@php.net
The following patch has been added/updated:

Patch Name: workaround
Revision:   1533593319
URL:        https://bugs.php.net/patch-display.php?bug=76712&patch=workaround&revision=1533593319
 [2018-08-06 22:09 UTC] cmb@php.net
While I can confirm the reported behavior[1], I don't think this
qualifies as a bug, since both representations are equivalent.  It
rather seems to me that we're hitting a peculiarity of libxml
here, which treats an empty string as `content` differently in
xmlNewChild() and xmlNodeSetContentLen().  In the latter case a
text child node is created; in the former this doesn't happen.  We
could work around this (see the attached patch[2]), but I'm not
convinced that we should.

[1] <https://3v4l.org/mQaLd>
[2] <https://bugs.php.net/patch-display.php?bug=76712&patch=workaround&revision=1533593319>
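
The text child node described above can be made visible from userland. The following is only a minimal sketch: childNodeCount() is a hypothetical helper, not part of SimpleXML, and it relies on dom_import_simplexml() exposing the same underlying libxml node. The second count depends on whether the PHP build in use shows the reported behavior, so no output is given for it.

```php
<?php
// Hypothetical helper (not part of SimpleXML): counts the libxml child
// nodes of the element that $element refers to.
function childNodeCount(SimpleXMLElement $element): int
{
    return dom_import_simplexml($element)->childNodes->length;
}

$sxe = new SimpleXMLElement('<foo></foo>');
$sxe->addChild('bar');
$before = childNodeCount($sxe->bar);

// Overwrite the (already empty) node with an empty string.
$sxe->bar = '';
$after = childNodeCount($sxe->bar);

echo "children after addChild(): $before\n";
echo "children after assigning '': $after\n";
```

A count of 1 after the assignment would correspond to the extra (empty) text node, which is what makes the serializer emit a start/end tag pair.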
 [2018-08-06 23:24 UTC] requinix@php.net
The XML spec explicitly says an empty element (no content) can be written either as a start/end tag pair or as a self-closing tag. The two forms are definitely equivalent, though the spec later says one "SHOULD" use the self-closing tag when the element is declared EMPTY (e.g. in a DTD). For the record.

For comparison, in DOMNode
(a) Setting ->textContent uses xmlNodeSetContent("") + xmlNodeAddContent and produces a self-closed element
(b) Setting ->nodeValue uses xmlNodeSetContentLen and produces a start/end tag pair with empty content
https://3v4l.org/Of5p1

As I see it, SimpleXML is a library on top of libxml, one that goes beyond the thin abstraction layer PHP often provides, and libraries regularly do things like normalize the behavior of the underlying API. Personally I would go ahead and patch it.
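
To illustrate the equivalence of the two serializations, here is a small sketch that compares the two outputs from the report after canonicalization. canonicalXml() is a hypothetical helper introduced only for this example; it uses DOMDocument::C14N(), which always writes empty elements as start/end tag pairs, so both inputs should end up as the same string.

```php
<?php
// Hypothetical helper: returns the canonical (C14N) form of an XML string.
function canonicalXml(string $xml): string
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    return $doc->C14N();
}

$selfClosed = '<foo><bar/></foo>';       // "empty node" / "empty string" cases
$tagPair    = '<foo><bar></bar></foo>';  // "overwritten empty string" case

var_dump(canonicalXml($selfClosed) === canonicalXml($tagPair)); // bool(true)
```

Comparing canonical forms instead of raw asXML() output may be an option for code that only needs to know whether two documents are equivalent.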
 [2018-08-07 09:44 UTC] cmb@php.net
-Summary: Empty Tags are preserved if node value is overwritten with empty value
+Summary: Assignment of empty string creates extraneous text node
 [2018-08-07 09:44 UTC] cmb@php.net
Ah, that's interesting!  Apparently, xmlNodeSetContent() with an
empty content string does not create a text child node, while
xmlNodeSetContentLen() does.  That's pretty inconsistent.  Anyhow,
using xmlNodeSetContent() here would solve the reported issue, and
wouldn't make a difference otherwise, from what I can tell.

I have submitted <https://github.com/php/php-src/pull/3431>.
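
For code that has to run on PHP builds without such a fix, a userland workaround along the following lines might do. This is a sketch under assumptions, not the change in the pull request: setElementText() is a hypothetical helper that goes through dom_import_simplexml() and skips creating a text node when the new value is empty. Note that it removes all existing children of the element, child elements included.

```php
<?php
// Hypothetical userland helper, not the php-src patch: replaces the content
// of an element without leaving an empty text node behind.
function setElementText(SimpleXMLElement $element, string $value)
{
    $node = dom_import_simplexml($element);

    // Drop every existing child node (text and elements alike).
    while ($node->firstChild !== null) {
        $node->removeChild($node->firstChild);
    }

    // Only create a text node when there is something to put in it.
    if ($value !== '') {
        $node->appendChild($node->ownerDocument->createTextNode($value));
    }
}

$sxe = new SimpleXMLElement('<foo></foo>');
$sxe->addChild('bar', 'old');
setElementText($sxe->bar, '');
echo $sxe->asXML(); // <foo><bar/></foo> after the XML declaration
```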
 [2018-08-25 12:25 UTC] cmb@php.net
Automatic comment on behalf of cmbecker69@gmx.de
Revision: http://git.php.net/?p=php-src.git;a=commit;h=692e5d5c88a939a7c3ce3de61c5fd39effe7c7ae
Log: Fix #76712: Assignment of empty string creates extraneous text node
 [2018-08-25 12:25 UTC] cmb@php.net
-Status: Open
+Status: Closed
 [2018-08-25 12:25 UTC] cmb@php.net
-Assigned To:
+Assigned To: cmb
 