Bug #76712 Assignment of empty string creates extraneous text node
Submitted: 2018-08-06 14:22 UTC
Modified: 2018-08-25 12:25 UTC
From: frank dot mohaupt at brainworx dot audio
Assigned: cmb
Status: Closed
Package: SimpleXML related
PHP Version: 7.0.31
OS: Linux d45011cc91b7 4.9.93-boot2d
Private report: No
CVE-ID: None
 [2018-08-06 14:22 UTC] frank dot mohaupt at brainworx dot audio
Description:
------------
If the value of node "foo" is empty while adding the node, its output (SimpleXMLElement::asXML()) is "<foo/>".

If the value of node "foo" is overwritten with an empty value ($sxe->foo = ''), its output is "<foo></foo>" instead of "<foo/>".



Test script:
---------------
$sxe = new SimpleXMLElement('<foo></foo>');
$sxe->addChild('bar');
$arrResults['empty node'] = $sxe->asXML();

$sxe = new SimpleXMLElement('<foo></foo>');
$sxe->addChild('bar', '');
$arrResults['empty string'] = $sxe->asXML();

$sxe = new SimpleXMLElement('<foo></foo>');
$sxe->addChild('bar');
$sxe->bar = '';
$arrResults['overwritten empty string'] = $sxe->asXML();

foreach($arrResults as $key => $strResult){
   echo $key . ': ' . $strResult;
}
/* output ("\n" marks a line break; a block comment is used because "?>" would end a "//" comment)
   empty node: <?xml version="1.0"?>\n<foo><bar/></foo>\n
   empty string: <?xml version="1.0"?>\n<foo><bar/></foo>\n
   overwritten empty string: <?xml version="1.0"?>\n<foo><bar></bar></foo>\n
*/


Expected result:
----------------
I expect to get the same result for all three cases.
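
Until the serialization is unified, code that compares the output of asXML() byte by byte will see a difference between the cases. A minimal sketch of one way around that, assuming the DOM extension is available: compare canonical (C14N) forms instead of raw strings. The helper name canonicalXml() is hypothetical and not part of the report; the two input strings are copied from the output comments of the test script.

```php
<?php
// Serializations copied from the output comments of the test script.
$selfClosed = "<?xml version=\"1.0\"?>\n<foo><bar/></foo>\n";
$tagPair    = "<?xml version=\"1.0\"?>\n<foo><bar></bar></foo>\n";

// Hypothetical helper: parse a document and return its canonical (C14N) form.
function canonicalXml(string $xml): string
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    return $doc->C14N();
}

// Raw string comparison versus comparison of the canonical forms.
var_dump($selfClosed === $tagPair);
var_dump(canonicalXml($selfClosed) === canonicalXml($tagPair));
```

Canonical XML writes every empty element as a start/end tag pair, so the second comparison is expected to succeed even though the raw strings differ.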


Patches

workaround (last revision 2018-08-06 22:08 UTC by cmb@php.net)

Pull Requests

History

 [2018-08-06 22:08 UTC] cmb@php.net
The following patch has been added/updated:

Patch Name: workaround
Revision:   1533593319
URL:        https://bugs.php.net/patch-display.php?bug=76712&patch=workaround&revision=1533593319
 [2018-08-06 22:09 UTC] cmb@php.net
While I can confirm the reported behavior[1], I don't think this
qualifies as a bug, since both representations are equivalent.  It
rather seems to me that we're hitting a peculiarity of libxml
here, which treats an empty string as `content` differently in
xmlNewChild() and xmlNodeSetContentLen().  In the latter case a
text child node is created; in the former this doesn't happen.  We
could work around this (see the attached patch[2]), but I'm not
convinced that we should.

[1] <https://3v4l.org/mQaLd>
[2] <https://bugs.php.net/patch-display.php?bug=76712&patch=workaround&revision=1533593319>
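
The text child node mentioned above can be observed from userland. The following is a minimal sketch, assuming the DOM extension is loaded; countBarChildren() is a hypothetical helper, not something from the patch. It counts the DOM children of <bar> after each way of setting the empty value. The numbers depend on the PHP and libxml versions in use, so no output is shown here.

```php
<?php
// Hypothetical helper: count the DOM child nodes of the <bar> element.
function countBarChildren(SimpleXMLElement $sxe): int
{
    // dom_import_simplexml() exposes the same underlying libxml node as a DOMElement.
    $bar = dom_import_simplexml($sxe->bar);
    return $bar->childNodes->length;
}

// Case 1: empty string passed to addChild().
$added = new SimpleXMLElement('<foo></foo>');
$added->addChild('bar', '');

// Case 2: empty string assigned to an existing child.
$assigned = new SimpleXMLElement('<foo></foo>');
$assigned->addChild('bar');
$assigned->bar = '';

echo 'addChild:   ', countBarChildren($added), " child node(s)\n";
echo 'assignment: ', countBarChildren($assigned), " child node(s)\n";
```

A non-zero count in the assignment case would correspond to the extra (empty) text node described in the comment.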
 [2018-08-06 23:24 UTC] requinix@php.net
The XML spec explicitly says empty elements (no content) can be either a start/end tag pair or a self-closing tag. They are definitely equivalent, though it does later say one "SHOULD" use self-closed tags when the element is defined (like in a spec) to be empty. For the record.

For comparison, in DOMNode
(a) Setting ->textContent uses xmlNodeSetContent("") + xmlNodeAddContent and produces a self-closed element
(b) Setting ->nodeValue uses xmlNodeSetContentLen and produces a start/end tag pair with no content
https://3v4l.org/Of5p1
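
For readers who cannot open the link, here is a minimal sketch of what such a comparison might look like; it is not the linked script. The element names a and b are made up for illustration and map to cases (a) and (b) above. The result depends on the PHP version, so no output is shown.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<foo><a/><b/></foo>');

$a = $doc->getElementsByTagName('a')->item(0);
$b = $doc->getElementsByTagName('b')->item(0);

$a->textContent = ''; // write path (a)
$b->nodeValue = '';   // write path (b)

// Serialize the document and report how many child nodes each element has.
$xml = $doc->saveXML();
echo $xml;
echo 'a: ', $a->childNodes->length, " child node(s)\n";
echo 'b: ', $b->childNodes->length, " child node(s)\n";
```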

As I see it, SimpleXML is a library on top of libxml, one that goes beyond the simple abstraction layers PHP often provides, and libraries regularly do things like normalize the behavior of the underlying API. Personally, I would go ahead and patch it.
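
On PHP versions without such a patch, the same normalization can be approximated in userland. This is a rough sketch only, assuming the DOM extension is available; removeEmptyTextNodes() is a hypothetical helper and not the attached workaround patch. It walks the tree through the DOM view of the SimpleXML element and removes text nodes whose content is the empty string.

```php
<?php
// Hypothetical helper: recursively remove empty text nodes below $node.
function removeEmptyTextNodes(DOMNode $node)
{
    // Copy the live node list first; removing children while iterating it would skip nodes.
    foreach (iterator_to_array($node->childNodes) as $child) {
        if ($child->nodeType === XML_TEXT_NODE && (string) $child->nodeValue === '') {
            $node->removeChild($child);
        } elseif ($child->nodeType === XML_ELEMENT_NODE) {
            removeEmptyTextNodes($child);
        }
    }
}

$sxe = new SimpleXMLElement('<foo></foo>');
$sxe->addChild('bar');
$sxe->bar = '';

// SimpleXML and DOM share the underlying libxml tree, so the cleanup is visible to asXML().
removeEmptyTextNodes(dom_import_simplexml($sxe));
echo $sxe->asXML();
```

Where no empty text node exists, the helper changes nothing, so it should be safe to call on unaffected versions as well.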
 [2018-08-07 09:44 UTC] cmb@php.net
-Summary: Empty Tags are preserved if node value is overwritten with empty value
+Summary: Assignment of empty string creates extraneous text node
 [2018-08-07 09:44 UTC] cmb@php.net
Ah, that's interesting!  Apparently, xmlNodeSetContent() with an
empty content string does not create a text child node, while
xmlNodeSetContentLen() does.  That's pretty inconsistent.  Anyhow,
using xmlNodeSetContent() here would solve the reported issue, and
wouldn't make a difference otherwise, from what I can tell.

I have submitted <https://github.com/php/php-src/pull/3431>.
 [2018-08-25 12:25 UTC] cmb@php.net
Automatic comment on behalf of cmbecker69@gmx.de
Revision: http://git.php.net/?p=php-src.git;a=commit;h=692e5d5c88a939a7c3ce3de61c5fd39effe7c7ae
Log: Fix #76712: Assignment of empty string creates extraneous text node
 [2018-08-25 12:25 UTC] cmb@php.net
-Status: Open
+Status: Closed
 [2018-08-25 12:25 UTC] cmb@php.net
-Assigned To:
+Assigned To: cmb
 