Bug #76712 Assignment of empty string creates extraneous text node
Submitted: 2018-08-06 14:22 UTC
Modified: 2018-08-25 12:25 UTC
From: frank dot mohaupt at brainworx dot audio
Assigned: cmb
Status: Closed
Package: SimpleXML related
PHP Version: 7.0.31
OS: Linux d45011cc91b7 4.9.93-boot2d
Private report: No
CVE-ID: None
 [2018-08-06 14:22 UTC] frank dot mohaupt at brainworx dot audio
If the value of node "foo" is empty while adding the node, its output (SimpleXMLElement::asXML()) is "<foo/>".

If the value of node "foo" is overwritten with an empty value ($sxe->foo = ''), its output is "<foo></foo>" instead of "<foo/>".

Test script:
$sxe = new SimpleXMLElement('<foo></foo>');
$arrResults['empty node'] = $sxe->asXML();

$sxe = new SimpleXMLElement('<foo></foo>');
$sxe->addChild('bar', '');
$arrResults['empty string'] = $sxe->asXML();

$sxe = new SimpleXMLElement('<foo></foo>');
$sxe->bar = '';
$arrResults['overwritten empty string'] = $sxe->asXML();

foreach ($arrResults as $key => $strResult) {
    echo $key . ': ' . $strResult;
}
// output
// empty node: <?xml version="1.0"?>\n<foo><bar/></foo>\n
// empty string: <?xml version="1.0"?>\n<foo><bar/></foo>\n
// overwritten empty string: <?xml version="1.0"?>\n<foo><bar></bar></foo>\n

Expected result:
I expect to get the same result for all three cases.


workaround (last revision 2018-08-06 22:08 UTC by

 [2018-08-06 22:08 UTC]
The following patch has been added/updated:

Patch Name: workaround
Revision:   1533593319
 [2018-08-06 22:09 UTC]
While I can confirm the reported behavior[1], I don't think this
qualifies as a bug, since both representations are equivalent.  It
rather seems to me that we're hitting a peculiarity of libxml
here, which treats an empty string as `content` differently in
xmlNewChild() and xmlNodeSetContentLen().  In the latter case a
text child node is created; in the former this doesn't happen.  We
could work around this (see the attached patch[2]), but I'm not
convinced that we should.

[1] <>
[2] <>
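
One way to check the equivalence claim above is to compare the canonical form of both serializations. The following is only a sketch: the helper name canonicalXml() is made up for this illustration, and it assumes the DOM extension is available. DOMDocument::C14N() writes every empty element as a start/end tag pair, so both inputs should canonicalize to the same string.

```php
<?php
// Hypothetical helper for illustration: returns the Canonical XML form of a string.
function canonicalXml(string $xml): string
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    // C14N() serializes empty elements as <tag></tag>, regardless of the input form.
    return $doc->C14N();
}

$selfClosed = canonicalXml('<foo><bar/></foo>');
$tagPair    = canonicalXml('<foo><bar></bar></foo>');

// Reports whether the two representations are identical after canonicalization.
var_dump($selfClosed === $tagPair);
```

Comparing canonical forms rather than raw asXML() output keeps such a check independent of which of the two spellings libxml happens to emit.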
 [2018-08-06 23:24 UTC]
The XML spec explicitly says empty elements (no content) can be either a start/end tag pair or a self-closing tag. They are definitely equivalent, though it does later say one "SHOULD" use self-closed tags when the element is defined (like in a spec) to be empty. For the record.

For comparison, in DOMNode
(a) Setting ->textContent uses xmlNodeSetContent("") + xmlNodeAddContent and produces a self-closed element
(b) Setting ->nodeValue uses xmlNodeSetContentLen and produces an element with no content
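
The DOM comparison above can be reproduced with a short script. This is a sketch only; the element names a and b are placeholders chosen for this example, and the serialized output may differ between PHP and libxml versions, so no expected output is given here.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<foo><a/><b/></foo>');

$a = $doc->getElementsByTagName('a')->item(0);
$b = $doc->getElementsByTagName('b')->item(0);

// Case (a): assign an empty string through ->textContent.
$a->textContent = '';
// Case (b): assign an empty string through ->nodeValue.
$b->nodeValue = '';

// Print the serialized document and whether each element ended up with a child node.
echo $doc->saveXML();
var_dump($a->hasChildNodes(), $b->hasChildNodes());
```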

I see SimpleXML as a library on top of libxml, one that goes beyond the thin abstraction layer PHP often provides, and libraries regularly do things like normalize the behavior of the underlying API. Personally, I would go ahead and patch it.
 [2018-08-07 09:44 UTC]
-Summary: Empty Tags are preserved if node value is overwritten with empty value
+Summary: Assignment of empty string creates extraneous text node
 [2018-08-07 09:44 UTC]
Ah, that's interesting!  Apparently, xmlNodeSetContent() with an
empty content string does not create a text child node, while
xmlNodeSetContentLen() does.  That's pretty inconsistent.  Anyhow,
using xmlNodeSetContent() here would solve the reported issue, and
wouldn't make a difference otherwise, from what I can tell.

I have submitted <>.
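
For PHP versions without the fix, a userland workaround along the following lines might help. It is a sketch, not the attached patch: clearElementContent() is a hypothetical helper, and it assumes the DOM extension is loaded so that dom_import_simplexml() can be used. It removes all child nodes of the element, including the empty text node created by the assignment, after which the element should serialize as a self-closing tag.

```php
<?php
// Hypothetical helper: strips every child node from the given element.
function clearElementContent(SimpleXMLElement $element): void
{
    // Access the same underlying libxml node through the DOM API.
    $node = dom_import_simplexml($element);
    while ($node->firstChild !== null) {
        $node->removeChild($node->firstChild);
    }
}

$sxe = new SimpleXMLElement('<foo></foo>');
$sxe->bar = '';                  // may leave an empty text node on affected versions
clearElementContent($sxe->bar);  // drop it again via DOM

$xml = $sxe->asXML();
echo $xml;
```

On versions that already contain the fix the helper finds no child nodes and does nothing, so it should be safe to leave in place.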
 [2018-08-25 12:25 UTC]
Automatic comment on behalf of
Log: Fix #76712: Assignment of empty string creates extraneous text node
 [2018-08-25 12:25 UTC]
-Status: Open
+Status: Closed
 [2018-08-25 12:25 UTC]
-Assigned To:
+Assigned To: cmb