Bug #50156 Empty element should also return END_ELEMENT
Submitted: 2009-11-12 13:48 UTC Modified: 2009-11-25 16:08 UTC
Votes:1
Avg. Score:1.0 ± 0.0
Reproduced:0 of 1 (0.0%)
From: edwin at bitstorm dot org Assigned:
Status: Not a bug Package: XML Reader
PHP Version: 5.2SVN-2009-11-12 (SVN) OS: Ubuntu 9.04
Private report: No CVE-ID: None
 [2009-11-12 13:48 UTC] edwin at bitstorm dot org
Description:
------------
Reading <a></a> produces two nodes: one XMLReader::ELEMENT and one XMLReader::END_ELEMENT.

Reading <a/> produces only one node, an XMLReader::ELEMENT.

It should produce an XMLReader::END_ELEMENT as well, because the end of the element is implicit.

As it stands, you can't distinguish between <a><b> and <a/><b> from the node types alone, and that's a bug.

Reproduce code:
---------------
  $reader = new XMLReader();
  $reader->open($file);
  echo "<table>\n";
  while ($reader->read()) {
    echo "<tr><td>".$reader->nodeType."</td><td>".$reader->name."</td><td>".$reader->value."</td></tr>\n";
  }
  echo "</table>\n";


Input:

<Titles>
  <Title>
    <ID>429</ID>
    <Type />
    <Barcode>
    </Barcode>


Expected result:
----------------
1	Titles	
14	#text	
1	Title	
14	#text	
1	ID	
3	#text	429
15	ID	
14	#text	
1	Type	
14	#text	
15	Type	
14	#text	
1	Barcode	
14	#text	
15	Barcode
14	#text

Actual result:
--------------
1	Titles	
14	#text	
1	Title	
14	#text	
1	ID	
3	#text	429
15	ID	
14	#text	
1	Type	
14	#text	
1	Barcode	
14	#text	
15	Barcode
14	#text

 [2009-11-12 21:02 UTC] edwin at bitstorm dot org
<?php
// Source code for bug #50156
// The following code will output the text below, which
// is not what you expect when you see the xml.
//
// Barcode is not a child of Type, but how can you know?
//
//  Titles
//  Titles
//  Titles - Title
//  Titles - Title
//  Titles - Title - ID
//  Titles - Title - ID
//  Titles - Title
//  Titles - Title
//  Titles - Title - Type
//  Titles - Title - Type
//  Titles - Title - Type - Barcode
//  Titles - Title - Type
//  Titles - Title - Type
//  Titles - Title
//  Titles - Title
//  Titles

$xml = "
<Titles>
  <Title>
    <ID>429</ID>
    <Type/>
    <Barcode></Barcode>
  </Title>
</Titles>
";

$expected = "
Titles
Titles
Titles - Title
Titles - Title
Titles - Title - ID
Titles - Title - ID
Titles - Title
Titles - Title
Titles - Title - Type
Titles - Title
Titles - Title
Titles - Title - Barcode
Titles - Title
Titles - Title
Titles
Titles
";

$reader = new XMLReader();
$reader->xml($xml);
$actual = '';
// Make a stack for every element
$stack = array();
while ($reader->read()) {
  switch($reader->nodeType) {
    case XMLReader::ELEMENT:
      array_push($stack, $reader->name);
      break;
    case XMLReader::END_ELEMENT:
      array_pop($stack);
      break;
  }
  $actual .= join(' - ', $stack)."\n";
}

// Clean up and make it OS-agnostic
$expected = preg_replace('/\\r/', '', trim($expected));
$actual = preg_replace('/\\r/', '', trim($actual));

// Print result
echo "<h3>Expected</h3>\n";
echo "<pre>$expected</pre>\n";
echo "<h3>Actual</h3>\n";
echo "<pre>$actual</pre>\n";

// Test it
if ($expected == $actual) {
  echo "<strong>Good</strong>";
} else {
  echo "<strong>Not good</strong>";
}

?>
 [2009-11-12 21:14 UTC] edwin at bitstorm dot org
The XML parser used is libxml 2.7.3.
 [2009-11-24 16:14 UTC] edwin at bitstorm dot org
Turns out I can use the isEmptyElement property to detect when I'm dealing with an <a/> element.

This is a bit unfortunate because SAX, for example (the mother of all XML readers?), does not use this mechanism and works as I would expect.

It's just how libxml seems to work, so it should probably not be marked as a PHP bug.

This bug can be closed: "not a bug"... :-/

A little parsing example in the documentation might be a very good idea, though.
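
For anyone landing here with the same problem, a minimal sketch of the isEmptyElement approach follows. It records the element path each time an element opens and pops the stack immediately when the element is self-closing. The function name pathsFor() is made up for illustration, and the sketch assumes isEmptyElement is read before moving to any attributes.

```php
<?php
// Hypothetical helper: returns the path of open elements recorded at
// every element start, treating <a/> as opened and closed in one step.
function pathsFor(string $xml): array
{
    $reader = new XMLReader();
    $reader->XML($xml);

    $stack = [];
    $paths = [];
    while ($reader->read()) {
        switch ($reader->nodeType) {
            case XMLReader::ELEMENT:
                $stack[] = $reader->name;
                $paths[] = implode(' - ', $stack);
                // No END_ELEMENT will follow for <a/>, so pop right away.
                if ($reader->isEmptyElement) {
                    array_pop($stack);
                }
                break;
            case XMLReader::END_ELEMENT:
                array_pop($stack);
                break;
        }
    }
    $reader->close();

    return $paths;
}

$xml = '<Titles><Title><ID>429</ID><Type/><Barcode></Barcode></Title></Titles>';
print_r(pathsFor($xml));
```

With this check in place, Barcode is reported under Title rather than under Type.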
 [2009-11-25 16:08 UTC] rrichards@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

Behavior is based on the MS C# implementation. END_ELEMENT is not 
generated for empty elements.
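
If SAX-style paired events are preferred, one possible workaround is a thin wrapper that synthesizes the missing end event. The sketch below is an illustration only, not part of the XMLReader API: elementEvents() is a hypothetical generator that yields a 'start' and an 'end' pair for every element, including empty ones.

```php
<?php
// Hypothetical wrapper: yields ['start', name] and ['end', name] pairs,
// adding the 'end' event that XMLReader does not produce for <a/>.
function elementEvents(string $xml): Generator
{
    $reader = new XMLReader();
    $reader->XML($xml);

    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT) {
            // Read both values before yielding, while still on the element.
            $name  = $reader->name;
            $empty = $reader->isEmptyElement;
            yield ['start', $name];
            if ($empty) {
                yield ['end', $name];
            }
        } elseif ($reader->nodeType === XMLReader::END_ELEMENT) {
            yield ['end', $reader->name];
        }
    }
    $reader->close();
}

foreach (elementEvents('<a><b/><c></c></a>') as [$kind, $name]) {
    echo $kind, ' ', $name, "\n";
}
```

Callers can then treat <b/> and <c></c> identically, at the cost of no longer seeing which form the document used.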
 