Bug #50156 Empty element should also return END_ELEMENT
Submitted: 2009-11-12 13:48 UTC Modified: 2009-11-25 16:08 UTC
Votes:1
Avg. Score:1.0 ± 0.0
Reproduced:0 of 1 (0.0%)
From: edwin at bitstorm dot org Assigned:
Status: Not a bug Package: XML Reader
PHP Version: 5.2SVN-2009-11-12 (SVN) OS: Ubuntu 9.04
Private report: No CVE-ID: None

 [2009-11-12 13:48 UTC] edwin at bitstorm dot org
Description:
------------
An element written as <a></a> produces two nodes: one XMLReader::ELEMENT and one XMLReader::END_ELEMENT.

An empty element written as <a/> produces only one node, XMLReader::ELEMENT.

It should produce an XMLReader::END_ELEMENT too, because the end tag is implicit.

As it stands, node types alone cannot distinguish <a><b> from <a/><b>, and that's a bug.

Reproduce code:
---------------
  $reader = new XMLReader();
  $reader->open($file);
  echo "<table>\n";
  while ($reader->read()) {
    // One table row per node: node type, name, value
    echo "<tr><td>".$reader->nodeType."</td><td>".$reader->name."</td><td>".$reader->value."</td></tr>\n";
  }
  echo "</table>\n";


Input:

<Titles>
  <Title>
    <ID>429</ID>
    <Type />
    <Barcode>
    </Barcode>


Expected result:
----------------
1	Titles	
14	#text	
1	Title	
14	#text	
1	ID	
3	#text	429
15	ID	
14	#text	
1	Type	
14	#text	
15	Type	
14	#text	
1	Barcode	
14	#text	
15	Barcode
14	#text

Actual result:
--------------
1	Titles	
14	#text	
1	Title	
14	#text	
1	ID	
3	#text	429
15	ID	
14	#text	
1	Type	
14	#text	
1	Barcode	
14	#text	
15	Barcode
14	#text

 [2009-11-12 21:02 UTC] edwin at bitstorm dot org
<?php
// Source code for bug #50156
// The following code will output the text below, which
// is not what you expect when you see the xml.
//
// Barcode is not a child of Type, but how can you know?
//
//  Titles
//  Titles
//  Titles - Title
//  Titles - Title
//  Titles - Title - ID
//  Titles - Title - ID
//  Titles - Title
//  Titles - Title
//  Titles - Title - Type
//  Titles - Title - Type
//  Titles - Title - Type - Barcode
//  Titles - Title - Type
//  Titles - Title - Type
//  Titles - Title
//  Titles - Title
//  Titles

$xml = "
<Titles>
  <Title>
    <ID>429</ID>
    <Type/>
    <Barcode></Barcode>
  </Title>
</Titles>
";

$expected = "
Titles
Titles
Titles - Title
Titles - Title
Titles - Title - ID
Titles - Title - ID
Titles - Title
Titles - Title
Titles - Title - Type
Titles - Title
Titles - Title
Titles - Title - Barcode
Titles - Title
Titles - Title
Titles
Titles
";

$reader = new XMLReader();
$reader->xml($xml);
$actual = '';
// Make a stack for every element
$stack = array();
while ($reader->read()) {
  switch($reader->nodeType) {
    case XMLReader::ELEMENT:
      array_push($stack, $reader->name);
      break;
    case XMLReader::END_ELEMENT:
      array_pop($stack);
      break;
  }
  $actual .= join(' - ', $stack)."\n";
}

// Clean up and make it OS-agnostic
$expected = preg_replace('/\\r/', '', trim($expected));
$actual = preg_replace('/\\r/', '', trim($actual));

// Print result
echo "<h3>Expected</h3>\n";
echo "<pre>$expected</pre>\n";
echo "<h3>Actual</h3>\n";
echo "<pre>$actual</pre>\n";

// Test it
if ($expected == $actual) {
  echo "<strong>Good</strong>";
} else {
  echo "<strong>Not good</strong>";
}

?>
 [2009-11-12 21:14 UTC] edwin at bitstorm dot org
The XML parser used is libxml 2.7.3.
 [2009-11-24 16:14 UTC] edwin at bitstorm dot org
Turns out I can use the isEmptyElement property to detect when I'm dealing with an <a/> element.

This is a bit unfortunate because SAX (the mother of all XML readers?), for example, does not use this mechanism and works as I would expect.

It's just how libxml seems to work, so it should probably not be marked as a PHP-bug.

This bug can be closed: "not a bug"... :-/

A little parsing example in the documentation might be a very good idea, though.
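
For reference, a minimal sketch of the isEmptyElement workaround mentioned above. It keeps the same stack idea as the earlier test script, but pops immediately when the element is empty. The helper name elementPaths() is made up for this example and is not part of XMLReader.

```php
<?php
// Sketch: track the element path, treating <a/> as opened and closed at once.
// elementPaths() is a hypothetical helper, not an XMLReader method.
function elementPaths(string $xml): array
{
    $reader = new XMLReader();
    $reader->XML($xml);

    $stack = [];
    $paths = [];
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT) {
            $stack[] = $reader->name;
            $paths[] = implode(' - ', $stack);
            // No END_ELEMENT will follow an empty element, so pop right away.
            if ($reader->isEmptyElement) {
                array_pop($stack);
            }
        } elseif ($reader->nodeType === XMLReader::END_ELEMENT) {
            array_pop($stack);
        }
    }
    $reader->close();

    return $paths;
}

$xml = '<Titles><Title><ID>429</ID><Type/><Barcode></Barcode></Title></Titles>';
echo implode("\n", elementPaths($xml)), "\n";
```

With this check in place, Barcode should be reported as a child of Title rather than of Type.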
 [2009-11-25 16:08 UTC] rrichards@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

Behavior is based on the MS C# implementation. END_ELEMENT is not 
generated for empty elements.
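
Since END_ELEMENT is not generated for empty elements, code that wants SAX-style start/end pairs has to synthesize the end event itself. Below is one possible sketch, assuming a small generator wrapper; the name saxEvents() and the ['start'|'end', name] event shape are invented for illustration and are not part of the XMLReader API.

```php
<?php
// Sketch: wrap XMLReader so empty elements also yield an "end" event.
// saxEvents() and its event format are hypothetical, for illustration only.
function saxEvents(string $xml): Generator
{
    $reader = new XMLReader();
    $reader->XML($xml);

    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT) {
            // Capture both values before yielding, while still on the element node.
            $name = $reader->name;
            $empty = $reader->isEmptyElement;
            yield ['start', $name];
            if ($empty) {
                // Synthesize the end event that XMLReader does not report.
                yield ['end', $name];
            }
        } elseif ($reader->nodeType === XMLReader::END_ELEMENT) {
            yield ['end', $reader->name];
        }
    }
    $reader->close();
}

foreach (saxEvents('<a><b/><c></c></a>') as [$kind, $name]) {
    echo $kind, ' ', $name, "\n";
}
```

Consumers of the generator can then push on 'start' and pop on 'end' without checking isEmptyElement themselves.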
 