Bug #36379 SVG XML parsing loses child tags and attributes
Submitted: 2006-02-13 15:24 UTC Modified: 2008-09-14 01:00 UTC
Votes:8
Avg. Score:3.8 ± 1.4
Reproduced:6 of 7 (85.7%)
Same Version:4 (66.7%)
Same OS:2 (33.3%)
From: rele at gmx dot de Assigned:
Status: No Feedback Package: SimpleXML related
PHP Version: 5.2.4 OS: Windows XP SP2
Private report: No CVE-ID: None
 [2006-02-13 15:24 UTC] rele at gmx dot de
Description:
------------
I want to parse SVG XML code with SimpleXML, but under certain circumstances the parsed SimpleXMLElements are missing either their attributes or their child tags.

Reproduce code:
---------------
$test_simplexml_errors_svg = <<<EOD
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"[
  <!ENTITY E1 'font-size:7pt'>
]>
<svg id="svg_output" version="1.1" xmlns="http://www.w3.org/2000/svg" viewBox="-2 -7 1561 708">
  <text id="1a" style="&E1;" x="300" y="100">ParentText <tspan id="2a" dy="12" x="310">ChildText1</tspan><tspan id="3a" dy="12" x="350"> </tspan></text>
  <text id="1b" style="&E1;" x="400" y="200"><tspan id="2b" dy="12" x="410">ChildText1</tspan><tspan id="3b" dy="12" x="450"> </tspan></text>
  <text id="1c" style="&E1;" x="500" y="300"><tspan id="2c" dy="12" x="510">ChildText1</tspan><tspan id="3c" dy="12" x="550">ChildText2</tspan></text>
</svg>
EOD;
print_r(simplexml_load_string($test_simplexml_errors_svg));


Expected result:
----------------
SimpleXMLElement Object
(
    [@attributes] => Array
        (
            [id] => svg_output
            [version] => 1.1
            [viewBox] => -2 -7 1561 708
        )

    [text] => Array
        (
            [0] => SimpleXMLElement Object
                (
                    [@attributes] => Array
                        (
                            [id] => 1a
                            [style] => font-size:7pt
                            [x] => 300
                            [y] => 100
                        )

                    [0] => ParentText

                )

            [1] => SimpleXMLElement Object
                (
                    [@attributes] => Array
                        (
                            [id] => 1b
                            [style] => font-size:7pt
                            [x] => 400
                            [y] => 200
                        )

                    [tspan] => Array
                        (
                            [0] => SimpleXMLElement Object
                                (
                                    [@attributes] => Array
                                        (
                                            [id] => 2b
                                            [dy] => 12
                                            [x] => 410
                                        )

                                    [0] => ChildText1
                                )

                            [1] => SimpleXMLElement Object
                                (
                                    [@attributes] => Array
                                        (
                                            [id] => 3b
                                            [dy] => 12
                                            [x] => 450
                                        )

                                    [0] =>
                                )

                        )

                )

            [2] => SimpleXMLElement Object
                (
                    [@attributes] => Array
                        (
                            [id] => 1c
                            [style] => font-size:7pt
                            [x] => 500
                            [y] => 300
                        )

                    [tspan] => Array
                        (
                            [0] => SimpleXMLElement Object
                                (
                                    [@attributes] => Array
                                        (
                                            [id] => 2c
                                            [dy] => 12
                                            [x] => 510
                                        )

                                    [0] => ChildText1
                                )

                            [1] => SimpleXMLElement Object
                                (
                                    [@attributes] => Array
                                        (
                                            [id] => 3c
                                            [dy] => 12
                                            [x] => 550
                                        )

                                    [0] => ChildText2
                                )
                        )

                )

        )

)

Actual result:
--------------
SimpleXMLElement Object
(
    [@attributes] => Array
        (
            [id] => svg_output
            [version] => 1.1
            [viewBox] => -2 -7 1561 708
        )

    [text] => Array
        (
            [0] => ParentText 
            [1] => SimpleXMLElement Object
                (
                    [@attributes] => Array
                        (
                            [id] => 1b
                            [style] => font-size:7pt
                            [x] => 400
                            [y] => 200
                        )

                    [tspan] => Array
                        (
                            [0] => ChildText1
                            [1] => SimpleXMLElement Object
                                (
                                    [@attributes] => Array
                                        (
                                            [id] => 3b
                                            [dy] => 12
                                            [x] => 450
                                        )

                                    [0] =>  
                                )

                        )

                )

            [2] => SimpleXMLElement Object
                (
                    [@attributes] => Array
                        (
                            [id] => 1c
                            [style] => font-size:7pt
                            [x] => 500
                            [y] => 300
                        )

                    [tspan] => Array
                        (
                            [0] => ChildText1
                            [1] => ChildText2
                        )

                )

        )

)

 [2006-02-15 06:55 UTC] chregu@php.net
print_r (and var_dump) aren't reliable ways to check
what's "inside" a SimpleXML object.

Please check with an iterator whether something is really
missing, and provide a reproducible script.
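
A minimal sketch of the kind of iterator-based check being asked for. The input is a hypothetical, shortened document modelled on the report's SVG (without the DOCTYPE and namespace), not the original test case. It walks the tree with children() and casts each element to string instead of relying on print_r.

```php
<?php
// Hypothetical input, shortened from the report's SVG for illustration only.
$xml = simplexml_load_string(
    '<svg><text id="1a" x="300">ParentText <tspan id="2a">ChildText1</tspan>'
    . '<tspan id="3a"> </tspan></text></svg>'
);

$seen = [];
foreach ($xml->children() as $parent) {
    // The string cast returns the element's own text nodes, not its descendants' text.
    $seen[] = sprintf('%s %s = "%s"', $parent->getName(), $parent['id'], (string) $parent);
    foreach ($parent->children() as $child) {
        $seen[] = sprintf(' %s %s = "%s"', $child->getName(), $child['id'], (string) $child);
    }
}

echo implode("\n", $seen), "\n";
```

If both the parent's text and each child show up here, the data is present in the object even when print_r displays it differently.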
 [2006-05-10 20:11 UTC] rele at gmx dot de
Description:
------------
Maybe it will be clearer if I use type casting.
As the result output shows, no parent has an array index 0 holding the value of its own text node, whereas every child does.
This is most visible for parent 1a ("ParentText ") and parent 1c (" ").

The attributes are fine, either by direct access $parent['style'] or by using $parent->attributes().

But $parent[0] is always NULL.


Reproduce code:
---------------
$test_simplexml_errors_svg = <<<EOD
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"[
  <!ENTITY E1 'font-size:7pt'>
]>
<svg id="svg_output" version="1.1" xmlns="http://www.w3.org/2000/svg"
viewBox="-2 -7 1561 708">
  <text id="1a" style="&E1;" x="300" y="100">ParentText <tspan id="2a"
dy="12" x="310">ChildText1</tspan><tspan id="3a" dy="12" x="350">
</tspan></text>
  <text id="1b" style="&E1;" x="400" y="200"><tspan id="2b" dy="12"
x="410">ChildText1</tspan><tspan id="3b" dy="12" x="450">
</tspan></text>
  <text id="1c" style="&E1;" x="500" y="300"> <tspan id="2c" dy="12"
x="510">ChildText1</tspan><tspan id="3c" dy="12"
x="550">ChildText2</tspan></text>
</svg>
EOD;

$simplexml = simplexml_load_string($test_simplexml_errors_svg);
foreach($simplexml as &$parent) {
  echo 'parent ', $parent->getName(), ' ', $parent['id'], ' = "', (string) $parent, '"', "\n";
  print_r($parent);
  foreach($parent as $child) {
    echo ' child ', $child->getName(), ' ', $child['id'], ' = "', (string) $child, '"', "\n";
    print_r($child);
  }
}

Actual result:
--------------
parent text 1a = "ParentText "
SimpleXMLElement Object
(
    [@attributes] => Array
        (
            [id] => 1a
            [style] => font-size:7pt
            [x] => 300
            [y] => 100
        )
    [tspan] => Array
        (
            [0] => ChildText1
            [1] => SimpleXMLElement Object
                (
                    [@attributes] => Array
                        (
                            [id] => 3a
                            [dy] => 12
                            [x] => 350
                        )
                    [0] => 
                )
        )
)
 child tspan 2a = "ChildText1"
SimpleXMLElement Object
(
    [@attributes] => Array
        (
            [id] => 2a
            [dy] => 12
            [x] => 310
        )
    [0] => ChildText1
)
 child tspan 3a = "
"
SimpleXMLElement Object
(
    [@attributes] => Array
        (
            [id] => 3a
            [dy] => 12
            [x] => 350
        )
    [0] => 
)
parent text 1b = ""
SimpleXMLElement Object
(
    [@attributes] => Array
        (
            [id] => 1b
            [style] => font-size:7pt
            [x] => 400
            [y] => 200
        )
    [tspan] => Array
        (
            [0] => ChildText1
            [1] => SimpleXMLElement Object
                (
                    [@attributes] => Array
                        (
                            [id] => 3b
                            [dy] => 12
                            [x] => 450
                        )
                    [0] => 
                )
        )
)
 child tspan 2b = "ChildText1"
SimpleXMLElement Object
(
    [@attributes] => Array
        (
            [id] => 2b
            [dy] => 12
            [x] => 410
        )
    [0] => ChildText1
)
 child tspan 3b = "
"
SimpleXMLElement Object
(
    [@attributes] => Array
        (
            [id] => 3b
            [dy] => 12
            [x] => 450
        )
    [0] => 
)
parent text 1c = " "
SimpleXMLElement Object
(
    [@attributes] => Array
        (
            [id] => 1c
            [style] => font-size:7pt
            [x] => 500
            [y] => 300
        )
    [tspan] => Array
        (
            [0] => ChildText1
            [1] => ChildText2
        )
)
 child tspan 2c = "ChildText1"
SimpleXMLElement Object
(
    [@attributes] => Array
        (
            [id] => 2c
            [dy] => 12
            [x] => 510
        )
    [0] => ChildText1
)
 child tspan 3c = "ChildText2"
SimpleXMLElement Object
(
    [@attributes] => Array
        (
            [id] => 3c
            [dy] => 12
            [x] => 550
        )
    [0] => ChildText2
)
 [2006-07-27 02:00 UTC] sniper@php.net
Please try using this CVS snapshot:

  http://snaps.php.net/php5.2-latest.tar.gz
 
For Windows:
 
  http://snaps.php.net/win32/php5.2-win32-latest.zip


 [2006-07-27 05:44 UTC] rele at gmx dot de
I tried the code with the 5.2.0RC2-dev Win32 build (Date => Jul 27 2006 04:15:19), but it produced the same result, and $parent[0] was always NULL, too.


In addition I had to modify the second example code from 
  foreach($simplexml as &$parent) {
to
  foreach($simplexml as $parent) {
because it produced:
  PHP Fatal error: An iterator cannot be used with foreach by reference
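
A small sketch of why by-value iteration should be enough here, assuming a hypothetical two-element document: each loop variable is an object handle to the live node, so changes made through it are visible in the document without a reference.

```php
<?php
// Hypothetical input for illustration only.
$xml = simplexml_load_string('<t><p id="a">one</p><p id="b">two</p></t>');

// Iterating by value: $p is still a handle to the underlying node.
foreach ($xml->p as $p) {
    $p['seen'] = 'yes'; // adds an attribute to the real element
}

echo $xml->asXML();
```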
 [2007-06-26 06:53 UTC] rele at gmx dot de
I tried the snapshot on Windows XP SP2 with my two examples and got exactly the same output, so nothing seems to have changed.
 [2007-08-20 12:03 UTC] jani@php.net
You're still trying to do print_r/var_dump when you were asked to test using an iterator. Please fix your code. (And please, try to make it a lot shorter. It's very hard to read now.)
 [2007-09-13 10:44 UTC] rele at gmx dot de
Here is a simplified version, which only shows the missing child nodes (so I will check attributes again once this is working).
XPath seems to be working.

The dynamic $parent->$key[$i] access only works for $i=0; otherwise a "Notice: Uninitialized string offset" occurs.
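
A likely explanation for that notice is operator precedence rather than SimpleXML: in PHP 5, `$parent->$key[$i]` is evaluated as `$parent->{$key[$i]}`, so the offset is applied to the string `$key` first, and a one-character key such as 'c' has no offset 1. The sketch below, on a hypothetical two-child document, uses braces to make the intended grouping explicit.

```php
<?php
// Hypothetical input for illustration only.
$xml = simplexml_load_string('<t><p><c>CHILD1</c><c>CHILD2</c></p></t>');

$parent = $xml->p;
$key    = 'c';

$texts = [];
$count = count($parent->{$key});
for ($i = 0; $i < $count; $i++) {
    // Braces force "property first, then index", independent of the PHP version.
    $texts[] = (string) $parent->{$key}[$i];
}

echo implode(',', $texts), "\n";
```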


Reproduce code:
---------------

$test_simplexml_errors = <<<EOD
<?xml version="1.0"?>
<t>
  <p>PARENT<c>CHILD</c></p>
  <p><c>CHILD1</c><c> </c></p>
  <p> <c>CHILD1</c><c/></p>
  <p><c><d/><d/></c><c/></p>
</t>
EOD;

$simplexml = simplexml_load_string($test_simplexml_errors);

// print_r($simplexml);

foreach($simplexml as $parent) {
  echo $parent->asXML(), "\n"; //text node: '", (string) $parent, "'\n";

  if( ((string) $parent) && ! array_key_exists(0, $parent))
    echo "ERROR: No text child node with index 0 found\n";

  $children = $parent->children();
  $key = $children[0]->getName();
  if(! array_key_exists($key, $parent))
    echo "ERROR: No child node with key '$key' found\n";

  $child_count = count($parent->xpath($key));
  for($i = 0; $i < $child_count; $i++) {
    if(! array_key_exists($i, $parent->$key) && ($child_count != 1 || (string) $parent->$key) )
      echo "ERROR: No child node with index $i found\n";
      // echo "child $i direct: ", $parent->c[$i]->asXML(),"\n";
      // echo "child $i dynamic: ", $parent->$key[$i]->asXML(),"\n";
  }

  foreach($parent as $child) {
    // echo ' child: ', $child->asXML(), "\n child text node: '", (string) $child, "'\n";

    if($child->xpath('text()') && ! array_key_exists(0, $child)) {
      echo "ERROR: No text child element with index 0 found\n";
      // echo $child->asXML(), "\n";
    }

    $child_count = count($child->xpath('./*'));
    if($child_count) {
      $children = $child->children();
      $key = $children[0]->getName();
      for($i = 0; $i < $child_count; $i++) {
        if(! array_key_exists($i, $child->$key) && ($child_count != 1 || (string) $child->$key) )
          echo "ERROR: No sub child node with index $i found\n";
          // echo "sub child $i direct: ", $child->d[$i]->asXML(),"\n";
          // echo "sub child $i variable: ", $child->$key[$i]->asXML(),"\n";
      }
    }
  }
}
 [2008-09-06 16:01 UTC] jani@php.net
Once more: you need to provide a _short_ example script which clearly demonstrates the issue. For all I can tell, it's just your script that is buggy here.
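
For illustration, a short check of this kind might look like the sketch below. It is not the reporter's script: it takes only the first `<p>` from the simplified example and assumes the DOM extension is available, using dom_import_simplexml() to list the underlying child nodes next to what SimpleXML reports.

```php
<?php
// First <p> of the simplified example; everything else is omitted.
$xml = simplexml_load_string('<t><p>PARENT<c>CHILD</c></p></t>');
$p   = $xml->p;

$ownText    = (string) $p;           // text directly inside <p>
$childCount = count($p->children()); // element children only

// Look at the same node through DOM to see text and element nodes side by side.
$nodeNames = [];
foreach (dom_import_simplexml($p)->childNodes as $node) {
    $nodeNames[] = $node->nodeName;
}

echo $ownText, ' / ', $childCount, ' / ', implode(',', $nodeNames), "\n";
```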
 [2008-09-14 01:00 UTC] php-bugs at lists dot php dot net
No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
 