Bug #39973 Handling of & in XML file. Running php 5.1.2
Submitted: 2006-12-28 07:17 UTC
Modified: 2006-12-29 10:10 UTC
From: l dot chemwolo at heinosoft dot eu
Assigned:
Status: Not a bug
Package: XML related
PHP Version: 5CVS-2006-12-28 (snap)
OS: Ubuntu
Private report: No
CVE-ID: None
 [2006-12-28 07:17 UTC] l dot chemwolo at heinosoft dot eu
Description:
------------
When I parse an xml file containing <builder>Bill &amp; Joseph Cook</builder> I get Joseph Cook as the value between these tags. It drops anything that comes before &amp;. 

I am using php 5.1.2. I could see this on the section "PHP version" above so I just picked one of the listed.

Reproduce code:
---------------
I am using a class:
class CluistraParser extends XML_Parser{
...
}

to do the parsing based on the parser.php file.

Expected result:
----------------
To get 'Bill & Joseph Cook' as the value for the tag <builder>.

Actual result:
--------------
I get 'Joseph Cook'. Anything before &amp; is dropped and the ampersand does not appear.

 [2006-12-28 07:23 UTC] l dot chemwolo at heinosoft dot eu
Correction: I could not see version 5.1.2 in the "PHP version" list, so I picked one of the listed versions when submitting this bug report.
 [2006-12-28 08:00 UTC] l dot chemwolo at heinosoft dot eu
I saw a similar bug reported for PHP 5.0.3: http://bugs.php.net/bug.php?id=31139&edit=2.
 [2006-12-28 10:03 UTC] tony2001@php.net
Thank you for this bug report. To properly diagnose the problem, we
need a short but complete example script to be able to reproduce
this bug ourselves. 

A proper reproducing script starts with <?php and ends with ?>,
is max. 10-20 lines long and does not require any external 
resources such as databases, etc. If the script requires a 
database to demonstrate the issue, please make sure it creates 
all necessary tables, stored procedures etc.

Please avoid embedding huge scripts into the report.


 [2006-12-28 12:33 UTC] l dot chemwolo at heinosoft dot eu
Reproduce code:
---------------
<?php
require_once 'Parser.php';

class CluistraParser extends XML_Parser
{
    // Called for each chunk of character data; prints every chunk on its own line.
    function cdataHandler($parser, $data)
    {
        $data = trim($data);
        echo $data . "<br/>";
    }
}

$cluistra = new CluistraParser();
$cluistra->setInputFile("afile.xml");
$success = $cluistra->parse();
if (PEAR::isError($success)) {
    die('Parsing failed: ' . $success->getMessage());
}
/* afile.xml has:
    <?xml version='1.0' encoding='ISO-8859-1'?>
    <builder>Bill &amp; John Keen</builder> */
?>
 [2006-12-28 12:41 UTC] l dot chemwolo at heinosoft dot eu
It gives the following as the result:

Bill
&
John Keen

It seems to treat <builder>Bill &amp; John Keen</builder> as containing three cdata sections. The '&' therefore overwrites 'Bill', then 'John Keen' overwrites the '&', and I end up with 'John Keen' as the cdata value for this tag. That makes it look as if the ampersand and everything before it were dropped.
 [2006-12-28 12:44 UTC] tony2001@php.net
Please provide "a short but complete example script max. 10-20 lines long" which does not include non-existing files etc.
 [2006-12-28 20:56 UTC] rrichards@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

Character data can be delivered in chunks, so do not expect it all in one handler call.
 [2006-12-29 10:10 UTC] l dot chemwolo at heinosoft dot eu
Thanks for the hint. I will try and find the solution.
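
For readers who land on this report: one way to cope with chunked character data is to append each chunk to a buffer and only read the value when the element closes. The sketch below is an illustration, not the fix the reporter eventually used. It assumes PHP's built-in xml_* functions instead of the PEAR XML_Parser class from the report, it parses the XML from a string instead of afile.xml, and the class name BufferingHandler and its properties are hypothetical.

```php
<?php
// Sketch only: BufferingHandler is a hypothetical name, not part of PHP or PEAR.
class BufferingHandler
{
    public $values = array();   // finished element text, keyed by tag name
    public $chunks = 0;         // number of character data callbacks received
    private $buffer = '';       // text collected for the element being read

    public function startElement($parser, $name, $attrs)
    {
        // New element: start from an empty buffer.
        // (Simplification: text of a parent with mixed content is not kept.)
        $this->buffer = '';
    }

    public function characterData($parser, $data)
    {
        $this->chunks++;
        $this->buffer .= $data; // append each chunk, never overwrite
    }

    public function endElement($parser, $name)
    {
        // Only now is the text complete, so trim and store it here.
        $this->values[$name] = trim($this->buffer);
        $this->buffer = '';
    }
}

// Same content as afile.xml in the report, inlined so the script is self-contained.
$xml = "<?xml version='1.0' encoding='ISO-8859-1'?>\n"
     . "<builder>Bill &amp; John Keen</builder>";

$handler = new BufferingHandler();
$parser  = xml_parser_create();
// Keep tag names as written instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler(
    $parser,
    array($handler, 'startElement'),
    array($handler, 'endElement')
);
xml_set_character_data_handler($parser, array($handler, 'characterData'));

if (!xml_parse($parser, $xml, true)) {
    die('Parsing failed: ' . xml_error_string(xml_get_error_code($parser)));
}

echo $handler->values['builder'], "\n";
```

Because the chunks are concatenated before trimming, the spaces around the ampersand survive and the script prints Bill & John Keen, however many times the character data handler fires. The same idea should carry over to a cdataHandler method in an XML_Parser subclass, with the buffer reset and read in the element handlers.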