Bug #26650 Decimal entities
Submitted: 2003-12-17 10:18 UTC Modified: 2003-12-22 08:44 UTC
From: msw at seebi dot de Assigned:
Status: Not a bug Package: DOM XML related
PHP Version: 4.3.4 OS: Win XP Prof
Private report: No CVE-ID: None
 [2003-12-17 10:18 UTC] msw at seebi dot de
Description:
------------
When I create a new XML file with domxml_new_doc() that contains decimal entities (e.g. for the German characters ß, Ä, ö and so on) and save it to disk with dump_file(), the '&' in each decimal entity is converted into &amp;. This happens with PHP 4.3.0 and 4.3.4 (Win). PHP 4.3.1, 4.3.2 and 4.3.3 were not tested.

Reproduce code:
---------------
<?php
// String containing decimal character references for ß and Ä
$content = "&#223; - &#196;";

$dom = domxml_new_doc("1.0");
$root = $dom->add_root("list");

// Build <absatz id="10"><text>...</text></absatz>
$ab = $dom->create_element("absatz");
$ab->set_attribute("id", "10");
$text = $dom->create_element("text");
$node = $dom->create_text_node($content);
$text->append_child($node);
$ab->append_child($text);
$root->append_child($ab);

// Write the document to disk without compression or formatting
$dom->dump_file("test.xml", false, false);
?>


Expected result:
----------------
The file test.xml should look like this:

<?xml version="1.0"?>
<list><absatz id="10"><text>&#223; - &#196;</text></absatz></list>

Actual result:
--------------
The actual result is:

<?xml version="1.0"?>
<list><absatz id="10"><text>&amp;#223; - &amp;#196;</text></absatz></list>

 [2003-12-22 08:18 UTC] msw at seebi dot de
As you suggested, I've checked out the new stable release. The result is the same: each '&' character is converted into &amp;.
 [2003-12-22 08:44 UTC] moriyoshi@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

Any character that has a special meaning in an XML document, such as "&", is escaped during serialization. No specification requires entity-like strings inside a text node to be handled specially.
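
The escaping described above can be illustrated with a minimal sketch. It assumes the PHP 5+ DOM extension (DOMDocument) as a stand-in for the PHP 4 domxml functions used in the report; the variable names are illustrative only.

```php
<?php
// Sketch only: DOMDocument (PHP 5+) stands in for the PHP 4 domxml API.
$dom = new DOMDocument('1.0');
$text = $dom->createElement('text');

// The text node receives the literal characters & # 2 2 3 ;
$text->appendChild($dom->createTextNode('&#223;'));
$dom->appendChild($text);

// Serializing the element escapes the "&" so the text survives as written.
$serialized = $dom->saveXML($text);

// Parsing the serialized form gives back the original literal string.
$reloaded = new DOMDocument();
$reloaded->loadXML($serialized);
$roundTrip = $reloaded->documentElement->textContent;
```

Here the serialized element reads `<text>&amp;#223;</text>`, and reading it back yields the literal string `&#223;` again, which is why the serializer's behaviour is consistent rather than a bug.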

If you want to use strings that contain a special character which cannot be represented in the encoding your script uses, or which cannot be entered via your keyboard, you need to convert the string to UTF-8 first and then pass it to the DOM functions, because the DOM extension assumes that all strings passed to it are encoded in UTF-8.
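
One possible way to follow this advice is sketched below. It assumes the PHP 5+ DOM extension (DOMDocument) rather than the PHP 4 domxml API from the report, and it uses html_entity_decode() to turn the numeric references into real UTF-8 characters before they reach the DOM. The helper names decode_numeric_entities() and build_list_xml() are hypothetical and exist only for this illustration.

```php
<?php
// Sketch only: DOMDocument (PHP 5+) stands in for the PHP 4 domxml API.

// Hypothetical helper: convert character references such as &#223; into
// the corresponding UTF-8 characters. Note that html_entity_decode() also
// decodes named HTML entities such as &amp; and &lt;.
function decode_numeric_entities($s)
{
    return html_entity_decode($s, ENT_QUOTES, 'UTF-8');
}

// Hypothetical helper: build the same document as the reproduce code.
function build_list_xml($content)
{
    $dom = new DOMDocument('1.0', 'UTF-8');
    $root = $dom->appendChild($dom->createElement('list'));

    $ab = $dom->createElement('absatz');
    $ab->setAttribute('id', '10');

    $text = $dom->createElement('text');
    // The DOM receives real UTF-8 characters, not entity-like strings.
    $text->appendChild($dom->createTextNode(decode_numeric_entities($content)));

    $ab->appendChild($text);
    $root->appendChild($ab);

    return $dom->saveXML();
}

$xml = build_list_xml('&#223; - &#196;');
```

With the text decoded up front, the serialized document no longer contains `&amp;#223;`; the characters ß and Ä are stored in the text node themselves, and how they appear in the file depends on the output encoding of the document.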
 