Bug #26650 Decimal entities
Submitted: 2003-12-17 10:18 UTC Modified: 2003-12-22 08:44 UTC
From: msw at seebi dot de Assigned:
Status: Not a bug Package: DOM XML related
PHP Version: 4.3.4 OS: Win XP Prof
Private report: No CVE-ID: None
 [2003-12-17 10:18 UTC] msw at seebi dot de
Description:
------------
When I create a new XML file with domxml_new_doc() that contains decimal entities (e.g. for the German characters ß, Ä, ö and so on) and save it to disk with dump_file(), the '&' character in each decimal entity is converted into &amp;. This happens with PHP 4.3.0 and 4.3.4 (Windows). PHP 4.3.1, 4.3.2 and 4.3.3 were not tested.

Reproduce code:
---------------
<?php
// Text that contains two decimal entities (ß and Ä).
$content = "&#223; - &#196;";

$dom  = domxml_new_doc("1.0");
$root = $dom->add_root("list");

// Build <absatz id="10"><text>...</text></absatz>.
$ab = $dom->create_element("absatz");
$ab->set_attribute("id", "10");
$text = $dom->create_element("text");
// Separate variable for the node, so $content keeps the original string.
$node = $dom->create_text_node($content);
$text->append_child($node);
$ab->append_child($text);
$root->append_child($ab);

$dom->dump_file("test.xml", false, false);
?>


Expected result:
----------------
The file test.xml should look like this:

<?xml version="1.0"?>
<list><absatz id="10"><text>&#223; - &#196;</text></absatz></list>

Actual result:
--------------
The actual result is:

<?xml version="1.0"?>
<list><absatz id="10"><text>&amp;#223; - &amp;#196;</text></absatz></list>

History

 [2003-12-22 08:18 UTC] msw at seebi dot de
As you suggested, I've tried the new stable release. The result is the same: each '&' character is converted into &amp;.
 [2003-12-22 08:44 UTC] moriyoshi@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

Any character such as "&" that has a special meaning in an XML document is escaped during serialization. No specification requires entity-like strings in text content to be handled specially.
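
The escaping described above can be reproduced in a minimal sketch. It assumes the PHP 5+ DOM extension (DOMDocument) rather than the PHP 4 domxml functions used in the report, so the class and method names here are not the ones from the original script.

```php
<?php
// Assumption: PHP 5+ DOM extension, not the PHP 4 domxml API from the report.
$dom  = new DOMDocument('1.0');
$root = $dom->appendChild($dom->createElement('list'));

// A text node holds literal characters, so "&" here is data, not markup.
$root->appendChild($dom->createTextNode('&#223; - &#196;'));

// On serialization "&" is escaped, so a parser reads back the same string.
$xml = $dom->saveXML($root);
echo $xml, "\n"; // <list>&amp;#223; - &amp;#196;</list>
```

Parsing the serialized output again would give back the original string "&#223; - &#196;" as the text content, which is why the escaping is the expected behaviour.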

If you want to use strings containing a character that cannot be represented in the encoding your script uses, or that cannot be typed on your keyboard, convert the string to the appropriate encoding (UTF-8) first and then pass it to the DOM functions, because the DOM extension assumes that all strings passed to it are encoded in UTF-8.
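
One possible way to follow this advice is sketched below. It assumes the PHP 5.4+ function html_entity_decode() with the ENT_XML1 flag and the PHP 5+ DOMDocument class, neither of which is part of the PHP 4 domxml setup in the report. The decimal entities are decoded to UTF-8 characters before the string reaches the DOM, so no "&" is left to be escaped.

```php
<?php
// Assumption: PHP 5.4+ (ENT_XML1) and the PHP 5+ DOM extension.
$content = '&#223; - &#196;';

// Turn the numeric references into the UTF-8 characters they stand for.
$utf8 = html_entity_decode($content, ENT_QUOTES | ENT_XML1, 'UTF-8');

$dom  = new DOMDocument('1.0', 'UTF-8');
$root = $dom->appendChild($dom->createElement('list'));

// The DOM receives real UTF-8 characters instead of entity-like text.
$root->appendChild($dom->createTextNode($utf8));

$xml = $dom->saveXML($root);
echo $xml, "\n";
```

How the non-ASCII characters appear in the saved file depends on the document's output encoding, so the sketch makes no claim about the exact serialized bytes.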
 
PHP Copyright © 2001-2020 The PHP Group
All rights reserved.
Last updated: Tue Dec 01 17:01:25 2020 UTC