Bug #46835 saveHTML automatically replaces Unicode letters with entities
Submitted: 2008-12-11 12:16 UTC
Modified: 2008-12-12 04:57 UTC
From: kasparsj at gmail dot com
Assigned:
Status: Not a bug
Package: DOM XML related
PHP Version: 5.2.8
OS:
Private report: No
CVE-ID: None
 [2008-12-11 12:16 UTC] kasparsj at gmail dot com
Description:
------------
DOMDocument->saveHTML() replaces not only the predefined entities but also Unicode letters such as ā, š, ē with entity references. Is this expected behavior, or is it a bug?

This is related to:
http://bugs.php.net/bug.php?id=37878

Reproduce code:
---------------
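<?php
// Build a UTF-8 document containing non-ASCII letters.
$doc = new DOMDocument('1.0', 'UTF-8');
$doc->substituteEntities = false;
$doc->appendChild($doc->createElement('p', 'šaēeā'));
// Serialize as HTML; the letters come back as entity references.
var_dump($doc->saveHTML());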

Expected result:
----------------
<p>šaēeā</p>

Actual result:
--------------
<p>&copy;&scaron;a&#275;e&#257;</p>

 [2008-12-12 04:57 UTC] rrichards@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

This is the behavior exhibited by libxml2 when outputting in HTML format.
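
Since the entity output comes from libxml2 rather than from PHP itself, one possible workaround is to post-process the string returned by saveHTML(). The following is a minimal sketch, not an official fix: `decodeNonAsciiEntities()` is a hypothetical helper name, and it assumes PHP 7 or later. It decodes entity references back to UTF-8 but keeps any reference that stands for a single ASCII character (such as `&amp;` or `&lt;`), so the markup stays valid.

```php
<?php
// Hypothetical helper (not part of PHP or libxml2): convert entity
// references back into UTF-8 characters, leaving markup-significant
// escapes such as &amp;, &lt;, &gt; and &quot; untouched.
function decodeNonAsciiEntities(string $html): string
{
    $result = preg_replace_callback(
        // Named, decimal, and hexadecimal entity references.
        '/&(?:#\d+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);/',
        static function (array $m): string {
            $decoded = html_entity_decode($m[0], ENT_QUOTES | ENT_HTML401, 'UTF-8');
            // A one-byte result is an ASCII character: keep the original
            // reference. Unknown entities are returned unchanged anyway.
            return strlen($decoded) === 1 ? $m[0] : $decoded;
        },
        $html
    );

    // preg_replace_callback() returns null on failure; fall back to the input.
    return $result ?? $html;
}

// Entity form of the letters from the report, used here as sample input.
echo decodeNonAsciiEntities('<p>&scaron;a&#275;e&#257;</p>'), "\n";
// prints <p>šaēeā</p>
```

In practice the helper would be applied to the return value of `$doc->saveHTML()`. Whether that is needed at all may depend on the PHP and libxml2 versions in use.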
 [2013-07-13 01:06 UTC] wahabmirjan at yahoo dot com
As of July 12, 2013, almost 5 years after this problem was reported, it is still a problem. Please fix it.
 