Bug #46835 saveHTML automatically replaces unicode letters to entities
Submitted: 2008-12-11 12:16 UTC Modified: 2008-12-12 04:57 UTC
From: kasparsj at gmail dot com Assigned:
Status: Not a bug Package: DOM XML related
PHP Version: 5.2.8 OS:
Private report: No CVE-ID: None

 [2008-12-11 12:16 UTC] kasparsj at gmail dot com
Description:
------------
DOMDocument->saveHTML() converts not only the predefined entities but also Unicode letters such as ā, š, and ē into entities. Is this expected behavior, or is it a bug?

This is related to:
http://bugs.php.net/bug.php?id=37878

Reproduce code:
---------------
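<?php
// Build a UTF-8 document containing non-ASCII letters.
$doc = new DOMDocument('1.0', 'UTF-8');
$doc->substituteEntities = false;
$doc->appendChild($doc->createElement('p', 'šaēeā'));
// Serialize as HTML; the non-ASCII letters come back as entities.
var_dump($doc->saveHTML());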

Expected result:
----------------
<p>šaēeā</p>

Actual result:
--------------
<p>&copy;&scaron;a&#275;e&#257;</p>

 [2008-12-12 04:57 UTC] rrichards@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

This is the behavior libxml2 exhibits when outputting in HTML format.
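
For readers who need the original characters in the output, one possible workaround is to serialize the node with saveXML() instead of saveHTML(), since the XML serializer should leave non-ASCII characters as they are in a UTF-8 document. This is only a sketch under assumptions, not a fix endorsed in this report: the variable names are illustrative, and XML serialization differs from HTML in other ways (for example, empty elements are written in self-closing form), so it may not suit every document.

```php
<?php
// Same document as in the reproduce code above.
$doc = new DOMDocument('1.0', 'UTF-8');
$p = $doc->appendChild($doc->createElement('p', 'šaēeā'));

// Assumption: passing a node to saveXML() dumps just that node using the
// XML serializer, which does not apply the HTML entity substitution that
// saveHTML() inherits from libxml2.
$markup = $doc->saveXML($p);

var_dump($markup);
```

Passing the element to saveXML() also omits the XML declaration, which keeps the result close to what saveHTML() would print for the same fragment.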
 [2013-07-13 01:06 UTC] wahabmirjan at yahoo dot com
As of July 12, 2013, almost 5 years after this problem was reported, this is still a problem. Please fix it.