Bug #52712 html_entity_decode does not support all standard entities
Submitted: 2010-08-27 06:01 UTC
Modified: 2010-08-27 06:17 UTC
From: matias dot perrone at gmail dot com
Assigned:
Status: Not a bug
Package: Strings related
PHP Version: 5.2.14
OS: Windows 7
Private report: No
CVE-ID: None

 [2010-08-27 06:01 UTC] matias dot perrone at gmail dot com
Description:
------------
The function "html_entity_decode" does not decode all of the HTML entities
documented at http://www.w3.org/TR/html4/sgml/entities.html


Test script:
---------------
$sEntities = '&rsquo; &lsquo; &ldquo; &rdquo; &euro; &circ;';
echo "Start: ".$sEntities."\n";
$sEntities = html_entity_decode(($sEntities), ENT_QUOTES, "ISO-8859-1");
echo "Result: ".$sEntities;

Expected result:
----------------
Start: &rsquo; &lsquo; &ldquo; &rdquo; &euro; &circ;
Result: ’ ‘ “ ” € ˆ


Actual result:
--------------
Start: &rsquo; &lsquo; &ldquo; &rdquo; &euro; &circ;
Result: &rsquo; &lsquo; &ldquo; &rdquo; &euro; &circ;

 [2010-08-27 06:17 UTC] aharvey@php.net
-Status: Open +Status: Bogus
 [2010-08-27 06:17 UTC] aharvey@php.net
html_entity_decode() can only decode entities that exist in the given
character set. None of your example entities occur in ISO-8859-1,
therefore they have to be left as entities. To see this in action: if
you change the character set to ISO-8859-15, the &euro; entity does get
correctly decoded, since ISO-8859-15 added the € character to
ISO-8859-1.
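
The difference between the two character sets can be checked with a short script. This is only a sketch, assuming a reasonably current PHP build; the variable names are illustrative and not part of the original report. It decodes the same entity under both character sets and prints the second result as hex, because the decoded byte is not valid UTF-8 and may not display correctly in a terminal.

```php
<?php
// The entity under test, written out as literal entity text.
$entity = '&euro;';

// ISO-8859-1 has no euro sign, so the entity cannot be represented
// and is returned unchanged.
$latin1 = html_entity_decode($entity, ENT_QUOTES, 'ISO-8859-1');

// ISO-8859-15 assigns the euro sign to byte 0xA4, so the entity is
// decoded to that single byte.
$latin9 = html_entity_decode($entity, ENT_QUOTES, 'ISO-8859-15');

echo "ISO-8859-1:  ", $latin1, "\n";            // &euro;
echo "ISO-8859-15: ", bin2hex($latin9), "\n";   // a4
```

The other entities from the test script stay undecoded under ISO-8859-15 as well, since that character set does not contain curly quotes or the modifier circumflex.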

You'd be much better off using a Unicode character set like UTF-8,
since that can represent all of the characters defined by HTML
entities.
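
As a rough illustration of that suggestion, the test script from the report can be rerun with UTF-8 as the target character set. The sketch below assumes the same six entities as the report; the `$expected` string is spelled out as UTF-8 byte escapes so the comparison does not depend on the encoding of the source file.

```php
<?php
// The six entities from the original test script.
$input = '&rsquo; &lsquo; &ldquo; &rdquo; &euro; &circ;';

// UTF-8 can represent every character named by an HTML entity,
// so nothing has to be left undecoded.
$decoded = html_entity_decode($input, ENT_QUOTES, 'UTF-8');

// The same characters written as explicit UTF-8 byte sequences:
// U+2019, U+2018, U+201C, U+201D, U+20AC, U+02C6.
$expected = "\xE2\x80\x99 \xE2\x80\x98 \xE2\x80\x9C \xE2\x80\x9D \xE2\x82\xAC \xCB\x86";

var_dump($decoded === $expected);           // bool(true)
var_dump(strpos($decoded, '&') === false);  // bool(true): no entity left over
```

If the rest of the application still expects a single-byte encoding, the decoded string would need a separate conversion step, and characters missing from that encoding would again have no direct representation.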

Not a bug; closing.
 