Request #38316 html_entity_decode() unexpected results
Submitted: 2006-08-03 16:27 UTC Modified: 2014-07-13 01:01 UTC
From: raymond at rnamusic dot com Assigned: yohgaki (profile)
Status: Closed Package: *General Issues
PHP Version: 4.4.3 OS: Linux
Private report: No CVE-ID: None
 [2006-08-03 16:27 UTC] raymond at rnamusic dot com
Description:
------------
In all the example code, and in all PHP functions, I cannot find a simple snippet that will find HTML entities that are attached to characters (e.g. "e&#x301;", a Unicode construct) and decode them properly (to "é").

The string "Japrisot, Se&#x301;bastien" is simply ignored by html_entity_decode() and returned as is, with nothing changed.

The only solution seems to be writing a custom replacement function, which seems odd since html_entity_decode() purports to decode common entities.

If you work with MARC records, as I do, you come across these entities all the time.

Reproduce code:
---------------
<?php
$string = "Japrisot, Se&#x301;bastien";
$decoded = html_entity_decode($string);
echo $decoded;
?>

Expected result:
----------------
Japrisot, Se&#769;bastien

Actual result:
--------------
Japrisot, Se&#x301;bastien
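
One plausible explanation for the unchanged output, offered as an assumption because the report does not say which charset was in effect: if the target charset is a single-byte one such as ISO-8859-1, U+0301 has no representation in it, so the numeric entity cannot be decoded and is passed through. The sketch below passes the charset explicitly to show the difference. It reflects the behaviour of current PHP versions and was not checked against 4.4.3.

```php
<?php
$string = "Japrisot, Se&#x301;bastien";

// ISO-8859-1 has no code for U+0301 (COMBINING ACUTE ACCENT),
// so the numeric entity is left in the output untouched.
$latin1 = html_entity_decode($string, ENT_QUOTES, 'ISO-8859-1');

// UTF-8 can represent U+0301; the entity becomes the bytes 0xCC 0x81.
$utf8 = html_entity_decode($string, ENT_QUOTES, 'UTF-8');

var_dump($latin1 === $string);                       // entity survived
var_dump($utf8 === "Japrisot, Se\xCC\x81bastien");   // entity decoded
```

Comparing against byte escapes rather than a literal "é" keeps the check independent of how the source file itself is encoded.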

History

 [2006-08-03 16:28 UTC] raymond at rnamusic dot com
Note: your bug submission script translates my example ASCII char into an entity, so where you read "e&#769;" there should be a single ASCII character.

fyi.
 [2011-10-06 05:23 UTC] reg dot php at alf dot nu
The documentation (well, the signature at the top) claims the third argument
defaults to UTF-8; this is wrong.

You want

   html_entity_decode($string, ENT_QUOTES, 'UTF-8')
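
A minimal sketch of that call applied to the string from the report. Decoding with an explicit UTF-8 charset yields "e" followed by the combining accent U+0301, which is two code points. If the single precomposed "é" (U+00E9) is what is wanted, NFC normalization can compose the pair. The normalization step is an addition that nobody in this report suggested, and it assumes the optional intl extension is installed, hence the `class_exists()` guard.

```php
<?php
$string = "Japrisot, Se&#x301;bastien";

// Pass the charset explicitly instead of relying on the default.
$decoded = html_entity_decode($string, ENT_QUOTES, 'UTF-8');

// $decoded holds "e" + U+0301 (decomposed form).
// Normalizer comes from the intl extension, which may not be loaded.
if (class_exists('Normalizer')) {
    // NFC composes "e" + U+0301 into the single code point U+00E9.
    $composed = Normalizer::normalize($decoded, Normalizer::FORM_C);
} else {
    // Without intl, keep the decomposed form; it renders the same.
    $composed = $decoded;
}

echo $composed, "\n";
```

Both forms display as "Sébastien", but they differ byte for byte, which matters when comparing or indexing strings.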
 [2014-07-13 01:01 UTC] yohgaki@php.net
-Status: Open
+Status: Closed
-Package: Feature/Change Request
+Package: *General Issues
-Assigned To:
+Assigned To: yohgaki
 [2014-07-13 01:01 UTC] yohgaki@php.net
I think this could be closed. Please re-open if you still have a similar issue.