Request #52213 htmlspecialchars() encodes &amp; and &mdash; in a wrong way
Submitted: 2010-06-30 17:54 UTC Modified: 2010-06-30 18:53 UTC
From: tomas at matfyz dot cz Assigned:
Status: Not a bug Package: *General Issues
PHP Version: 5.2.13 OS: Linux niobe 2.6.25-gentoo-r8 #1
Private report: No CVE-ID: None
 [2010-06-30 17:54 UTC] tomas at matfyz dot cz
Description:
------------
The function htmlspecialchars() encodes the & character even if it is part of an HTML entity such as &amp; or &mdash;.

A workaround is also difficult because the function doesn't allow disabling replacement of the & symbol (I believe it should).

PHP version 



Test script:
---------------
echo htmlspecialchars("&amp;");
echo htmlspecialchars("&mdash;");

Expected result:
----------------
&amp;
&mdash;

Actual result:
--------------
&amp;amp;
&amp;mdash;

 [2010-06-30 17:57 UTC] rasmus@php.net
-Status: Open +Status: Bogus
 [2010-06-30 17:57 UTC] rasmus@php.net
That's what the double_encode parameter is for.  Set it to false and it won't 
double-encode.
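
For reference, a minimal sketch of the suggestion above, assuming PHP 5.2.3 or later (where the fourth parameter, double_encode, is available). The flags and charset are passed explicitly only because the fourth parameter is positional; the sample strings are illustrative.

```php
<?php
// Default behaviour: the ampersand of an existing entity is encoded again.
$default = htmlspecialchars("&mdash;", ENT_QUOTES, 'UTF-8');

// double_encode = false: entities already present in the input are left alone.
$kept = htmlspecialchars("&mdash;", ENT_QUOTES, 'UTF-8', false);

// A bare ampersand is still encoded, even with double_encode = false.
$mixed = htmlspecialchars("Tom & Jerry &amp; friends", ENT_QUOTES, 'UTF-8', false);

echo $default, "\n"; // &amp;mdash;
echo $kept, "\n";    // &mdash;
echo $mixed, "\n";   // Tom &amp; Jerry &amp; friends
```

This only makes sense when the input is known to be partly encoded HTML text already; for raw user input the default is the one that round-trips correctly.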
 [2010-06-30 18:22 UTC] tomas at matfyz dot cz
So why is it not the default? This is a problem with many PHP functions: the expected behaviour is not the default one (it is counterintuitive).

Or, if it is not the default value, there should at least be a red warning box in the documentation!
 [2010-06-30 18:23 UTC] tomas at matfyz dot cz
-Type: Bug +Type: Feature/Change Request
 [2010-06-30 18:23 UTC] tomas at matfyz dot cz
changing to feature request for the documentation
 [2010-06-30 18:53 UTC] rasmus@php.net
Because we want to default to the safest case.  It is not always safe to skip 
encoding a & even if it is part of an entity.  For example, inside on* handler 
attributes and style attributes, you have to double-encode or you will be 
vulnerable to XSS attacks.
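
A rough sketch of the on* handler case described above. The input string, the greet() handler name and the markup are hypothetical, and html_entity_decode() is used here only as an approximation of the entity decoding a browser applies to an attribute value before running it as script.

```php
<?php
// Hypothetical untrusted input that already contains an entity for a single quote.
$input = "x&#39;);alert(1);//";

// With double_encode = false the existing entity passes through unchanged.
$skipped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8', false);
// With the default (double_encode = true) its ampersand is encoded again.
$encoded = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// Hypothetical template: the value sits inside a JS string in an onclick attribute.
$unsafeTag = '<a onclick="greet(\'' . $skipped . '\')">hi</a>';
$saferTag  = '<a onclick="greet(\'' . $encoded . '\')">hi</a>';

// Approximate what the script engine receives after the attribute is decoded.
$unsafeJs = "greet('" . html_entity_decode($skipped, ENT_QUOTES, 'UTF-8') . "')";
$saferJs  = "greet('" . html_entity_decode($encoded, ENT_QUOTES, 'UTF-8') . "')";

echo $unsafeJs, "\n"; // greet('x');alert(1);//')
echo $saferJs, "\n";  // greet('x&#39;);alert(1);//')
```

In the first case the decoded quote ends the string literal and alert(1) runs as code; in the second the entity stays inert text inside the string. HTML escaping alone is generally not sufficient for script contexts, so this illustrates the default's rationale rather than a complete defence.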
 