Request #52213 htmlspecialchars() encodes &amp; and &mdash; in a wrong way
Submitted: 2010-06-30 17:54 UTC
Modified: 2010-06-30 18:53 UTC
From: tomas at matfyz dot cz
Assigned:
Status: Not a bug
Package: *General Issues
PHP Version: 5.2.13
OS: Linux niobe 2.6.25-gentoo-r8 #1
Private report: No
CVE-ID: None
 [2010-06-30 17:54 UTC] tomas at matfyz dot cz
Description:
------------
The function htmlspecialchars() encodes the & character even if it is part of an HTML entity such as &amp; or &mdash;.

A workaround is also difficult because the function offers no way to disable replacing the & symbol (I believe it should).

PHP version 



Test script:
---------------
echo htmlspecialchars("&amp;");
echo htmlspecialchars("&mdash;");

Expected result:
----------------
&amp;
&mdash;

Actual result:
--------------
&amp;amp;
&amp;mdash;

 [2010-06-30 17:57 UTC] rasmus@php.net
-Status: Open +Status: Bogus
 [2010-06-30 17:57 UTC] rasmus@php.net
That's what the double-encode parameter is for.  Set it to false and it won't 
double-encode.
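
For illustration, a minimal sketch of the parameter mentioned above. It assumes a PHP version where htmlspecialchars() accepts a fourth double_encode argument (5.2.3 or later) and the default HTML 4.01 entity table. The sample strings, flags, and variable names are not from the report.

```php
<?php
// Input that already contains HTML entities.
$text = '&amp; &mdash;';

// Default behaviour (double_encode = true): every & is encoded again.
$encodedTwice = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// double_encode = false: existing valid entities are left untouched.
$encodedOnce = htmlspecialchars($text, ENT_QUOTES, 'UTF-8', false);

// A bare & that is not part of an entity is still encoded.
$mixed = htmlspecialchars('Tom & Jerry <b>', ENT_QUOTES, 'UTF-8', false);

echo $encodedTwice, "\n"; // &amp;amp; &amp;mdash;
echo $encodedOnce, "\n";  // &amp; &mdash;
echo $mixed, "\n";        // Tom &amp; Jerry &lt;b&gt;
```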
 [2010-06-30 18:22 UTC] tomas at matfyz dot cz
So why is it not the default? This is a problem with many PHP functions: the expected behaviour is not the default one (it is counterintuitive).

Or, if not a default value, at least there should be a red box warning in the documentation!
 [2010-06-30 18:23 UTC] tomas at matfyz dot cz
-Type: Bug +Type: Feature/Change Request
 [2010-06-30 18:23 UTC] tomas at matfyz dot cz
changing to feature request for the documentation
 [2010-06-30 18:53 UTC] rasmus@php.net
Because we want to default to the safest case.  It is not always safe to skip 
encoding a & even if it is part of an entity.  For example, inside on* handler 
attributes and style attributes, you have to double-encode or you will be 
vulnerable to XSS attacks.
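
A rough sketch of the risk described above, under stated assumptions: the attacker-controlled string and variable names are hypothetical, and html_entity_decode() stands in for the decoding a browser applies to an attribute value before handing it to the script engine. It is meant to show why leaving an existing entity untouched can be unsafe in an on* attribute, not to serve as a complete escaping strategy for JavaScript contexts.

```php
<?php
// Hypothetical attacker input: the single quote is smuggled in as an entity.
$input = "x&#39;);alert(1);//";

// double_encode = false keeps the entity as-is.
$skipped = htmlspecialchars($input, ENT_QUOTES, 'UTF-8', false);

// Default double encoding turns the & of the entity into &amp;.
$doubled = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// Approximate what the script engine receives once the attribute
// value (e.g. onclick="doSomething('...')") has been entity-decoded.
$seenByScriptUnsafe = html_entity_decode($skipped, ENT_QUOTES, 'UTF-8');
$seenByScriptSafe   = html_entity_decode($doubled, ENT_QUOTES, 'UTF-8');

echo $seenByScriptUnsafe, "\n"; // x');alert(1);//   (quote closes the JS string)
echo $seenByScriptSafe, "\n";   // x&#39;);alert(1);// (stays literal text)
```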
 