Request #66766 htmlspecialchars() request
Submitted: 2014-02-25 00:32 UTC
Modified: 2015-05-18 02:15 UTC
Votes: 2
Avg. Score: 4.0 ± 1.0
Reproduced: 1 of 1 (100.0%)
Same Version: 0 (0.0%)
Same OS: 1 (100.0%)
From: sillywilly at hotmail dot com
Assigned: cmb
Status: Closed
Package: Unknown/Other Function
PHP Version: Irrelevant
OS:
Private report: No
CVE-ID: None
 [2014-02-25 00:32 UTC] sillywilly at hotmail dot com
Description:
------------
htmlspecialchars() and similar functions return an empty string in PHP 5.4 and above when Latin1 characters are passed, whereas previous versions treated Latin1 as the default encoding. This makes upgrading to PHP 5.4 difficult because it breaks backward compatibility, when in fact the function should continue to work normally. See below.

Expected result:
----------------
You should not need to specify your character set for this function as long as ASCII is a subset of it (for example UTF-8 or cp1252), because the search-and-replace behavior is the same for all such character sets. You should only need to specify a character set when ASCII is not a subset of it, so that the search-and-replace behavior can be adjusted. That would be an improvement, and it would keep legacy calls to htmlspecialchars() that expect Latin1 input backward compatible. As it is now, most people are writing their own functions for backward compatibility rather than passing a parameter saying that they still use the Latin1 charset. This is annoying because there is no reason for this function to stop working in the first place.

Actual result:
--------------
This function returns an empty string when Latin1 characters are passed. People upgrading to PHP 5.4 wind up with broken code and are forced to debug.
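
For readers trying to reproduce the report, the following is a minimal sketch rather than an official test case. The sample string and variable names are illustrative assumptions. It passes the encoding explicitly, so the result should not depend on the PHP version or ini settings in use.

```php
<?php
// Hypothetical sample input: "café <b>" encoded as ISO-8859-1.
// The lone byte 0xE9 followed by a space is not a valid UTF-8 sequence.
$latin1 = "caf\xE9 <b>";

// Treated as UTF-8 (the default from PHP 5.4 on), the invalid input
// makes htmlspecialchars() return an empty string.
$asUtf8 = htmlspecialchars($latin1, ENT_QUOTES, 'UTF-8');

// Naming the actual encoding restores the pre-5.4 result.
$asLatin1 = htmlspecialchars($latin1, ENT_QUOTES, 'ISO-8859-1');

var_dump($asUtf8 === '');
var_dump($asLatin1 === "caf\xE9 &lt;b&gt;");
```

Passing the third argument at every call site is the workaround the report describes as inconvenient for legacy code.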

History
 [2014-02-25 03:02 UTC] rasmus@php.net
But Latin1 is not a subset of UTF-8. You are getting an empty string returned because the string contains a Latin1 character that is invalid in UTF-8. Most web sites these days work in UTF-8, which means that if you are blindly filtering using Latin1 in htmlspecialchars() and outputting UTF-8, as most sites do these days, you have a potential security hole. That was the reason for this change.
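
As an illustration of the empty-string behavior described above, here is a hedged sketch using the ENT_SUBSTITUTE flag, which was added in PHP 5.4. The flag is not mentioned in this thread, so treat it as one possible way to surface invalid input instead of silently losing the whole string. The sample input is an assumption.

```php
<?php
// Hypothetical input: a Latin1 byte (0xE9) inside a string that is
// being escaped as UTF-8.
$input = "caf\xE9 <b>";

// Without a substitution flag, the invalid sequence empties the result.
$strict = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// With ENT_SUBSTITUTE, the invalid byte is replaced by U+FFFD
// (bytes EF BF BD) and the rest of the string is still escaped.
$lenient = htmlspecialchars($input, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

var_dump($strict === '');
var_dump(strpos($lenient, "\xEF\xBF\xBD") !== false);
var_dump(strpos($lenient, '&lt;b&gt;') !== false);
```

This does not recover the original character. It only makes the mismatch visible in the output, which can help when tracking down which inputs are not UTF-8.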
 [2014-02-25 05:32 UTC] yohgaki@php.net
You may want to wait for PHP 5.6. Once you set "default_charset", it will be used everywhere as the default, and you will not have the issues you described.

I think this request may be closed.
Please test PHP 5.6 alpha and report any issues you find.
 [2015-05-18 02:15 UTC] cmb@php.net
-Status: Open
+Status: Closed
-Assigned To:
+Assigned To: cmb
 [2015-05-18 02:15 UTC] cmb@php.net
Indeed, the ini setting default_charset is the default of the $encoding parameter of htmlspecialchars() as of PHP 5.6.0. It's highly unlikely that anything will be changed in this regard for lower versions, so I'm closing this ticket.
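
A minimal sketch of the PHP 5.6+ behavior described above, assuming the internal_encoding ini setting is left empty (otherwise it may take precedence over default_charset). The sample string is illustrative, and in a real deployment the setting would more likely go in php.ini than in a runtime ini_set() call.

```php
<?php
// Assumption: PHP 5.6 or later, with internal_encoding not set.
// Declare once that the application works in Latin1.
ini_set('default_charset', 'ISO-8859-1');

// Hypothetical legacy call with no $encoding argument: it now picks
// up default_charset instead of assuming UTF-8.
$escaped = htmlspecialchars("caf\xE9 <b>");

var_dump($escaped === "caf\xE9 &lt;b&gt;");
```

With this in place, legacy calls that omit the encoding argument should not need to be edited one by one.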