Bug #61228 htmlspecialchars() silently failing
Submitted: 2012-03-01 20:39 UTC Modified: 2012-03-02 00:29 UTC
Votes:2
Avg. Score:4.5 ± 0.5
Reproduced:2 of 2 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: keisial at gmail dot com Assigned:
Status: Wont fix Package: Unknown/Other Function
PHP Version: 5.4.0RC8 OS:
Private report: No CVE-ID: None
 [2012-03-01 20:39 UTC] keisial at gmail dot com
Description:
------------
htmlspecialchars() no longer emits warnings in PHP 5.4.
This is especially worrying because 5.4 changes its default charset
from ISO-8859-1 to UTF-8.
So the same string that passed flawlessly through 5.3 will
now silently produce an empty string in 5.4 (and htmlspecialchars()
is one of the last places you would check!).

In 5.3, the following call can produce a warning:
var_dump( htmlspecialchars("a\237a", ENT_COMPAT, 'UTF-8') );

PHP Warning:  htmlspecialchars(): Invalid multibyte sequence in argument in php shell code on line 1
string(0) ""

whereas in 5.4:
var_dump( htmlspecialchars("a\237a", ENT_COMPAT, 'UTF-8') );
string(0) ""


The explicit 'UTF-8' argument is there to make both versions behave the same;
htmlspecialchars("a\237a") *works* in 5.3 (although the result may not be
in your page encoding).

The reason is clear: the php_error_docref() call in php_escape_html_entities_ex()
is gone in 5.4 and trunk.

I attach a patch against the 5.4 branch that re-adds the warning (it should apply
fine to trunk, five lines further down).



Patches

htmlspecialchars.patch (last revision 2012-03-01 20:40 UTC by keisial at gmail dot com)

History

 [2012-03-01 22:47 UTC] nikic@php.net
The main problem with that error was that it was very inconsistent:

It was only generated when error display was *disabled*. That basically meant that you would never see that error in development, but it would flood your log in production.

This was done for security reasons, in order to protect people who had display_errors=1 on production servers.

Especially since PHP 5.4 provides ENT_SUBSTITUTE, I think that this error doesn't make much sense anymore.

But probably I'm wrong :)
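
For readers unfamiliar with the flag, here is a minimal sketch of what ENT_SUBSTITUTE changes, using the string from the report. The variable names are illustrative only, and the flags are passed explicitly so the result does not depend on a given PHP version's defaults.

```php
<?php
// "\237" is the single byte 0x9F, which is not valid UTF-8 on its own.
$input = "a\237a";

// Without ENT_SUBSTITUTE, invalid input makes the whole result an empty string.
$strict = htmlspecialchars($input, ENT_COMPAT, 'UTF-8');

// With ENT_SUBSTITUTE (available since PHP 5.4), the invalid byte is replaced
// by U+FFFD REPLACEMENT CHARACTER and the rest of the string is kept.
$lenient = htmlspecialchars($input, ENT_COMPAT | ENT_SUBSTITUTE, 'UTF-8');

var_dump($strict);                           // empty string
var_dump($lenient === "a\xEF\xBF\xBDa");     // U+FFFD is EF BF BD in UTF-8
```

Note that substitution hides the bad byte rather than reporting it, so it addresses the empty output but not the missing diagnostic.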
 [2012-03-01 23:37 UTC] cataphract@php.net
-Status: Open
+Status: Wont fix
 [2012-03-01 23:37 UTC] cataphract@php.net
This is intentional. The way PHP "warns" of invalid multibyte sequences is to return an empty string. The "hesitant" warning in 5.3 was not a good idea.
 [2012-03-02 00:29 UTC] keisial at gmail dot com
I agree the hesitant warning was a problem, but I'd rather have a warning in
my logs than have to check the htmlspecialchars() return value (which, in the
end, means creating a wrapper).
I'm not convinced that showing that warning on misconfigured servers was that
big a deal (after all, if an attacker can influence the output by providing
invalid strings, he can just as well see that they are no longer shown there),
but the warning could still be emitted while forcing it not to be sent to the
output buffer.
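
A wrapper of the kind mentioned above might look roughly like the following. This is only a sketch: escape_html() is a hypothetical name, not a PHP built-in, and it assumes that an empty result for non-empty input means the input was invalid in the given charset (which holds as long as ENT_IGNORE is not passed).

```php
<?php
/**
 * Hypothetical wrapper around htmlspecialchars(); not part of PHP.
 * Raises a user-level warning when escaping fails silently.
 */
function escape_html($string, $flags = ENT_COMPAT, $charset = 'UTF-8')
{
    $escaped = htmlspecialchars($string, $flags, $charset);

    // htmlspecialchars() signals invalid input by returning an empty string,
    // so an empty result for non-empty input is treated as a failure here.
    if ($escaped === '' && $string !== '') {
        trigger_error(
            'escape_html(): Invalid multibyte sequence in argument',
            E_USER_WARNING
        );
    }

    return $escaped;
}
```

Because the warning goes through trigger_error(), it follows the usual log_errors and display_errors settings instead of depending on display_errors being off.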

Get user data
Operate with it
Send to the db
Iterate the result set
Fetch the values from the result
Process that data
htmlspecialchars()
echo it

Maybe it's my fault for treating htmlspecialchars() as a function that would
always work, but it made me look everywhere for why it was failing. And the
badly encoded data wasn't even provided by me; the culprit was strftime().

Combined with the charset change, I suspect it will bite a number of developers.
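
One way to guard against this kind of input is to normalise strings to UTF-8 before escaping them. The sketch below rests on assumptions that are not in the report: to_utf8() is a hypothetical helper, the mbstring extension is assumed to be loaded, and non-UTF-8 input is assumed to be ISO-8859-1, which may not match what your locale actually produces.

```php
<?php
// Hypothetical helper, not a PHP built-in. Requires the mbstring extension.
// $fallback is an assumption about the source encoding; adjust it to the
// charset your locale (and therefore strftime()) really uses.
function to_utf8($string, $fallback = 'ISO-8859-1')
{
    if (mb_check_encoding($string, 'UTF-8')) {
        return $string; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($string, 'UTF-8', $fallback);
}

$raw  = "a\237a"; // the invalid string from the report
$safe = htmlspecialchars(to_utf8($raw), ENT_COMPAT, 'UTF-8');

var_dump($safe !== ''); // the escaped string is no longer empty
```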
 