Bug #64010 htmlentities fundamentally broken in 5.4
Submitted: 2013-01-17 12:36 UTC
Modified: 2013-01-17 13:23 UTC
From: spam2 at rhsoft dot net
Assigned:
Status: Not a bug
Package: Scripting Engine problem
PHP Version: 5.4.10
OS: Linux
Private report: No
CVE-ID: None

 [2013-01-17 12:36 UTC] spam2 at rhsoft dot net
Description:
------------
> Like htmlspecialchars(), htmlentities() takes an optional third 
> argument encoding which defines encoding used in conversion. If 
> omitted, the default value for this argument is ISO-8859-1 in 
> versions of PHP prior to 5.4.0, and UTF-8 from PHP 5.4.0 onwards

and you randomly broke applications with this:
without specifying 'ISO-8859-1' we randomly get EMPTY STRINGS back

[harry@rh:/downloads/htmlentities]$ ./test.php 
--------------------------------------------------------------------
strlen($input):
4464
--------------------------------------------------------------------
strlen(htmlentities($input, ENT_QUOTES)):
0
--------------------------------------------------------------------
strlen(htmlentities($input, ENT_QUOTES, 'ISO-8859-1')):
6522


Test script:
---------------
#!/usr/bin/php
<?php
 $input = base64_decode(file_get_contents(__DIR__ . '/70acc70b9c93b6a677825241e8165562_base64.txt'));
 echo '--------------------------------------------------------------------' . "\n";
 echo 'strlen($input):' . "\n";
 echo strlen($input) . "\n";
 echo '--------------------------------------------------------------------' . "\n";
 echo 'strlen(htmlentities($input, ENT_QUOTES)):' . "\n";
 echo strlen(htmlentities($input, ENT_QUOTES)) . "\n";
 echo '--------------------------------------------------------------------' . "\n";
 echo 'strlen(htmlentities($input, ENT_QUOTES, \'ISO-8859-1\')):' . "\n";
 echo strlen(htmlentities($input, ENT_QUOTES, 'ISO-8859-1')) . "\n";
?>

Expected result:
----------------
NON-EMPTY return value
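
The input file is not attached to this report, so the cause cannot be confirmed from the thread itself. Assuming the empty result comes from bytes that are not valid UTF-8 (for example ISO-8859-1 umlauts), a quick check could look like the sketch below. The helper name `is_valid_utf8` is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper: the /u modifier makes PCRE validate the subject
// as UTF-8, so preg_match() returns 1 only for well-formed input.
function is_valid_utf8($bytes)
{
    return preg_match('//u', $bytes) === 1;
}

var_dump(is_valid_utf8("abc"));   // plain ASCII is also valid UTF-8
var_dump(is_valid_utf8("\xFC"));  // a lone ISO-8859-1 "ü" byte is not
```

If the check fails for `$input`, passing the real encoding as the third argument, as the script's second call does, is the likely fix.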


History

 [2013-01-17 12:40 UTC] spam2 at rhsoft dot net
WTF - why can I not submit a simple zip containing the sample input, base64_encoded, in a separate file? Here you only have the
option to attach patches
 [2013-01-17 13:08 UTC] spam2 at rhsoft dot net
and NO, it is not a smart idea to change the complete default behavior
it is bullshit: if your page is ISO-8859-1 and you call htmlentities('üöä'), it is fundamentally broken to return empty strings in a random number of functions
 [2013-01-17 13:23 UTC] rasmus@php.net
If your page is ISO-8859-1 and you are using that as your internal encoding as 
well, then you need to specify that. Otherwise it leads to security issues. And 
since most people don't use ISO-8859-1 anymore, the safer default is to make sure 
we don't output invalid UTF-8 byte sequences when the developer has not specified 
the encoding.
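
The advice above can be sketched as follows. This is a minimal illustration, assuming the input is a single ISO-8859-1 byte for "ü"; the variable names are illustrative. `ENT_SUBSTITUTE` (available since PHP 5.4) is an additional option that this thread does not mention.

```php
<?php
// "ü" as a single ISO-8859-1 byte; on its own this is not valid UTF-8.
$latin1 = "\xFC";

// Encoding stated as UTF-8 (the 5.4 default): the invalid byte sequence
// makes htmlentities() return an empty string.
$asUtf8 = htmlentities($latin1, ENT_QUOTES, 'UTF-8');

// Encoding stated explicitly to match the data: the byte is converted.
$asLatin1 = htmlentities($latin1, ENT_QUOTES, 'ISO-8859-1');

// ENT_SUBSTITUTE replaces invalid sequences with U+FFFD instead of
// discarding the whole string.
$substituted = htmlentities($latin1, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

var_dump($asUtf8 === '');
var_dump($asLatin1);
var_dump($substituted !== '');
```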
 [2013-01-17 13:23 UTC] rasmus@php.net
-Status: Open
+Status: Not a bug
 [2013-01-17 13:33 UTC] spam2 at rhsoft dot net
as long as PHP as a whole is NOT really UTF-8 capable, it is bullshit to assume by default that any input is UTF-8
 [2013-01-17 13:35 UTC] spam2 at rhsoft dot net
and if you guys were smart there would be a php.ini setting to specify the behavior globally and/or per <Directory>, instead of hardcoding incompatible changes that break nearly ANY code written without wrappers
 [2013-01-17 18:41 UTC] spam2 at rhsoft dot net
and last but not least, WTF: why did whoever implemented this bullshit return an empty string WITHOUT AT LEAST THROWING A WARNING? How do you guys imagine that admins/developers who have been running with E_ALL | E_STRICT for years will notice that something is still broken and needs to be fixed?
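
Until the call sites are fixed, a userland wrapper can surface the silent failure described here. The following is a rough sketch and not anything PHP ships: `checked_htmlentities` is a hypothetical name, and the default flags and encoding are assumptions chosen to mirror the test script.

```php
<?php
/**
 * Hypothetical wrapper: raises a warning when htmlentities() yields an
 * empty string for non-empty input (typically because the input bytes
 * are invalid in the given encoding).
 * No scalar type declarations, so it also runs on PHP 5.4.
 */
function checked_htmlentities($input, $flags = ENT_QUOTES, $encoding = 'UTF-8')
{
    $result = htmlentities($input, $flags, $encoding);
    if ($input !== '' && $result === '') {
        trigger_error(
            "htmlentities() returned an empty string; input may not be valid $encoding",
            E_USER_WARNING
        );
    }
    return $result;
}
```

With `E_ALL` reporting enabled, calling `checked_htmlentities($input)` on ISO-8859-1 data would then show up in the error log instead of silently producing empty output.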
 [2013-09-02 18:05 UTC] spam2 at rhsoft dot net
and again a broken backend on a production server running with E_ALL reporting, because the braindead idiot who made this change was not smart enough to throw a *warning* if it returns an empty string while the input was not empty

how stupid can developers act?
 [2013-09-19 14:51 UTC] andrebruce at gmail dot com
Hello,

I found this bug report while searching for "htmlentities broken".

I am seeing some broken applications, like the phppgadmin shipped with Ubuntu, because of this change.

Making it possible to change this globally in php.ini (or using the default_charset from php.ini) would be really interesting.

Thanks.
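
For later readers: according to the PHP manual's changelog for htmlentities(), from PHP 5.6.0 onward the default for the encoding argument is taken from the `default_charset` ini setting, which is close to what is requested here. A minimal sketch, assuming PHP 5.6 or newer and that `internal_encoding` is left unset; on 5.4 and 5.5 an omitted encoding still means UTF-8.

```php
<?php
// Assumes PHP 5.6+; the same value could be set in php.ini instead.
ini_set('default_charset', 'ISO-8859-1');

// "ü" as a single ISO-8859-1 byte, with no encoding argument passed.
$result = htmlentities("\xFC", ENT_QUOTES);

var_dump($result);
```

Note that `default_charset` also affects the charset sent in the default Content-Type header, so changing it globally has effects beyond htmlentities().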
 