Bug #64010 htmlentities fundamentally broken in 5.4
Submitted: 2013-01-17 12:36 UTC
Modified: 2013-01-17 13:23 UTC
From: spam2 at rhsoft dot net
Assigned:
Status: Not a bug
Package: Scripting Engine problem
PHP Version: 5.4.10
OS: Linux
Private report: No
CVE-ID: None
 [2013-01-17 12:36 UTC] spam2 at rhsoft dot net
Description:
------------
> Like htmlspecialchars(), htmlentities() takes an optional third 
> argument encoding which defines encoding used in conversion. If 
> omitted, the default value for this argument is ISO-8859-1 in 
> versions of PHP prior to 5.4.0, and UTF-8 from PHP 5.4.0 onwards

and you randomly broke applications with this:
without specifying 'ISO-8859-1' we randomly get EMPTY STRINGS back

[harry@rh:/downloads/htmlentities]$ ./test.php 
--------------------------------------------------------------------
strlen($input):
4464
--------------------------------------------------------------------
strlen(htmlentities($input, ENT_QUOTES)):
0
--------------------------------------------------------------------
strlen(htmlentities($input, ENT_QUOTES, 'ISO-8859-1')):
6522


Test script:
---------------
#!/usr/bin/php
<?php
 $input = base64_decode(file_get_contents(__DIR__ . '/70acc70b9c93b6a677825241e8165562_base64.txt'));
 echo '--------------------------------------------------------------------' . "\n";
 echo 'strlen($input):' . "\n";
 echo strlen($input) . "\n";
 echo '--------------------------------------------------------------------' . "\n";
 echo 'strlen(htmlentities($input, ENT_QUOTES)):' . "\n";
 echo strlen(htmlentities($input, ENT_QUOTES)) . "\n";
 echo '--------------------------------------------------------------------' . "\n";
 echo 'strlen(htmlentities($input, ENT_QUOTES, \'ISO-8859-1\')):' . "\n";
 echo strlen(htmlentities($input, ENT_QUOTES, 'ISO-8859-1')) . "\n";
?>

Expected result:
----------------
NON-EMPTY return value


History

 [2013-01-17 12:40 UTC] spam2 at rhsoft dot net
WTF - why can I not submit a simple zip containing the sample input base64_encoded in a separate file? Here you only have the
option to attach patches
 [2013-01-17 13:08 UTC] spam2 at rhsoft dot net
and NO, it is not a smart idea to change the complete default behavior
it is bullshit: if your page is ISO-8859-1 and you do htmlentities('üöä'), it is fundamentally broken to return empty strings in a random number of functions
 [2013-01-17 13:23 UTC] rasmus@php.net
If your page is ISO-8859-1 and you are using that as your internal encoding as 
well, then you need to specify that. Otherwise it leads to security issues. And 
since most people don't use ISO-8859-1 anymore, the safer default is to make sure 
we don't output invalid UTF-8 byte sequences when the developer has not specified 
the encoding.
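
The advice to specify the encoding can be illustrated with a minimal sketch. The byte string below is an assumption standing in for ISO-8859-1 page content; it is not the reporter's input file. It shows that the same bytes escape correctly when the encoding is named, and come back empty when they are treated as UTF-8 without an error-handling flag.

```php
<?php
// "\xFC\xF6\xE4" is "üöä" encoded as ISO-8859-1 (one byte per character).
// Illustrative input only, not the data from the report.
$latin1 = "\xFC\xF6\xE4";

// Naming the encoding makes the call independent of the version's default.
$escaped = htmlentities($latin1, ENT_QUOTES, 'ISO-8859-1');
echo $escaped, "\n";

// The same bytes are not a valid UTF-8 sequence, so with ENT_QUOTES alone
// the function gives up and returns an empty string.
$asUtf8 = htmlentities($latin1, ENT_QUOTES, 'UTF-8');
var_dump($asUtf8 === '');
```

Passing the third argument on every call is verbose, which is why applications often route escaping through a single wrapper function.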
 [2013-01-17 13:23 UTC] rasmus@php.net
-Status: Open +Status: Not a bug
 [2013-01-17 13:33 UTC] spam2 at rhsoft dot net
as long as PHP as a whole is NOT really UTF-8 capable, it is bullshit to assume by default that any input is UTF-8
 [2013-01-17 13:35 UTC] spam2 at rhsoft dot net
and if you guys were smart, there would be a php.ini setting to specify the behavior globally and/or per <Directory> instead of hardcoding incompatible changes that break nearly ANY code written without wrappers
 [2013-01-17 18:41 UTC] spam2 at rhsoft dot net
and last but not least, WTF did whoever implemented this bullshit return an empty string WITHOUT THROWING A WARNING AT LEAST - how do you guys imagine that admins/developers who have been running E_ALL | E_STRICT for years are supposed to notice that something is still broken and needs to be fixed?
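
Since the empty return value arrives without any diagnostic, one way to make it visible is a small wrapper. This is only a sketch: `escape_or_fail` is a hypothetical helper name, not a PHP function, and the choice of exception is an assumption. The last lines show the ENT_SUBSTITUTE flag (available since PHP 5.4), which substitutes a replacement character for invalid sequences instead of returning an empty string.

```php
<?php
/**
 * Hypothetical helper (not part of PHP): escape $text and fail loudly
 * when htmlentities() returns '' for non-empty input.
 *
 * @param string $text     Raw text to escape.
 * @param string $encoding Encoding the caller believes $text is in.
 * @return string          Escaped text.
 */
function escape_or_fail($text, $encoding = 'UTF-8')
{
    $escaped = htmlentities($text, ENT_QUOTES, $encoding);
    if ($escaped === '' && $text !== '') {
        // The input could not be decoded as $encoding.
        throw new InvalidArgumentException("Input is not valid $encoding");
    }
    return $escaped;
}

// Alternative: keep going and mark the damage instead of failing.
// ENT_SUBSTITUTE replaces each invalid sequence with U+FFFD.
$lossy = htmlentities("a\xFCb", ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
var_dump($lossy !== '');
```

Whether to throw or to substitute depends on the application: throwing surfaces misdeclared encodings early, while substitution keeps pages rendering at the cost of visibly damaged characters.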
 [2013-09-02 18:05 UTC] spam2 at rhsoft dot net
and again a broken backend on a production server running with E_ALL reporting, because the braindead idiot who made this change was not smart enough to throw a *warning* if it returns an empty string while the input was not empty

how stupid can developers act?
 [2013-09-19 14:51 UTC] andrebruce at gmail dot com
Hello,

I found this bug report while searching for "htmlentities broken".

I am seeing some broken applications like phppgadmin shipped with Ubuntu because of this change.

Making it possible to change it globally on php.ini (or using the default_charset from php.ini) would be really interesting.

Thanks.
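
For readers on later releases: as of PHP 5.6 the default for the encoding argument follows the default_charset configuration option, which is the behavior requested here. A minimal sketch, assuming PHP 5.6 or later and that the internal_encoding setting is left empty (the input bytes are illustrative, not taken from the report):

```php
<?php
// Assumption: PHP >= 5.6, where htmlentities() falls back to default_charset
// when no encoding argument is given. On 5.4/5.5 this setting has no effect
// on htmlentities().
ini_set('default_charset', 'ISO-8859-1');

// "\xFC\xF6\xE4" is "üöä" in ISO-8859-1; no third argument is passed.
$viaIni = htmlentities("\xFC\xF6\xE4", ENT_QUOTES);
echo $viaIni, "\n";
```

The same value can be set in php.ini or per directory in the web server configuration, so legacy code that omits the third argument does not need to be edited call by call.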
 