Bug #71876 Memory corruption htmlspecialchars(): charset `*' not supported
Submitted: 2016-03-21 23:24 UTC Modified: 2020-01-03 10:04 UTC
Votes:9
Avg. Score:4.0 ± 1.1
Reproduced:8 of 8 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: the_djmaze at hotmail dot com Assigned: nikic (profile)
Status: Closed Package: Strings related
PHP Version: 7.0.8 OS: Fedora 22
Private report: No CVE-ID: None
 [2016-03-21 23:24 UTC] the_djmaze at hotmail dot com
Description:
------------
Running the following code several times yields memory corruption.

Test script:
---------------
<?php
error_reporting(E_ALL);
ini_set('display_errors',1);
ini_set('log_errors',0);
class Text
{
	function run()
	{
		eval('static::echo_data(\'test\');');
	}
	protected static function echo_data($data)
	{
		echo htmlspecialchars($data, ENT_NOQUOTES);
	}
}
$test = new Text;
$test->run();
?>

Expected result:
----------------
No warning at all; the charset should be taken from ini_get("default_charset").

Actual result:
--------------
Warning: htmlspecialchars(): charset `*' not supported
where `*' is arbitrary, seemingly random data.

History

 [2016-03-21 23:37 UTC] requinix@php.net
-Status: Open +Status: Feedback
 [2016-03-21 23:37 UTC] requinix@php.net
I'm not sure why this is "memory corruption"?

https://3v4l.org/n4Qim

Which Linux? What's your configure command? Any Zend extensions installed? (Try without them.)
 [2016-03-22 09:12 UTC] the_djmaze at hotmail dot com
-Status: Feedback +Status: Open
 [2016-03-22 09:12 UTC] the_djmaze at hotmail dot com
Added a screenshot (with a call of phpinfo() after the code)
https://dragonflycms.org/images/php-bug-htmlspecialchars.png

Fedora 22 with remi repo

$ dnf list installed | grep php
php.x86_64                7.0.4-1.fc22.remi
php-bcmath.x86_64         7.0.4-1.fc22.remi
php-cli.x86_64            7.0.4-1.fc22.remi
php-common.x86_64         7.0.4-1.fc22.remi
php-devel.x86_64          7.0.4-1.fc22.remi
php-gd.x86_64             7.0.4-1.fc22.remi
php-gmp.x86_64            7.0.4-1.fc22.remi
php-imap.x86_64           7.0.4-1.fc22.remi
php-interbase.x86_64      7.0.4-1.fc22.remi
php-json.x86_64           7.0.4-1.fc22.remi
php-mbstring.x86_64       7.0.4-1.fc22.remi
php-mcrypt.x86_64         7.0.4-1.fc22.remi
php-mysqlnd.x86_64        7.0.4-1.fc22.remi
php-pdo.x86_64            7.0.4-1.fc22.remi
php-pear.noarch           1:1.10.1-1.fc22.remi
php-pecl-apcu.x86_64      5.1.3-1.fc22.remi.7.0
php-pecl-apcu-bc.x86_64   1.0.3-1.fc22.remi.7.0
php-pecl-gmagick.x86_64   2.0.2-0.3.RC2.fc22.remi.7.0
php-pecl-igbinary.x86_64  1.2.2-0.1.20151217git2b7c703.fc22.remi.7.0
php-pecl-mailparse.x86_64 3.0.1-1.fc22.remi.7.0
php-pecl-memcache.x86_64  3.0.9-0.2.20151130gitfdbd46b.fc22.remi.7.0
php-pecl-memcached.x86_64 3.0.0-0.1.20160217git6ace07d.fc22.remi.7.0
php-pecl-msgpack.x86_64   2.0.1-1.fc22.remi.7.0
php-pecl-uuid.x86_64      1.0.4-6.fc22.remi.7.0
php-pecl-yaml.x86_64      2.0.0-0.6.RC7.fc22.remi.7.0
php-pgsql.x86_64          7.0.4-1.fc22.remi
php-process.x86_64        7.0.4-1.fc22.remi
php-soap.x86_64           7.0.4-1.fc22.remi
php-tidy.x86_64           7.0.4-1.fc22.remi
php-xml.x86_64            7.0.4-1.fc22.remi

Zend Extension 	320151012
Zend Extension Build 	API320151012,NTS
Zend Signal Handling 	disabled
Zend Memory Manager 	enabled
Zend Multibyte Support 	provided by mbstring
zend.assertions	1
zend.detect_unicode	On
zend.enable_gc	On
zend.multibyte	Off
 [2016-03-22 09:18 UTC] the_djmaze at hotmail dot com
After a restart of Apache the problem is gone and not reproducible.
I understand this makes it very hard to find.
 [2016-03-22 14:39 UTC] the_djmaze at hotmail dot com
A few hours later, the problem is back.
Restart of Apache solved the issue again.
 [2016-03-22 23:34 UTC] the_djmaze at hotmail dot com
-Operating System: Linux +Operating System: Fedora 22
 [2016-03-22 23:34 UTC] the_djmaze at hotmail dot com
Looking in ./ext/standard/html.c, html_entity_decode() also uses get_default_charset()
and suffers from the same problem.

Then looking at get_default_charset() itself:
static char *get_default_charset(void) {
	if (PG(internal_encoding) && PG(internal_encoding)[0]) {
		return PG(internal_encoding);
	} else if (SG(default_charset) && SG(default_charset)[0] ) {
		return SG(default_charset);
	}
	return NULL;
}

In php.ini neither internal_encoding nor default_charset is set.
According to ini_get(), the first is empty and the second is "UTF-8".
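
For readers who do not want to read the C source, the lookup order in get_default_charset() can be modelled in userland PHP. This is only a sketch: resolve_default_charset() is a hypothetical helper name, and it reads the settings through ini_get(), so it cannot reproduce the stale-pointer behaviour reported here. It only shows which setting wins.

```php
<?php
// Userland model of the lookup order in get_default_charset() quoted above.
// resolve_default_charset() is a hypothetical helper, not part of PHP.
function resolve_default_charset()
{
    // 1. internal_encoding wins if it is set and non-empty.
    $internal = ini_get('internal_encoding');
    if (is_string($internal) && $internal !== '') {
        return $internal;
    }
    // 2. Otherwise fall back to default_charset.
    $default = ini_get('default_charset');
    if (is_string($default) && $default !== '') {
        return $default;
    }
    // 3. Nothing configured: mirrors the "return NULL" branch in the C code.
    return null;
}

var_dump(resolve_default_charset());
```

With the configuration described above (internal_encoding empty, default_charset "UTF-8"), the second branch is the one taken.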

Digging deeper, mbstring.c and iconv.c also use the char pointers.

mbstring.c only uses it once:
return _php_mb_ini_mbstring_internal_encoding_set(get_internal_encoding(), strlen(get_internal_encoding())+1);

iconv.c uses it in a lot of places, but I don't have this module installed.
Maybe later I will test whether it is also affected.
 [2016-06-03 03:05 UTC] the_djmaze at hotmail dot com
This also seems to happen on a CentOS server with cPanel EasyApache PHP 5.6.19:

[Sun May 29 17:17:00 2016] [error] PHP Warning:  html_entity_decode(): charset `@\xef\xbf\xbdf\x03' not supported, assuming utf-8 in /wp-content/plugins/dmsguestbook/admin.php on line 3529
[Sun May 29 17:31:58 2016] [error] PHP Warning:  htmlspecialchars(): charset `\x11\x01' not supported, assuming utf-8 in /plugins/system/sef/sef.php on line 49
[Sun May 29 17:32:13 2016] [error] PHP Warning:  htmlspecialchars(): charset `Filter object to use.\n\t *\n\t * @var    JFilterInput\n\t * @since  11.1\n\t */' not supported, assuming utf-8 in /libraries/cms/application/site.php on line 161

So something in the memory management is broken; I haven't figured out what yet.
 [2016-06-03 03:32 UTC] the_djmaze at hotmail dot com
Found some websites with the issue (press F5 in your browser a few times to see it happen):

www.modelcity.cz/cz/?page_id=333&album=1&gallery=32
cms.w3host.hu/opencart/index.php?route=product/product&product_id=46

www.suchanoha.cz/?page_id=974&wppa-album=39&wppa-photo=490&wppa-cover=0&wppa-occur=1&wppa-single=1

ubytovani-ledenice.cz/?page_id=14

www.conceptvision.cz/index.php/8-slideshow/15-sli
 [2016-06-04 20:58 UTC] yohgaki@php.net
-Status: Open +Status: Feedback
 [2016-06-04 20:58 UTC] yohgaki@php.net
What is the default_charset setting? i.e. var_dump(ini_get('default_charset'));
 [2016-06-05 02:31 UTC] yohgaki@php.net
It sounds like a zend_string handling issue, i.e. the string is released but a pointer still points into the zend_string buffer. I've seen this type of bug in the 5.x to 7.0 transition.

I cannot reproduce this. If you can narrow down the specific conditions that reproduce it, I'll look into it.
 [2016-06-07 12:50 UTC] the_djmaze at hotmail dot com
-Status: Feedback +Status: Open
 [2016-06-07 12:50 UTC] the_djmaze at hotmail dot com
The code below is sufficient. PHP runs as an Apache 2 module.
The problem arises when the encoding is set in php.ini and not in the script itself, and the script is then run several times in Apache.

<?php
htmlspecialchars('some text', ENT_NOQUOTES);
phpinfo();
?>
 [2016-06-07 13:06 UTC] the_djmaze at hotmail dot com
-PHP Version: 7.0.4 +PHP Version: 7.0.6
 [2016-06-07 13:06 UTC] the_djmaze at hotmail dot com
Just tested with PHP 7.0.6 and can reproduce it there as well.
When something with higher memory usage runs first (say a CMS page) and the above script is tested afterwards, it shows the error.
Running only the simple test on a freshly started Apache daemon did not show the error.

yohgaki, you got me thinking: since PHP runs as a module, it stays loaded in memory.
So you are probably right that the zend_string gets freed, perhaps because something called ini_set('default_charset', 'UTF-8') (or not?) in 5.6 and 7.
 [2016-07-08 23:26 UTC] the_djmaze at hotmail dot com
Tested with PHP 7.0.8; it is still an issue.

When using
<?php
ini_set('internal_encoding', 'UTF-8');
?>
The issue is completely gone.
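
A minimal sketch of how this workaround might be applied once in a bootstrap file. pin_internal_encoding() is a hypothetical helper name, and the choice to leave an already configured value alone is an assumption. That this avoids the warning rests on the observation above, not on an analysis of the underlying bug.

```php
<?php
// Hypothetical bootstrap helper: set internal_encoding early in the request
// so that the first branch of get_default_charset() is taken.
function pin_internal_encoding($charset = 'UTF-8')
{
    $current = ini_get('internal_encoding');
    if ($current === false || $current === '') {
        // ini_set() returns the old value on success and false on failure.
        return ini_set('internal_encoding', $charset) !== false;
    }
    // Already set explicitly (php.ini, vhost config, ...): leave it alone.
    return true;
}

pin_internal_encoding();
echo htmlspecialchars('<b>some text</b>', ENT_NOQUOTES), "\n";
// prints: &lt;b&gt;some text&lt;/b&gt;
```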
 [2016-07-08 23:39 UTC] the_djmaze at hotmail dot com
-PHP Version: 7.0.6 +PHP Version: 7.0.8
 [2016-07-08 23:39 UTC] the_djmaze at hotmail dot com
Found more using https://www.google.nl/search?q="Warning:+htmlspecialchars():+charset"+"not+supported"
 [2018-03-13 11:21 UTC] php_net at dlk dot pl
The same happens with our CakePHP project.

Sometimes it even shows PHP code in place of the charset,
while ini_get('default_charset') returns UTF-8.

With the whole project the error rate is around 50%.
It is hard to prepare a small test case, because the error disappears or shows up less frequently when I remove code.

Sometimes it also generates an HTTP 500:
php-cgi[14874]: segfault at 28d5588 ip 000000000078db23 sp 00007fff98b4e0e8 error 4 in php-cgi[400000+bdd000]


Examples when it doesn't segfault:
https://pastebin.com/AqkC7p8D

http://proxy.sec3.itdesk.eu/phpbug/bugtest.html < saved example output
 [2018-03-13 11:37 UTC] php_net at dlk dot pl
A temporary workaround is to pass the charset explicitly in the function call:
html_entity_decode($x, null, 'utf-8');

https://pastebin.com/d1Z6631j

Calling ini_set() beforehand does not help either (is ini_set/ini_get broken?):
ini_set('default_charset', 'UTF-8');
$y = html_entity_decode($x);

https://pastebin.com/Dp5Aqw0Q
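
One way to apply the explicit-charset workaround consistently is to route all calls through small wrappers. This is a sketch only: safe_escape(), safe_decode() and APP_CHARSET are hypothetical names, and the flag values are an assumption (the call above passes null for the flags).

```php
<?php
// Hypothetical wrappers that always pass the charset explicitly, so the
// functions never fall back to the ini-derived default charset.
const APP_CHARSET = 'UTF-8';

function safe_escape($text)
{
    // ENT_NOQUOTES matches the flag used in the original test script.
    return htmlspecialchars($text, ENT_NOQUOTES, APP_CHARSET);
}

function safe_decode($text)
{
    // ENT_COMPAT is an assumed choice; adjust to what the caller needs.
    return html_entity_decode($text, ENT_COMPAT, APP_CHARSET);
}

echo safe_escape('a < b & c'), "\n"; // prints: a &lt; b &amp; c
echo safe_decode('&lt;p&gt;'), "\n"; // prints: <p>
```

This only covers code you control; third-party code that omits the charset argument still goes through the default lookup.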
 [2018-12-07 15:25 UTC] irasha at yahoo dot com
I just ran into this issue on a shared hosting account, and wanted to share a workaround and a potential security concern with this bug.

Shared hosting account (PHP version 7.0.32), running a WordPress blog with a few plugins. The error_log started to fill up with 450 MB a day of just these errors.

Modifying php.ini was not an option, as there is one php.ini for all accounts on that server. Changes to the code would be overwritten by WP/plugin updates.
The solution that worked was adding "internal_encoding utf-8" to Apache via include config (this has to be done by a hosting support rep).

I parsed gigabytes of these lines to see all the "charset" values that showed up: IPs, regexes, numeric and text values, table names, paths to files from different accounts(!), and so on. Almost all of it came from other people's accounts. Just by skimming these values I learned of two other websites running on the same server, and knew the login names of their accounts from the paths. It makes me wonder how much of a security risk this bug is on shared hosting.
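
A rough sketch of how such a log could be triaged, assuming the warning format shown in the log excerpts earlier in this report. extract_leaked_charsets() is a hypothetical helper, and the sample lines are illustrative. For multi-gigabyte logs you would feed it lines from a streaming reader rather than an in-memory array.

```php
<?php
// Hypothetical helper: collect the bogus "charset" values from error log
// lines and count how often each one occurs.
function extract_leaked_charsets($lines)
{
    $counts = array();
    // Matches: htmlspecialchars(): charset `<value>' not supported
    $pattern = '/(?:htmlspecialchars|htmlentities|html_entity_decode)\(\): charset `(.*?)\' not supported/';
    foreach ($lines as $line) {
        if (preg_match($pattern, $line, $m)) {
            $value = $m[1];
            $counts[$value] = isset($counts[$value]) ? $counts[$value] + 1 : 1;
        }
    }
    arsort($counts); // most frequent values first
    return $counts;
}

$sample = array(
    "[error] PHP Warning:  htmlspecialchars(): charset `\\x11\\x01' not supported, assuming utf-8 in /plugins/system/sef/sef.php on line 49",
    "[error] PHP Notice:  an unrelated line",
);
print_r(extract_leaked_charsets($sample));
```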
 [2020-01-03 10:04 UTC] nikic@php.net
-Assigned To: +Assigned To: nikic
 [2020-01-03 10:04 UTC] nikic@php.net
Fixed in 7.4.2 with https://github.com/php/php-src/commit/fcdc0a6db0ae63fbed9e3828137b899b844623ce. I didn't notice at the time that this is actually a pre-existing issue, so I will have to backport this change.
 [2020-01-03 10:16 UTC] nikic@php.net
Automatic comment on behalf of nikita.ppv@gmail.com
Revision: http://git.php.net/?p=php-src.git;a=commit;h=1f9e93687c0ceb442ef608b894427ada11ac06fc
Log: Fixed bug #71876
 [2020-01-03 10:16 UTC] nikic@php.net
-Status: Assigned +Status: Closed
 [2020-01-17 08:48 UTC] nikic@php.net
Automatic comment on behalf of nikita.ppv@gmail.com
Revision: http://git.php.net/?p=php-src.git;a=commit;h=018251a7c492916a6fa2c0e9a5e7adaba14bd614
Log: Fixed bug #71876
 
Last updated: Thu Mar 28 14:01:29 2024 UTC