Bug #71876 Memory corruption htmlspecialchars(): charset `*' not supported
Submitted: 2016-03-21 23:24 UTC Modified: 2020-01-03 10:04 UTC
Avg. Score:4.0 ± 1.1
Reproduced:8 of 8 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: the_djmaze at hotmail dot com Assigned: nikic (profile)
Status: Closed Package: Strings related
PHP Version: 7.0.8 OS: Fedora 22
Private report: No CVE-ID: None

 [2016-03-21 23:24 UTC] the_djmaze at hotmail dot com
Running the following code several times yields memory corruption.

Test script:
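class Text {
	// The body of run() was lost in the report; calling echo_data() here is an assumption.
	function run() { static::echo_data('some text'); }
	protected static function echo_data($data) {
		echo htmlspecialchars($data, ENT_NOQUOTES);
	}
}
$test = new Text;
$test->run();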

Expected result:
No warning at all by using ini_get("default_charset")

Actual result:
Warning: htmlspecialchars(): charset `*' not supported
Where '*' is anything random.


 [2016-03-21 23:37 UTC]
-Status: Open +Status: Feedback
 [2016-03-21 23:37 UTC]
I'm not sure why this is "memory corruption"?

Which Linux? What's your configure command? Any Zend extensions installed? (Try without them.)
 [2016-03-22 09:12 UTC] the_djmaze at hotmail dot com
-Status: Feedback +Status: Open
 [2016-03-22 09:12 UTC] the_djmaze at hotmail dot com
Added a screenshot (with a call of phpinfo() after the code)

Fedora 22 with remi repo

$ dnf list installed | grep php
php.x86_64                7.0.4-1.fc22.remi
php-bcmath.x86_64         7.0.4-1.fc22.remi
php-cli.x86_64            7.0.4-1.fc22.remi
php-common.x86_64         7.0.4-1.fc22.remi
php-devel.x86_64          7.0.4-1.fc22.remi
php-gd.x86_64             7.0.4-1.fc22.remi
php-gmp.x86_64            7.0.4-1.fc22.remi
php-imap.x86_64           7.0.4-1.fc22.remi
php-interbase.x86_64      7.0.4-1.fc22.remi
php-json.x86_64           7.0.4-1.fc22.remi
php-mbstring.x86_64       7.0.4-1.fc22.remi
php-mcrypt.x86_64         7.0.4-1.fc22.remi
php-mysqlnd.x86_64        7.0.4-1.fc22.remi
php-pdo.x86_64            7.0.4-1.fc22.remi
php-pear.noarch           1:1.10.1-1.fc22.remi
php-pecl-apcu.x86_64      5.1.3-1.fc22.remi.7.0
php-pecl-apcu-bc.x86_64   1.0.3-1.fc22.remi.7.0
php-pecl-gmagick.x86_64   2.0.2-0.3.RC2.fc22.remi.7.0
php-pecl-igbinary.x86_64  1.2.2-0.1.20151217git2b7c703.fc22.remi.7.0
php-pecl-mailparse.x86_64 3.0.1-1.fc22.remi.7.0
php-pecl-memcache.x86_64  3.0.9-0.2.20151130gitfdbd46b.fc22.remi.7.0
php-pecl-memcached.x86_64 3.0.0-0.1.20160217git6ace07d.fc22.remi.7.0
php-pecl-msgpack.x86_64   2.0.1-1.fc22.remi.7.0
php-pecl-uuid.x86_64      1.0.4-6.fc22.remi.7.0
php-pecl-yaml.x86_64      2.0.0-0.6.RC7.fc22.remi.7.0
php-pgsql.x86_64          7.0.4-1.fc22.remi
php-process.x86_64        7.0.4-1.fc22.remi
php-soap.x86_64           7.0.4-1.fc22.remi
php-tidy.x86_64           7.0.4-1.fc22.remi
php-xml.x86_64            7.0.4-1.fc22.remi

Zend Extension 	320151012
Zend Extension Build 	API320151012,NTS
Zend Signal Handling 	disabled
Zend Memory Manager 	enabled
Zend Multibyte Support 	provided by mbstring
zend.assertions	1
zend.detect_unicode	On
zend.enable_gc	On
zend.multibyte	Off
 [2016-03-22 09:18 UTC] the_djmaze at hotmail dot com
After a restart of Apache the problem is gone and not reproducible.
I understand this makes it very hard to find.
 [2016-03-22 14:39 UTC] the_djmaze at hotmail dot com
Few hours later, and the problem is back.
Restart of Apache solved the issue again.
 [2016-03-22 23:34 UTC] the_djmaze at hotmail dot com
-Operating System: Linux +Operating System: Fedora 22
 [2016-03-22 23:34 UTC] the_djmaze at hotmail dot com
Looking in ./ext/standard/html.c html_entity_decode() uses get_default_charset()
And also suffers from this problem.

Then looking at
static char *get_default_charset(void) {
	if (PG(internal_encoding) && PG(internal_encoding)[0]) {
		return PG(internal_encoding);
	} else if (SG(default_charset) && SG(default_charset)[0] ) {
		return SG(default_charset);
	}
	return NULL;
}

In php.ini internal_encoding is not set nor is default_charset.
Using ini_get() the first is empty and the latter says "UTF-8"
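
For reference, the same fallback order can be checked from userland. This is a minimal sketch, not the engine's code: resolve_default_charset() is a hypothetical helper that only reads the ini values, so it will keep reporting "UTF-8" even when the C-level pointer has gone stale.

```php
<?php
// Hypothetical userland mirror of the fallback order in get_default_charset().
// It reads ini values only and never touches the engine's internal char pointers.
function resolve_default_charset()
{
    $internal = ini_get('internal_encoding');
    if (is_string($internal) && $internal !== '') {
        return $internal;
    }
    $default = ini_get('default_charset');
    if (is_string($default) && $default !== '') {
        return $default;
    }
    return null;
}

// Dump both settings and the value the fallback would pick.
var_dump(ini_get('internal_encoding'), ini_get('default_charset'), resolve_default_charset());
```

If this prints a sane charset while htmlspecialchars() still warns about a random one, that points at the C side rather than at the configuration.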

Digging deeper, mbstring.c and iconv.c also use the char pointers.

mbstring.c only uses it once:
return _php_mb_ini_mbstring_internal_encoding_set(get_internal_encoding(), strlen(get_internal_encoding())+1);

iconv.c uses it in a lot of places, but I don't have this module installed.
Maybe later I will test whether it is also affected.
 [2016-06-03 03:05 UTC] the_djmaze at hotmail dot com
Also seems to happen on a CentOS server with cPanel EasyApache PHP 5.6.19

[Sun May 29 17:17:00 2016] [error] PHP Warning:  html_entity_decode(): charset `@\xef\xbf\xbdf\x03' not supported, assuming utf-8 in /wp-content/plugins/dmsguestbook/admin.php on line 3529
[Sun May 29 17:31:58 2016] [error] PHP Warning:  htmlspecialchars(): charset `\x11\x01' not supported, assuming utf-8 in /plugins/system/sef/sef.php on line 49
[Sun May 29 17:32:13 2016] [error] PHP Warning:  htmlspecialchars(): charset `Filter object to use.\n\t *\n\t * @var    JFilterInput\n\t * @since  11.1\n\t */' not supported, assuming utf-8 in /libraries/cms/application/site.php on line 161

So something in the memory management is broken; I haven't figured out what yet.
 [2016-06-03 03:32 UTC] the_djmaze at hotmail dot com
Found some websites with the issue: (hit your browser F5 a few times to see it happen)
 [2016-06-04 20:58 UTC]
-Status: Open +Status: Feedback
 [2016-06-04 20:58 UTC]
What is the default_charset setting? i.e. var_dump(ini_get('default_charset'));
 [2016-06-05 02:31 UTC]
This sounds like a zend_string handling issue, i.e. the string is released but a pointer still points into the freed zend_string buffer. I've seen this type of bug in the 5.x to 7.0 transition.

I cannot reproduce this, if you could narrow down specific condition reproducing this, I'll look into it.
 [2016-06-07 12:50 UTC] the_djmaze at hotmail dot com
-Status: Feedback +Status: Open
 [2016-06-07 12:50 UTC] the_djmaze at hotmail dot com
The below code is sufficient. PHP runs as Apache 2 module.
The problem arises when the encoding is set in php.ini and not in the script itself, and the script is then run several times in Apache.

htmlspecialchars('some text', ENT_NOQUOTES);
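
Because the warning only shows up intermittently, it can help to capture it programmatically while hammering the page. A rough sketch, assuming a hypothetical helper collect_charset_warnings(); the pattern is written loosely so it should also match the reworded message in newer PHP versions:

```php
<?php
// Hypothetical helper: run the one-line repro and return any
// "charset ... not supported" warnings it triggered.
function collect_charset_warnings($text)
{
    $seen = [];
    set_error_handler(function ($errno, $errstr) use (&$seen) {
        if (preg_match('/charset .+ not supported/i', $errstr)) {
            $seen[] = $errstr;
            return true;  // handled here, keep it out of the page output
        }
        return false;     // let PHP handle unrelated warnings as usual
    }, E_WARNING);
    try {
        htmlspecialchars($text, ENT_NOQUOTES);
    } finally {
        restore_error_handler();
    }
    return $seen;
}

// Log every hit so corrupted charset values can be compared across requests.
foreach (collect_charset_warnings('some text') as $warning) {
    error_log('charset corruption seen: ' . $warning);
}
```

On a healthy process the returned array is empty; under the conditions described above it should contain the garbage charset string.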
 [2016-06-07 13:06 UTC] the_djmaze at hotmail dot com
-PHP Version: 7.0.4 +PHP Version: 7.0.6
 [2016-06-07 13:06 UTC] the_djmaze at hotmail dot com
Just tested with PHP 7.0.6 and can reproduce it there as well.
When some bigger memory usage occurs (say a CMS page), and then the above script is tested, it shows the error.
Just using the simple test on a fresh Apache daemon start it didn't show the error.

yohgaki you got me thinking, since PHP runs as a module, it stays loaded in memory.
So you are probably right that the zend_string gets freed because something did an ini_set('default_charset', 'UTF-8') (or not?!?) in 5.6 and 7?
 [2016-07-08 23:26 UTC] the_djmaze at hotmail dot com
Tested with PHP 7.0.8 still an issue.

When using
ini_set('internal_encoding', 'UTF-8');
The issue is completely gone.
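
As a sketch of how that workaround could be applied once per request (for example from a bootstrap or auto_prepend_file script): pin_internal_encoding() is a hypothetical name, and the charset is an assumption that should match whatever the site actually uses.

```php
<?php
// Hypothetical bootstrap helper: set internal_encoding explicitly for this request
// so the charset lookup no longer falls through to default_charset.
function pin_internal_encoding($charset = 'UTF-8')
{
    // ini_set() returns the old value on success and false on failure.
    return ini_set('internal_encoding', $charset) !== false;
}

if (!pin_internal_encoding()) {
    error_log('could not set internal_encoding');
}
```

This only sidesteps the symptom for scripts that include it; it does not fix the underlying bug.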
 [2016-07-08 23:39 UTC] the_djmaze at hotmail dot com
-PHP Version: 7.0.6 +PHP Version: 7.0.8
 [2016-07-08 23:39 UTC] the_djmaze at hotmail dot com
Found more using "Warning:+htmlspecialchars():+charset"+"not+supported"
 [2018-03-13 11:21 UTC] php_net at dlk dot pl
The same happens with our CakePHP project.

Sometimes it even shows PHP code in place of the charset,
while ini_get('default_charset') returns UTF-8.

With the whole project the error rate is around 50%.
It is hard to prepare a small test case because the error disappears or shows up less frequently when I remove stuff.

Sometimes also generating HTTP 500:
php-cgi[14874]: segfault at 28d5588 ip 000000000078db23 sp 00007fff98b4e0e8 error 4 in php-cgi[400000+bdd000]

Examples when it doesn't segfault: < saved example output
 [2018-03-13 11:37 UTC] php_net at dlk dot pl
Temporary workaround is to put charset into function call:
html_entity_decode($x, null, 'utf-8');

Also doing ini_set before doesn't work (ini_set/get broken?):
ini_set('default_charset', 'UTF-8');
$y = html_entity_decode($x);
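
A small sketch of that workaround wrapped into helpers, so every call site passes the charset explicitly and never relies on the default lookup. The names APP_CHARSET, escape_html() and decode_html() are hypothetical, and the flags are spelled out instead of passing null:

```php
<?php
// Hypothetical application-wide charset; adjust to the site's real encoding.
const APP_CHARSET = 'UTF-8';

// Escape with an explicit charset instead of relying on the ini default.
function escape_html($text)
{
    return htmlspecialchars($text, ENT_NOQUOTES, APP_CHARSET);
}

// Decode with explicit flags and charset; ENT_COMPAT | ENT_HTML401 was the
// default flag set in the PHP versions discussed in this report.
function decode_html($text)
{
    return html_entity_decode($text, ENT_COMPAT | ENT_HTML401, APP_CHARSET);
}
```

Usage is then escape_html($x) and decode_html($x) in place of the bare calls.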
 [2018-12-07 15:25 UTC] irasha at yahoo dot com
I just ran into this issue on a shared hosting account, and wanted to share the workaround and a potential security concern with this bug.

Shared hosting account (PHP version 7.0.32), running a WordPress blog with a few plugins. error_log started to fill up at 450 MB a day with just these errors.

Modifying php.ini was not an option, as there's one for all accounts on that server. Changes to the code would be overwritten with WP/plugins updates.
The solution that worked was adding "internal_encoding utf-8" to Apache via include config (has to be done by hosting support rep).

I parsed gigabytes of these lines to see all the "charset" values I got there, and there were IPs, regexes, some numeric and text values, table names, paths to files from different accounts(!), etc. Almost all of this was from other people's accounts. I learned of two other websites that run on the same server just by skimming through these values, and knew the login names to their accounts from the paths. Makes me wonder how much of a security risk this bug can be on shared hosting.
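
For anyone who wants to audit their own logs the same way, here is a rough sketch of the extraction step. count_leaked_charsets() is a hypothetical helper, and the pattern assumes the PHP 5/7 message format shown earlier in this report (backtick before the value, apostrophe after it).

```php
<?php
// Hypothetical helper: tally the bogus "charset" values found in error_log lines.
function count_leaked_charsets($lines)
{
    $counts = [];
    foreach ($lines as $line) {
        // Non-greedy capture between "charset `" and "' not supported".
        if (preg_match('/charset `(.*?)\' not supported/s', $line, $m)) {
            $value = $m[1];
            $counts[$value] = isset($counts[$value]) ? $counts[$value] + 1 : 1;
        }
    }
    arsort($counts); // most frequent values first
    return $counts;
}

// Example: feed it a log file line by line (the path is a placeholder).
// print_r(count_leaked_charsets(new SplFileObject('/path/to/error_log')));
```

Anything in the result that looks like a path, query or credential from another account is worth reporting to the hosting provider.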
 [2020-01-03 10:04 UTC]
-Assigned To: +Assigned To: nikic
 [2020-01-03 10:04 UTC]
Fixed in 7.4.2 with I didn't notice at the time that this is actually a pre-existing issue, will have to backport this change.
 [2020-01-03 10:16 UTC]
Automatic comment on behalf of
Log: Fixed bug #71876
 [2020-01-03 10:16 UTC]
-Status: Assigned +Status: Closed
 [2020-01-17 08:48 UTC]
Automatic comment on behalf of
Log: Fixed bug #71876
Last updated: Thu Jun 20 23:01:29 2024 UTC