Bug #72992 mbstring.internal_encoding doesn't inherit default_charset
Submitted: 2016-09-01 12:00 UTC
Modified: 2016-09-12 07:38 UTC
From: zoeslam at gmail dot com
Assigned: yohgaki
Status: Closed
Package: mbstring related
PHP Version: 7.0.10
OS: Ubuntu 16.04
Private report: No
CVE-ID: (not set)

 [2016-09-01 12:00 UTC] zoeslam at gmail dot com
Description:
------------
Likely related to https://bugs.php.net/bug.php?id=70035 but internal_encoding is now deprecated

php -d default_charset='ISO-8859-1' -i | grep -E '(charset|encoding)'

default_charset => ISO-8859-1 => ISO-8859-1
input_encoding => no value => no value
internal_encoding => no value => no value
output_encoding => no value => no value
zend.script_encoding => no value => no value
iconv.input_encoding => no value => no value
iconv.internal_encoding => no value => no value
iconv.output_encoding => no value => no value
HTTP input encoding translation => disabled
mbstring.encoding_translation => Off => Off
mbstring.internal_encoding => no value => no value

Test script:
---------------
php -d default_charset='ISO-8859-1' -r 'var_dump(mb_internal_encoding());'
php -r 'ini_set("default_charset", "ISO-8859-1"); var_dump(mb_internal_encoding());'

Expected result:
----------------
string(10) "ISO-8859-1"
string(10) "ISO-8859-1"

Actual result:
--------------
string(5) "UTF-8"
string(5) "UTF-8"

Patches

bug72992.patch (last revision 2016-09-08 04:12 UTC by yohgaki@php.net)

History

 [2016-09-06 08:55 UTC] yohgaki@php.net
-Assigned To: +Assigned To: yohgaki
 [2016-09-08 04:12 UTC] yohgaki@php.net
The following patch has been added/updated:

Patch Name: bug72992.patch
Revision:   1473307940
URL:        https://bugs.php.net/patch-display.php?bug=72992&patch=bug72992.patch&revision=1473307940
 [2016-09-08 04:14 UTC] yohgaki@php.net
The INI value could be propagated, but that is not necessary. When a higher-precedence INI is set and the lower-precedence INI is not, the higher-precedence value is used.

I'll commit the fix for the logic error only.
 [2016-09-08 04:57 UTC] yohgaki@php.net
Automatic comment on behalf of yohgaki
Revision: http://git.php.net/?p=php-src.git;a=commit;h=8bbd0952e5bba88426bac1596dcc3bfa504dbe4e
Log: Fix Bug #72992 mbstring.internal_encoding doesn't inherit default_charset
 [2016-09-08 04:57 UTC] yohgaki@php.net
-Status: Assigned +Status: Closed
 [2016-09-09 11:37 UTC] zoeslam at gmail dot com
Hi, I recompiled from source with your commit, but nothing changed.

Could you explain your comment about INI settings in more detail?

Reading the manual at http://php.net/manual/en/mbstring.configuration.php#ini.mbstring.internal-encoding I assume that, when every setting except "default_charset" is left empty, mbstring will use the "default_charset" value as the internal encoding. Is that right?

If so, the bug is still present.
 [2016-09-10 01:57 UTC] yohgaki@php.net
So you would like to have INI propagation, as in the attached patch?
 [2016-09-10 02:01 UTC] yohgaki@php.net
-Status: Closed +Status: Re-Opened
 [2016-09-10 02:01 UTC] yohgaki@php.net
I'll commit the INI propagation part of the attached patch, too. Then you'll see the modified INI values.
 [2016-09-12 01:10 UTC] yohgaki@php.net
-Status: Re-Opened +Status: Closed
 [2016-09-12 01:10 UTC] yohgaki@php.net
It seems the current INI system cannot handle INI propagation at startup well. I don't want to add ugly hacks, so I'll leave it as it is now.

Mbstring's INI value is not significant when a higher-precedence INI is set.
 [2016-09-12 01:53 UTC] yohgaki@php.net
Mbstring's INI value is not significant when a higher-precedence INI is set and the mbstring INI is not set.
 [2016-09-12 06:41 UTC] zoeslam at gmail dot com
I am sorry, I still don't understand what you mean, but that is not important; I rely on your analysis.

If:
1) "mbstring.internal_encoding" ini setting is deprecated
2) "default_charset" doesn't propagate to mb_internal_encoding()
3) and there is no fix for that

then at least the documentation should be updated to state that the only way to change mb_internal_encoding() is to call mb_internal_encoding() itself.

Could the manual be updated, please?
 [2016-09-12 07:38 UTC] yohgaki@php.net
I noticed that mb_internal_encoding() should be modified to reflect the encoding actually used... Thank you.

Anyway, the current implementation does not change lower-precedence INI values. The encoding used is determined like this (shown for iconv, because the iconv code is simpler):

static char *get_internal_encoding(void) {
	if (ICONVG(internal_encoding) && ICONVG(internal_encoding)[0]) {
		return ICONVG(internal_encoding);
	} else if (PG(internal_encoding) && PG(internal_encoding)[0]) {
		return PG(internal_encoding);
	} else if (SG(default_charset)) {
		return SG(default_charset);
	}
	return "";
}

As you can see, INI setting precedence is kept as defined in the RFC.

Mbstring's encoding handling is not as simple as iconv's. If I'm missing something, please let me know.
 [2016-10-17 10:08 UTC] bwoebi@php.net
Automatic comment on behalf of yohgaki
Revision: http://git.php.net/?p=php-src.git;a=commit;h=8bbd0952e5bba88426bac1596dcc3bfa504dbe4e
Log: Fix Bug #72992 mbstring.internal_encoding doesn't inherit default_charset
 