Doc Bug #62351 UTF-8 chars fail to be printed out properly with zend.multibyte
Submitted: 2012-06-18 15:18 UTC Modified: 2015-08-27 22:08 UTC
Votes:3
Avg. Score:3.7 ± 1.9
Reproduced:3 of 3 (100.0%)
Same Version:2 (66.7%)
Same OS:1 (33.3%)
From: php at sebastianmendel dot de
Assigned: cmb
Status: Closed
Package: *Configuration Issues
PHP Version: 5.4.4
OS: GNU/Linux
Private report: No
CVE-ID: None
 [2012-06-18 15:18 UTC] php at sebastianmendel dot de
Description:
------------
With zend.multibyte enabled and declare(encoding = UTF-8) in a UTF-8 encoded script, UTF-8 chars are not printed properly.

The same script (still encoded as UTF-8) but with declare(encoding = ISO-8859-1) prints the UTF-8 chars correctly.

>/opt/phpfarm/inst/bin/php-5.4.4 -i | grep multi
zend.multibyte => On => On

>/opt/phpfarm/inst/bin/php-5.4.4 -i | grep UTF
default_charset => UTF-8 => UTF-8
zend.script_encoding => UTF-8 => UTF-8
exif.encode_unicode => UTF-8 => UTF-8
iconv.input_encoding => UTF-8 => UTF-8
iconv.internal_encoding => UTF-8 => UTF-8
iconv.output_encoding => UTF-8 => UTF-8
LANG => de_DE.UTF-8
_SERVER["LANG"] => de_DE.UTF-8

Test script:
---------------
<?php
declare(encoding = 'UTF-8');
echo htmlspecialchars('"aäaß', ENT_QUOTES | ENT_IGNORE, 'UTF-8');
echo "\n" . '"aäaß';
?>

Expected result:
----------------
&quot;aäaß
"aäaß

Actual result:
--------------
&quot;aa
"a▒a▒

History

 [2012-12-05 09:46 UTC] bukin242 at yandex dot ru
Please fix in the new versions of php
 [2013-02-11 16:23 UTC] alexander dot stehlik at gmail dot com
I'm not sure if this issue is related but here is the problem I have with 
declare(ENCODING = 'utf-8'):

Tested PHP Versions:
PHP 5.4.6-1ubuntu1.1 (cli)
PHP 5.4.6-1ubuntu1.1 (cgi-fcgi)

Put this in a file (test.php):

<?php
declare(ENCODING = 'utf-8') ;

//ä ü ö

?>

Run php test.php > test.txt

In test.txt you can find some unreadable characters.

This always occurs if there are special chars in any COMMENTS in the PHP file.
 [2013-02-17 06:58 UTC] wynn dot chen dot cn at gmail dot com
It's NOT a bug.

After turning on zend.multibyte, you just need to enable the mbstring extension and set this in php.ini:

mbstring.internal_encoding = utf-8

Now everything works fine.

Tested with PHP 5.4.10 on Windows Server 2008.

I guess that PHP first converts the script from the declared encoding to the mbstring internal encoding and then tries to execute it. The default internal encoding is something like ASCII, so this problem appears.
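
To check whether a given setup matches the configuration described above, a small diagnostic along these lines may help. It is only a sketch: the function name describeEncodingSetup() is made up for illustration, and it reports the current values without judging them.

```php
<?php
// Hypothetical helper (not part of PHP): collects the settings that
// this report is about so they can be inspected in one place.
function describeEncodingSetup(): array
{
    $hasMbstring = extension_loaded('mbstring');

    return [
        // ini_get() returns a string, or false if the option is unknown.
        'zend.multibyte'       => ini_get('zend.multibyte'),
        'zend.script_encoding' => ini_get('zend.script_encoding'),
        'mbstring loaded'      => $hasMbstring,
        // mb_internal_encoding() without arguments returns the current value.
        'internal encoding'    => $hasMbstring ? mb_internal_encoding() : null,
    ];
}

foreach (describeEncodingSetup() as $name => $value) {
    echo $name, ' => ', var_export($value, true), "\n";
}
```

If 'mbstring loaded' is false, or the internal encoding differs from the encoding the scripts are declared with, the situation is the one described in this comment. Note that mbstring.internal_encoding has been deprecated since PHP 5.6 in favor of default_charset and internal_encoding.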
 [2015-08-27 22:03 UTC] cmb@php.net
-Type: Bug
+Type: Documentation Problem
-Package: Unicode Engine related
+Package: *Configuration Issues
-Assigned To:
+Assigned To: cmb
 [2015-08-27 22:03 UTC] cmb@php.net
Indeed, that's not a bug. And yes, there is an internal
transliteration from the declared encoding to
mbstring.internal_encoding happening, which has to be documented.
Therefore I'm changing this to a doc bug.
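
The effect of such a conversion can be imitated by hand with mb_convert_encoding(). The sketch below assumes that the effective internal encoding in the original report was ISO-8859-1; the report does not state this, so treat it as an illustration rather than a trace of what the engine did. It requires the mbstring extension.

```php
<?php
// '"aäaß' spelled out as UTF-8 bytes, so the example does not depend
// on the encoding of this file or on zend.multibyte.
$utf8 = "\"a\xC3\xA4a\xC3\x9F";

// Assumption: internal encoding is ISO-8859-1. Convert the string
// literal the way a declared-to-internal conversion would.
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

// Each of the two non-ASCII characters is now a single byte (E4, DF).
echo bin2hex($latin1), "\n"; // 2261e461df

// Those single bytes are not valid UTF-8, so ENT_IGNORE drops them
// when htmlspecialchars() is told the input is UTF-8.
$escaped = htmlspecialchars($latin1, ENT_QUOTES | ENT_IGNORE, 'UTF-8');
echo $escaped, "\n";
```

Under that assumption, the escaped string loses both non-ASCII characters, and the raw bytes E4 and DF are what a UTF-8 terminal renders as replacement glyphs, which is consistent with the "Actual result" in the report.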
 [2015-08-27 22:08 UTC] cmb@php.net
Automatic comment from SVN on behalf of cmb
Revision: http://svn.php.net/viewvc/?view=revision&revision=337654
Log: documented zend.script_encoding transliteration (fixes #62351)
 [2015-08-27 22:08 UTC] cmb@php.net
-Status: Assigned +Status: Closed
 [2015-08-27 22:08 UTC] cmb@php.net
This bug has been fixed in the documentation's XML sources. Since the
online and downloadable versions of the documentation need some time
to get updated, we would like to ask you to be a bit patient.

Thank you for the report, and for helping us make our documentation better.