Bug #63316 Files declared as UTF-8 will append NUL bytes to output for non ASCII character
Submitted: 2012-10-19 21:56 UTC Modified: 2016-07-25 16:56 UTC
Votes:14
Avg. Score:4.4 ± 0.9
Reproduced:7 of 10 (70.0%)
Same Version:5 (71.4%)
Same OS:3 (42.9%)
From: post at wickenrode dot com Assigned: cmb (profile)
Status: Not a bug Package: Unicode Engine related
PHP Version: 5.4.8 OS: Xubuntu 12.10 amd64
Private report: No CVE-ID: None
 [2012-10-19 21:56 UTC] post at wickenrode dot com
Description:
------------
Take the test script below and save it as UTF-8 and verify:

> cat test.php | tail -n 4 | head -n 1 | hexdump -C
00000000  09 23 20 c3 a4 c3 b6 0a                           |.# .....|
00000008

Now run

> ./bin/php test.php | hexdump -C
PHP Version = 5.4.8
zend.multibyte = 1
00000000  58 00 00 00                                       |X...|
00000004

As you can see, PHP appended three NUL bytes to the output.
Now remove one of the non-ASCII characters in the commented-out line:

> ./bin/php test.php | hexdump -C
PHP Version = 5.4.8
zend.multibyte = 1
00000000  58 00 00                                          |X..|
00000003

As you can see, the number of NUL bytes tracks the number of non-ASCII characters in the file.

Now comment out the declare statement:

> ./bin/php test.php | hexdump -C
PHP Version = 5.4.8
zend.multibyte = 1
00000000  58                                                |X|
00000001

This time the output is correct.

Tested with off-the-shelf PHP; it also happens with the Ubuntu-packaged 5.4.6-1.
It also happens when PHP is running as an Apache module, where this issue caused corrupted images for me.

Test script:
---------------
<?php
	declare(encoding='UTF-8');

	fwrite(STDERR,"PHP Version = ".phpversion()."\n");
	fwrite(STDERR,"zend.multibyte = ".ini_get('zend.multibyte')."\n");

	# äöü
	echo "X";

?>


Expected result:
----------------
See above.
UTF-8 declared files with UTF-8 content should work fine.

Actual result:
--------------
See above.
UTF-8 declared files with UTF-8 content produce NUL bytes in the output.

History

 [2013-02-17 07:16 UTC] wynn dot chen dot cn at gmail dot com
I think it's not a bug, just something like #62351.
Set
  mbstring.internal_encoding = utf-8
in php.ini, and then everything works fine.
 [2016-07-25 16:56 UTC] cmb@php.net
-Status: Open +Status: Not a bug
-Assigned To: +Assigned To: cmb
 [2016-07-25 16:56 UTC] cmb@php.net
> set 
>  mbstring.internal_encoding = utf-8
> in php.ini, and then everything works fine.

That. See also
<http://php.net/manual/en/ini.core.php#ini.zend.script-encoding>.
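
For reference, a minimal sketch of the php.ini settings the comments above point to. It assumes PHP 5.4 with the mbstring extension loaded and has not been tested against the script in this report. The zend.script_encoding line is not part of the workaround quoted above; it comes from the linked manual entry and only matters for scripts that lack a declare(encoding=...) statement, so it is left commented out.

```ini
; Already enabled in the report (the script prints "zend.multibyte = 1").
zend.multibyte = On

; Workaround suggested in this thread: make the internal encoding match
; the encoding declared in the script.
; Note: this directive is deprecated as of PHP 5.6 in favour of default_charset.
mbstring.internal_encoding = UTF-8

; Assumption, taken from the linked manual entry rather than this thread:
; default source encoding for scripts without declare(encoding=...).
;zend.script_encoding = UTF-8
```

After changing php.ini, rerunning `./bin/php test.php | hexdump -C` from the description is the way to check whether the trailing NUL bytes are gone.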
 