Bug #69333 UTF8 output to console window is ... interesting!
Submitted: 2015-03-30 12:54 UTC
Modified: 2016-08-08 09:51 UTC
Votes:2
Avg. Score:4.0 ± 1.0
Reproduced:2 of 2 (100.0%)
Same Version:0 (0.0%)
Same OS:1 (50.0%)
From: RQuadling at GMail dot com
Assigned: ab
Status: Closed
Package: CGI/CLI related
PHP Version: 5.4.39
OS: Windows
Private report: No
CVE-ID: None
 [2015-03-30 12:54 UTC] RQuadling at GMail dot com
Description:
------------
I am trying to display UTF-8 content in a Windows console, having used chcp 65001 to allow UTF-8 content to be displayed.

2 issues are found.

1 - php -r does not read UTF-8 from the command line.

php -r "echo 'página';"

fails with

p�gina


2 - Echoing UTF-8 content adds an extra line.

Running the script ...

<?php echo echo 'página';

outputs página correctly, but with an extra line on the display.

But if you capture the output, there is no extra line, which suggests to me that, when writing to the console, PHP is calling a function that outputs the extra line in some way.
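
The report does not state a cause for issue 1. One explanation consistent with the symptom is that the command line reaches PHP in the system ANSI code page rather than in UTF-8, and those bytes are then echoed unchanged to a console that expects UTF-8. The sketch below only models that byte-level effect; Windows-1252 as the ANSI code page and the helper name as_utf8_console are assumptions made for illustration.

```python
# Illustrative only: models how the bytes of 'página' would look if the
# command line were delivered in an ANSI code page (assumed: Windows-1252)
# and then displayed by a console that decodes its input as UTF-8.

text = "p\u00e1gina"  # 'página'

utf8_bytes = text.encode("utf-8")    # what a UTF-8 console expects
ansi_bytes = text.encode("cp1252")   # assumption: ANSI code page is Windows-1252


def as_utf8_console(raw: bytes) -> str:
    """Hypothetical helper: decode bytes the way a UTF-8 console would,
    substituting U+FFFD for anything that is not valid UTF-8."""
    return raw.decode("utf-8", errors="replace")


print(utf8_bytes.hex(" "), len(utf8_bytes))   # 70 c3 a1 67 69 6e 61 7
print(ansi_bytes.hex(" "), len(ansi_bytes))   # 70 e1 67 69 6e 61 6
print(ascii(as_utf8_console(ansi_bytes)))     # 'p\ufffdgina'
```

The lone byte e1 is not valid UTF-8 on its own, so it is shown as the replacement character, which matches the p�gina output above.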

Test script:
---------------
This is a batch file that demonstrates that UTF-8 output is possible in some circumstances.

https://gist.github.com/RQuadling/068ac672d97e2e3e6cfa

Expected result:
----------------
Lots of ...

====== [description]
página
==OK==



Actual result:
--------------
Some ...


====== [description]
página
==OK==

but also ...

====== Via php -r
p�gina
=????= **** NON UNICODE DISPLAYED ****

====== Via php -r, captured to a log file and displayed
p�gina
=????= **** NON UNICODE DISPLAYED ****

====== From a php script
página

=????= **** EXTRA LINE DISPLAYED ****

 [2015-03-30 12:56 UTC] RQuadling at GMail dot com
I typo-ed an extra 'echo' in my bug report ...

The php script should be ...

<?php echo 'página';
 [2015-03-30 13:21 UTC] RQuadling at GMail dot com
And bug 3:

Added var_dump() to the PHP mix.

In a PHP script, the var_dump() output is WAY wrong!!! It reports 7 characters (yay), but the string is displayed with an extra 'a' at the end!

But if output capture is being performed, then the output is correct. No extra 'a' !

I've updated the gist to properly enforce utf8 mode in the console - https://gist.github.com/RQuadling/068ac672d97e2e3e6cfa

I realised that I was already in codepage 65001. This didn't affect the bugs I am reporting.
 [2015-03-30 13:38 UTC] RQuadling at GMail dot com
More updates: I added a pipe through Cygwin's hexdump to see what comes out of PHP. The captured output seems fine, but when the output is not captured, PHP seems to be behaving differently in some way.
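
The thread never confirms why the extra 'a' appears only when the output goes straight to the console. One plausible explanation is a console write that displays every byte but reports the number of characters written instead of the number of bytes, so the caller's retry loop re-sends the tail. This is an assumption, not something the report states. The sketch below simulates that behavior; console_write and write_all are hypothetical names and do not correspond to PHP or Windows functions.

```python
# Illustrative simulation only: a write call that shows every byte but
# reports a character count, combined with a standard "write until done" loop.


def console_write(buf: bytes, sink: list) -> int:
    """Hypothetical console write: displays all bytes, but returns the
    number of characters rather than the number of bytes."""
    sink.append(buf)
    return len(buf.decode("utf-8", errors="replace"))


def write_all(buf: bytes, sink: list) -> None:
    """Typical retry loop: keep writing until the reported count covers the buffer."""
    offset = 0
    while offset < len(buf):
        offset += console_write(buf[offset:], sink)


def simulate(text: str) -> str:
    """Return what would end up on screen for the given text."""
    sink = []
    write_all(text.encode("utf-8"), sink)
    return b"".join(sink).decode("utf-8")


displayed = simulate("p\u00e1gina")   # 'página': 7 bytes, 6 characters
print(displayed.encode("utf-8"))      # b'p\xc3\xa1ginaa'
```

Under this assumption the 7-byte string is reported as 6 units written, so the final byte is sent a second time and the display ends in "aa". A pipe or file reports byte counts, which would fit the captured output being correct.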
 [2016-08-08 09:51 UTC] ab@php.net
-Status: Open
+Status: Closed
-Assigned To:
+Assigned To: ab
 [2016-08-08 09:51 UTC] ab@php.net
Fixed in PHP 7.1

Thanks.
 