Bug #81598 Cannot input unicode characters in PHP 8 interactive shell
Submitted: 2021-11-08 12:15 UTC Modified: 2021-11-09 10:39 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:0 (0.0%)
From: 7snovic at gmail dot com Assigned:
Status: Closed Package: CGI/CLI related
PHP Version: 8.0Git-2021-11-08 (snap) OS: Ubuntu 20
Private report: No CVE-ID: None
 [2021-11-08 12:15 UTC] 7snovic at gmail dot com
Description:
------------
When using PHP 8 in interactive mode, you cannot type any non-ASCII characters, such as Arabic or Chinese text.

$ php8.0 -a
echo "اهلا";

Only echo ""; reaches the terminal; the non-ASCII characters are dropped.

$ php7.4 -a
echo "اهلا";

The full line, including "اهلا", reaches the terminal.

Other references: https://stackoverflow.com/questions/69882515/cannot-input-unicode-characters-in-php-8-interactive-shell

https://stackoverflow.com/questions/69750044/php-8-shell-cant-input-non-ascii-chars


History

 [2021-11-08 14:12 UTC] cmb@php.net
-Status: Open +Status: Feedback -Assigned To: +Assigned To: cmb
 [2021-11-08 14:12 UTC] cmb@php.net
I'm assuming that this is about the interactive *shell* (where you
have a command prompt).  If so, do you use the same readline
implementation (`echo READLINE_LIB;`) with both PHP versions?
 [2021-11-08 14:45 UTC] 7snovic at gmail dot com
-Status: Feedback +Status: Assigned -Package: *Unicode Issues +Package: CGI/CLI related
 [2021-11-08 14:45 UTC] 7snovic at gmail dot com
Updated the package.
 [2021-11-08 14:49 UTC] 7snovic at gmail dot com
Hello cmb,

Here is the READLINE_LIB output for PHP 7.4, PHP 8.0, and PHP 8.1.

hassan@hassan:~$ php7.4 -r 'echo READLINE_LIB . "\n";'
libedit
hassan@hassan:~$ php8.0 -r 'echo READLINE_LIB . "\n";'
libedit
hassan@hassan:~$ php8.1 -r 'echo READLINE_LIB . "\n";'
libedit

Please note that the bug is present in both PHP 8.0 and PHP 8.1.
 [2021-11-08 16:06 UTC] cmb@php.net
-Status: Assigned +Status: Open -Assigned To: cmb +Assigned To:
 [2021-11-08 16:06 UTC] cmb@php.net
Thanks for the swift reply!  I have no idea what might have
changed in PHP 8, and since I usually work on Windows, I'll pass
on this one.
 [2021-11-08 16:06 UTC] nikic@php.net
-Status: Open +Status: Assigned -Assigned To: +Assigned To: cmb
 [2021-11-08 16:06 UTC] nikic@php.net
I can reproduce this. And I can also reproduce on 7.4 when running with "LANG=C php7.4 -a". The relevant change here is probably that PHP 8 no longer inherits the ctype locale from environment.
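
To make the "LANG=C" reproduction concrete: a process that inherits its ctype locale from the environment resolves it with the POSIX precedence LC_ALL, then LC_CTYPE, then LANG. The following is a minimal sketch of that lookup, not code from PHP or libedit; the helper names requested_ctype and requests_utf8 are hypothetical, and matching on the locale name's codeset suffix is a simplifying assumption.

```shell
# Sketch only: requested_ctype and requests_utf8 are hypothetical helpers.

# Print the locale name the environment requests for LC_CTYPE, using the
# POSIX precedence LC_ALL > LC_CTYPE > LANG. Empty values count as unset;
# with nothing set, the C locale applies.
requested_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf 'C\n'
    fi
}

# Succeed if the requested locale name carries a UTF-8 codeset suffix
# (assumption: the name alone is a good enough hint for this sketch).
requests_utf8() {
    case "$(requested_ctype)" in
        *.[Uu][Tt][Ff]-8*|*.[Uu][Tt][Ff]8*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report what the current environment asks for.
if requests_utf8; then
    echo "ctype locale '$(requested_ctype)' advertises UTF-8"
else
    echo "ctype locale '$(requested_ctype)' does not advertise UTF-8"
fi
```

With LANG=C and no LC_ALL or LC_CTYPE, the lookup yields the C locale, which matches the 7.4 reproduction described above.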
 [2021-11-08 16:07 UTC] nikic@php.net
-Status: Assigned +Status: Verified -Assigned To: cmb +Assigned To:
 [2021-11-08 16:07 UTC] nikic@php.net
(Undoing concurrent status changes)
 [2021-11-08 16:14 UTC] cmb@php.net
> The relevant change here is probably that PHP 8 no longer
> inherits the ctype locale from environment.

Ah, so this would be rather a documentation issue, telling users
to call setlocale() at the start of their session?
 [2021-11-08 16:28 UTC] nikic@php.net
From a quick test, calling setlocale() at the start of the session doesn't seem to work. It would be pretty bad UX as well.

Though that gives me a tiny bit of hope that maybe it's sufficient to have the locale set during some kind of startup procedure, and we can use the C locale afterwards.
 [2021-11-09 10:39 UTC] nikic@php.net
I looked into the editline implementation a bit (apparently available at https://thrysoee.dk/editline/ as a tar.gz file only -- no version control) and it seems to be based entirely around wchar functionality, which is indeed locale-based.

Something I tried and that seems to work is to use the "C.UTF-8" locale. I believe that should both make mbrtowc() work fine, and not introduce any locale-sensitivity in the single-byte functions used by PHP at the same time.
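
Whether a system offers a "C.UTF-8" locale at all can be checked from the output of `locale -a`. The sketch below is an illustration only and is not part of the proposed patch; has_c_utf8 is a hypothetical helper, and it assumes the locale is listed either as "C.UTF-8" or as "C.utf8", since the spelling differs between systems.

```shell
# Sketch only: has_c_utf8 is a hypothetical helper.

# Read locale names from stdin, one per line (the format `locale -a`
# prints), and succeed if a C.UTF-8 variant is among them.
has_c_utf8() {
    grep -qiE '^C\.utf-?8$'
}

# Check the locales installed on this machine.
if locale -a 2>/dev/null | has_c_utf8; then
    echo "C.UTF-8 is listed by locale -a"
else
    echo "C.UTF-8 is not listed by locale -a"
fi
```

Reading the list from stdin keeps the check independent of the machine it runs on, so it can also be fed a saved `locale -a` listing from another system.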
 [2021-11-09 11:07 UTC] nikic@php.net
The following pull request has been associated:

Patch Name: Fix bug #81598: Use C.UTF-8 as LC_CTYPE locale by default
On GitHub:  https://github.com/php/php-src/pull/7635
Patch:      https://github.com/php/php-src/pull/7635.patch
 [2021-12-05 20:04 UTC] git@php.net
Automatic comment on behalf of nikic
Revision: https://github.com/php/php-src/commit/26e424465c4d6848d19bcf041c6110ff155df840
Log: Fix bug #81598: Use C.UTF-8 as LC_CTYPE locale by default
 [2021-12-05 20:04 UTC] git@php.net
-Status: Verified +Status: Closed
 [2022-02-18 15:34 UTC] diessemail at gmail dot com
The bug has been fixed in PHP 8.1.2 (20 Jan 2022), but it is still present in the latest PHP 8.0 release (8.0.16, 17 Feb 2022).
 