Bug #81598 Cannot input unicode characters in PHP 8 interactive shell
Submitted: 2021-11-08 12:15 UTC Modified: 2021-11-09 10:39 UTC
Votes:1
Avg. Score:5.0 ± 0.0
Reproduced:1 of 1 (100.0%)
Same Version:1 (100.0%)
Same OS:0 (0.0%)
From: 7snovic at gmail dot com
Assigned:
Status: Closed
Package: CGI/CLI related
PHP Version: 8.0Git-2021-11-08 (snap)
OS: Ubuntu 20
Private report: No
CVE-ID: None
 [2021-11-08 12:15 UTC] 7snovic at gmail dot com
Description:
------------
When using PHP 8 in interactive mode, you cannot type any non-ASCII characters, such as Arabic or Chinese text.

$ php8.0 -a
echo "اهلا";

Only echo ""; appears in the terminal; the non-ASCII characters are dropped.

$ php7.4 -a
echo "اهلا";

The full text, including "اهلا";, appears in the terminal.

Other references: https://stackoverflow.com/questions/69882515/cannot-input-unicode-characters-in-php-8-interactive-shell

https://stackoverflow.com/questions/69750044/php-8-shell-cant-input-non-ascii-chars


 [2021-11-08 14:12 UTC] cmb@php.net
-Status: Open +Status: Feedback -Assigned To: +Assigned To: cmb
 [2021-11-08 14:12 UTC] cmb@php.net
I'm assuming that this is about the interactive *shell* (where you
have a command prompt).  If so, do you use the same readline
implementation (`echo READLINE_LIB;`) with both PHP versions?
 [2021-11-08 14:45 UTC] 7snovic at gmail dot com
-Status: Feedback +Status: Assigned -Package: *Unicode Issues +Package: CGI/CLI related
 [2021-11-08 14:45 UTC] 7snovic at gmail dot com
Updated the package.
 [2021-11-08 14:49 UTC] 7snovic at gmail dot com
Hello cmb,

Here is the READLINE_LIB output for PHP 7.4, PHP 8.0, and PHP 8.1.

hassan@hassan:~$ php7.4 -r 'echo READLINE_LIB . "\n";'
libedit
hassan@hassan:~$ php8.0 -r 'echo READLINE_LIB . "\n";'
libedit
hassan@hassan:~$ php8.1 -r 'echo READLINE_LIB . "\n";'
libedit

Please note that the bug is present in both PHP 8.0 and PHP 8.1.
 [2021-11-08 16:06 UTC] cmb@php.net
-Status: Assigned +Status: Open -Assigned To: cmb +Assigned To:
 [2021-11-08 16:06 UTC] cmb@php.net
Thanks for the swift reply!  I have no idea what might have
changed in PHP 8, and since I usually work on Windows, I'll pass
on this one.
 [2021-11-08 16:06 UTC] nikic@php.net
-Status: Open +Status: Assigned -Assigned To: +Assigned To: cmb
 [2021-11-08 16:06 UTC] nikic@php.net
I can reproduce this. I can also reproduce it on 7.4 when running with "LANG=C php7.4 -a". The relevant change here is probably that PHP 8 no longer inherits the ctype locale from the environment.
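
To check which situation a given build is in, one can compare the LC_CTYPE locale the process is actually running with against what the environment requests. The following is a minimal sketch and not part of the original report. The variable names are illustrative, and it relies only on setlocale() with "0" as the locale, which reports the current setting without changing it.

```php
<?php
// Ask for the current LC_CTYPE locale without modifying it.
// Passing "0" makes setlocale() return the active setting only.
$active = setlocale(LC_CTYPE, "0");

// What the environment requests, in the usual POSIX precedence order:
// LC_ALL overrides LC_CTYPE, which overrides LANG.
$requested = getenv('LC_ALL') ?: (getenv('LC_CTYPE') ?: (getenv('LANG') ?: '(unset)'));

echo "Active LC_CTYPE:          ", $active, PHP_EOL;
echo "Requested by environment: ", $requested, PHP_EOL;
```

If the first line reports C while the environment requests a UTF-8 locale, the process did not pick up the ctype locale from the environment, which would match the explanation above.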
 [2021-11-08 16:07 UTC] nikic@php.net
-Status: Assigned +Status: Verified -Assigned To: cmb +Assigned To:
 [2021-11-08 16:07 UTC] nikic@php.net
(Undoing concurrent status changes)
 [2021-11-08 16:14 UTC] cmb@php.net
> The relevant change here is probably that PHP 8 no longer
> inherits the ctype locale from environment.

Ah, so this would be rather a documentation issue, telling users
to call setlocale() at the start of their session?
 [2021-11-08 16:28 UTC] nikic@php.net
From a quick test, calling setlocale() at the start of the session doesn't seem to work. It would be pretty bad UX as well.

Though that gives me a tiny bit of hope that maybe it's sufficient to have the locale set during some kind of startup procedure, and we can use the C locale afterwards.
 [2021-11-09 10:39 UTC] nikic@php.net
I looked into the editline implementation a bit (apparently available at https://thrysoee.dk/editline/ as a tar.gz file only -- no version control) and it seems to be based entirely around wchar functionality, which is indeed locale-based.

Something I tried and that seems to work is to use the "C.UTF-8" locale. I believe that should both make mbrtowc() work fine, and not introduce any locale-sensitivity in the single-byte functions used by PHP at the same time.
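
The second half of that expectation can be probed from userland. The sketch below is an illustration under assumptions, not the patch itself: probe_ctype() is a hypothetical helper, it requires the ctype extension, and it assumes ctype_alpha() follows the C library's classification for the active LC_CTYPE locale. It classifies one ASCII letter and one lone high byte under a given locale, then restores the previous locale.

```php
<?php
/**
 * Hypothetical helper: switch LC_CTYPE to $locale, classify two bytes with
 * ctype_alpha(), then restore the previous locale.
 * Returns null when the system does not provide $locale.
 */
function probe_ctype(string $locale): ?array
{
    $previous = setlocale(LC_CTYPE, "0");      // remember the current setting
    if (setlocale(LC_CTYPE, $locale) === false) {
        return null;                           // locale is not available here
    }
    $result = [
        'ascii_alpha'     => ctype_alpha("e"),    // plain ASCII letter
        'high_byte_alpha' => ctype_alpha("\xE9"), // lone byte 0xE9 ("é" in Latin-1)
    ];
    setlocale(LC_CTYPE, $previous);            // put the old locale back
    return $result;
}

foreach (['C', 'C.UTF-8'] as $locale) {
    echo $locale, ': ';
    var_dump(probe_ctype($locale));
}
```

If both locales classify the lone high byte the same way, that is consistent with C.UTF-8 leaving single-byte classification unchanged. As noted in the earlier comment, calling setlocale() inside the session did not fix input, so this is only a probe and not a workaround.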
 [2021-11-09 11:07 UTC] nikic@php.net
The following pull request has been associated:

Patch Name: Fix bug #81598: Use C.UTF-8 as LC_CTYPE locale by default
On GitHub:  https://github.com/php/php-src/pull/7635
Patch:      https://github.com/php/php-src/pull/7635.patch
 [2021-12-05 20:04 UTC] git@php.net
Automatic comment on behalf of nikic
Revision: https://github.com/php/php-src/commit/26e424465c4d6848d19bcf041c6110ff155df840
Log: Fix bug #81598: Use C.UTF-8 as LC_CTYPE locale by default
 [2021-12-05 20:04 UTC] git@php.net
-Status: Verified +Status: Closed
 [2022-02-18 15:34 UTC] diessemail at gmail dot com
The bug has been fixed in PHP 8.1.2 (20 Jan 2022), but it is still present in the latest PHP 8.0 release (8.0.16, 17 Feb 2022).
 