Bug #75574 putenv does not work properly if parameter contains non-ASCII unicode character
Submitted: 2017-11-26 19:14 UTC
Modified: 2017-11-29 08:24 UTC
From: ganlvtech at qq dot com
Assigned: ab
Status: Closed
Package: Scripting Engine problem
PHP Version: 7.1.12 nts Win32 VC14 x64
OS: Windows 10 (1709) zh-CN
Private report: No
CVE-ID: None
 [2017-11-26 19:14 UTC] ganlvtech at qq dot com
Description:
------------
If the value ends with an odd number of Chinese characters, putenv returns false and getenv cannot read the environment variable that was just set.
If the value ends with an even number of Chinese characters, putenv works and returns true.
If the value ends with an ASCII character, putenv still works, even when the string contains Chinese characters.
If the Chinese characters are split by ASCII characters, only the length of the last run of Chinese characters affects the result.

Japanese characters also cause this problem.

On Ubuntu 16.04.3 with PHP 7.0.22 everything works, so the bug may be Windows-only.

I checked PHP_FUNCTION(putenv) in php-src/ext/standard/basic_functions.c, but I'm sorry to say I could not find the cause.

This problem might be similar to the fixed bug #50690 (putenv() does not assign values to env. vars when the value is one character), but this one concerns non-ASCII characters.

Thank you.

Test script:
---------------
<?php
// Chinese character '啊' === urldecode('%E5%95%8A')
var_dump(putenv('FOO=啊')); // false
// var_dump(putenv('FOO=' . urldecode('%E5%95%8A'))); // false
var_dump(putenv('FOO=啊啊')); // true
// var_dump(putenv('FOO=' . urldecode('%E5%95%8A%E5%95%8A'))); // true
var_dump(putenv('FOO=啊啊啊')); // false
var_dump(putenv('FOO=啊啊啊啊')); // true
// odd got false, even got true.
var_dump(putenv('FOO=啊a')); // true
var_dump(putenv('FOO=啊a啊')); // false
var_dump(putenv('FOO=啊a啊a')); // true
var_dump(putenv('FOO=啊a啊a啊')); // false
var_dump(putenv('FOO=啊a啊啊')); // true
var_dump(putenv('FOO=啊a啊啊啊')); // false
var_dump(putenv('FOO=啊a啊啊啊啊')); // true
// If the value ends with an odd number of Chinese characters, putenv() returns false.
?>

Expected result:
----------------
bool(true)
bool(true)
bool(true)
bool(true)
bool(true)
bool(true)
bool(true)
bool(true)
bool(true)
bool(true)
bool(true)


Actual result:
--------------
bool(false)
bool(true)
bool(false)
bool(true)
bool(true)
bool(false)
bool(true)
bool(false)
bool(true)
bool(false)
bool(true)


 [2017-11-26 21:04 UTC] ab@php.net
-Status: Open +Status: Feedback
 [2017-11-26 21:04 UTC] ab@php.net
Thanks for the report. Could you also tell me whether your build is TS or NTS, and what your INI settings are for default_charset and internal_encoding?

Thanks.
 [2017-11-26 21:07 UTC] ab@php.net
Ah, and the encoding of the file you've posted, too.

Thanks.
 [2017-11-27 03:56 UTC] ganlvtech at qq dot com
-Summary: putenv does not work properly if the parameter contains Chinese character
+Summary: putenv does not work properly if parameter contains non-ASCII unicode character
-Status: Feedback
+Status: Open
-PHP Version: 7.1.12
+PHP Version: 7.1.12 nts Win32 VC14 x64
 [2017-11-27 03:56 UTC] ganlvtech at qq dot com
Oh, if I change the encoding of the test script from UTF-8 to GBK (from 3 bytes per character to 2 bytes per character), it works.

I don't know whether something is misconfigured on my Windows (Simplified Chinese edition). It seems that the system only accepts non-ASCII characters as 2-byte sequences (in other words, the system uses GBK encoding). The system accepts and returns values in GBK, but PHP passes and retrieves environment variables in UTF-8.
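
The odd/even pattern is what one would expect if the UTF-8 bytes are being read as 2-byte GBK sequences. The sketch below is a simplified model of that idea, not the actual Windows or PHP code: the function name endsWithDanglingLeadByte is hypothetical, and it assumes every byte in 0x81..0xFE acts as a lead byte that consumes the following byte, whatever that byte is.

```php
<?php
// Simplified model (an assumption, not the real conversion routine):
// a byte in 0x81..0xFE is treated as a double-byte lead byte that pairs
// with the next byte. The function reports whether the string ends on a
// lead byte that has nothing left to pair with.
function endsWithDanglingLeadByte(string $bytes): bool
{
    $length = strlen($bytes);
    for ($i = 0; $i < $length; $i++) {
        $byte = ord($bytes[$i]);
        if ($byte >= 0x81 && $byte <= 0xFE) {
            if ($i + 1 >= $length) {
                return true; // lead byte at the very end, no trail byte
            }
            $i++; // skip the byte consumed as the trail byte
        }
    }
    return false;
}

$a = "\xE5\x95\x8A"; // UTF-8 bytes of '啊'
$alpha = "\xCE\xB1"; // UTF-8 bytes of 'α'
foreach ([$a, $a . $a, $a . 'a', $a . 'a' . $a, $alpha] as $value) {
    $state = endsWithDanglingLeadByte($value) ? 'dangling' : 'paired';
    printf("%-32s %s\n", urlencode($value), $state);
}
```

In this model a 3-byte character leaves one unpaired byte, a following ASCII byte or a second 3-byte character completes the pair, and the 2-byte 'α' pairs up on its own, which lines up with the true/false results reported for these values.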

I set an environment variable in the Control Panel and retrieved its value in PHP with getenv. I got a GBK-encoded byte string that was unreadable, because PHP 7 uses UTF-8 as its default output encoding. I tried

echo iconv('GBK', 'UTF-8', getenv('FOO'));

and got the expected value.
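
For reference, a minimal sketch of the same conversion in both directions, so the byte difference between the two encodings is visible. It assumes the iconv extension is loaded and recognizes the 'GBK' charset name; the variable names are only illustrative.

```php
<?php
// Assumes the iconv extension is available and supports 'GBK'.
$utf8 = "\xE5\x95\x8A";               // '啊' encoded as UTF-8 (3 bytes)
$gbk  = iconv('UTF-8', 'GBK', $utf8); // the same character in GBK (2 bytes)

echo urlencode($utf8), PHP_EOL;       // %E5%95%8A
echo urlencode($gbk), PHP_EOL;        // %B0%A1

// Converting back restores the original UTF-8 bytes.
var_dump(iconv('GBK', 'UTF-8', $gbk) === $utf8);
```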

I think this bug may be a limitation of the Windows platform, and may not be fixable unless Windows changes its default encoding to Unicode.


Thanks.


Another test script:
---------------
// Greek_alphabet 'α' === urldecode('%CE%B1')
var_dump(putenv('FOO=α')); // true


php.ini:
---------------
; PHP's default character set is set to UTF-8.
; http://php.net/default-charset
default_charset = "UTF-8"

; PHP internal character encoding is set to empty.
; If empty, default_charset is used.
; http://php.net/internal-encoding
;internal_encoding =


PHP Version: php-7.1.12-nts-Win32-VC14-x64
 [2017-11-27 04:15 UTC] ganlvtech at qq dot com
https://stackoverflow.com/questions/19008182/c11-and-win32-wchar-t

If PHP currently uses SetEnvironmentVariableA, it could convert UTF-8 to UTF-16 and use SetEnvironmentVariableW instead.

I don't know whether this would solve the problem.
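
To illustrate only the conversion step of that idea: the sketch below turns a UTF-8 string into UTF-16LE, the wide-character form that the W functions take. The Win32 call itself would be made in C inside PHP and is not reproduced here; utf8ToUtf16le is a hypothetical helper name, and the iconv extension is assumed to be available.

```php
<?php
// Hypothetical helper: converts UTF-8 to UTF-16LE (no BOM).
// Only the encoding step is shown, not the SetEnvironmentVariableW call.
function utf8ToUtf16le(string $utf8): string
{
    $wide = iconv('UTF-8', 'UTF-16LE', $utf8);
    if ($wide === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $wide;
}

// '啊' is U+554A, so its UTF-16LE form is the two bytes 4A 55.
echo bin2hex(utf8ToUtf16le("\xE5\x95\x8A")), PHP_EOL;
echo bin2hex(utf8ToUtf16le("FOO=\xE5\x95\x8A")), PHP_EOL;
```

Unlike the A variants, the wide form does not depend on the system code page, so the same bytes mean the same text on every Windows locale.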
 [2017-11-27 10:18 UTC] ab@php.net
Thanks for the additional info. Yeah, I guess the issue is that the A versions of the API are used. I'm getting a VM with Chinese Simplified to check further.

Thanks.
 [2017-11-27 17:53 UTC] ab@php.net
Automatic comment on behalf of ab
Revision: http://git.php.net/?p=php-src.git;a=commit;h=2b7d283cc5589077d7ecc23cc9b2827759cab5c2
Log: Fixed bug #75574 putenv does not work properly if parameter contains non-ASCII unicode character
 [2017-11-27 17:53 UTC] ab@php.net
-Status: Open +Status: Closed
 [2017-11-28 06:05 UTC] ganlvtech at qq dot com
Tested the newest snapshot (7.1.13-dev nts).

Using test script from php-src/ext/standard/tests/general_functions/putenv_bug75574_utf8.phpt

Ran 'php test.php': test passed.
Ran 'php -S 127.0.0.1:8000' and visited http://127.0.0.1:8000/test.php: test passed.
But when I run 'php-cgi -b 127.0.0.1:9000' and use nginx fastcgi_pass to reach php-cgi, putenv is OK, but getenv returns cp936 2-byte characters instead of UTF-8 3-byte characters.
(In fact, I start php-cgi with 'RunHiddenConsole php-cgi -b 127.0.0.1:9000'.)

I will test some more to see whether something is wrong in my setup.
 [2017-11-28 11:02 UTC] ganlvtech at qq dot com
With php.exe I got the expected result, but with php-cgi.exe putenv did not set the variable as Unicode.

Test script:
---------------
<?php
var_dump(putenv('FOO=啊'));
var_dump(`echo %FOO%`); // There will be a "\n" in the result.
var_dump(getenv('FOO'));
?>

Expected result:
----------------
bool(true)
string(4) "啊
"
string(3) "啊"

Actual result:
--------------
bool(true)
string(3) "С
"
string(2) "С"


The string lengths differ between the expected and actual results: putenv stored the variable as cp936 instead of Unicode.
Another test script makes the difference visible.

Test script:
--------------
<?php
var_dump(putenv('FOO=啊'));
echo urlencode(`echo %FOO%`), PHP_EOL;
echo urlencode(getenv('FOO')), PHP_EOL;
?>

Expected result:
----------------
bool(true)
%E5%95%8A%0A
%E5%95%8A

Actual result:
--------------
bool(true)
%B0%A1%0A
%B0%A1


php snapshot: php-7.1-nts-windows-vc14-x64-r06202f0
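
A quick way to tell the two results apart without reading percent-encoded bytes is to check whether the value is well-formed UTF-8. This is a minimal sketch: isValidUtf8 is a hypothetical helper, and it relies on preg_match() failing when the /u modifier is given a malformed UTF-8 subject.

```php
<?php
// Hypothetical helper: true if $bytes is well-formed UTF-8.
// With the /u modifier, preg_match() returns false on malformed UTF-8.
function isValidUtf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

var_dump(isValidUtf8("\xE5\x95\x8A")); // bytes from the expected result
var_dump(isValidUtf8("\xB0\xA1"));     // bytes from the actual result
```

The check is only a heuristic: some cp936 byte pairs also happen to be valid UTF-8, so a passing result does not prove the value was stored as UTF-8.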
 [2017-11-28 15:50 UTC] ab@php.net
-Status: Closed +Status: Re-Opened
 [2017-11-28 19:50 UTC] ab@php.net
-Status: Re-Opened +Status: Feedback
 [2017-11-28 19:50 UTC] ab@php.net
@ganlvtech I've pushed a fix regarding FCGI; please check as soon as snapshots for 8b57a5bca0c5a9ba064aa27e7614d1a11c0cac6d or later are produced. I'm not sure about the backtick execution yet and should probably dig more into that.

Thanks.
 [2017-11-29 05:44 UTC] ganlvtech at qq dot com
-Status: Feedback +Status: Closed
 [2017-11-29 05:44 UTC] ganlvtech at qq dot com
php-7.1-nts-windows-vc14-x64-r8b57a5b
PHP 7.1.13-dev (cgi-fcgi) (built: Nov 28 2017 20:29:56)

Test passed, and the issue can be closed.

Thanks for your patience with this bug, and sorry for bothering you with such a small one.
 [2017-11-29 08:24 UTC] ab@php.net
-Assigned To: +Assigned To: ab
 [2017-11-29 08:24 UTC] ab@php.net
@ganlvtech, that's a totally fine thing, bugs are there to be fixed. Many thanks for the verification :)
 
PHP Copyright © 2001-2024 The PHP Group
All rights reserved.
Last updated: Mon Oct 07 10:01:28 2024 UTC