Bug #74777 automatically convert all strings given in php script file
Submitted: 2017-06-18 16:26 UTC Modified: 2017-06-18 23:10 UTC
From: qdinar at gmail dot com Assigned:
Status: Not a bug Package: *General Issues
PHP Version: 5.6.30 OS: windows 10
Private report: No CVE-ID: None
 [2017-06-18 16:26 UTC] qdinar at gmail dot com
Description:
------------
I use a one-byte encoding (windows-1251) for the script's calculations and want the script to supply one-byte strings, but I want to write the PHP code itself in UTF-8 because I use non-windows-1251 characters in comments.

I tried "declare(encoding='utf-8');" for that, but it does not seem to work that way; see the example below.

I tried that because the manual at http://php.net/manual/en/ini.core.php#ini.zend.script-encoding says:

" Literal strings will be transliterated from zend.script_enconding to mbstring.internal_encoding, as if mb_convert_encoding() would have been called. "

I thought that " declare(encoding='utf-8'); " might set the same zend.script_encoding, but just for a single file (the two are somehow related, as far as I can tell from the docs). I do not want to set it for all files.

Test script:
---------------
php script file in utf-8 encoding:
<?php
declare(encoding='utf-8');
mb_internal_encoding('windows-1251');
file_put_contents( 'declare_encoding.txt' , "сэлэм");
?>

Expected result:
----------------
"сэлэм" written with windows-1251

Actual result:
--------------
"сэлэм" written with utf-8
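
To make the difference between the expected and the actual result concrete, here is a minimal sketch that performs the conversion by hand with mb_convert_encoding(), which the manual names as the equivalent operation. It only illustrates the byte sequences involved; it does not use declare(encoding) or zend.multibyte, and the variable names are my own. The string is written as hex escapes so that the result does not depend on how the file itself is saved.

```php
<?php
// "сэлэм" spelled out as UTF-8 bytes (two bytes per Cyrillic letter).
$utf8 = "\xD1\x81\xD1\x8D\xD0\xBB\xD1\x8D\xD0\xBC";

// Explicit conversion: target encoding first, source encoding second.
$cp1251 = mb_convert_encoding($utf8, 'Windows-1251', 'UTF-8');

echo bin2hex($utf8), "\n";   // d181d18dd0bbd18dd0bc (10 bytes, the "actual result")
echo bin2hex($cp1251), "\n"; // f1fdebfdec (5 bytes, the "expected result")
```

Comparing the size or a hex dump of declare_encoding.txt against these two values shows which encoding was actually written.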

 [2017-06-18 16:39 UTC] requinix@php.net
-Status: Open
+Status: Not a bug
-Type: Feature/Change Request
+Type: Bug
 [2017-06-18 16:39 UTC] requinix@php.net
The string is converted at compile time. That means mbstring.internal_encoding has to be set to the desired encoding before the file is even loaded.

<?php // first.php
mb_internal_encoding('windows-1251'); // or set in php.ini
include 'second.php';
?>

<?php // second.php
declare(encoding='utf-8');
file_put_contents('declare_encoding.txt', 'сэлэм');
?>
 [2017-06-18 20:46 UTC] qdinar at gmail dot com
related bug: https://bugs.php.net/bug.php?id=68284 "declare encoding docs unclear / encoding handling needs better docs"
 [2017-06-18 20:57 UTC] qdinar at gmail dot com
requinix, I tried that, with one PHP script including the other, and it did not work: the output is still in UTF-8. Have you tried it?
 [2017-06-18 23:03 UTC] requinix@php.net
I did, but what I posted is not technically what I tried and I then made a couple assumptions that weren't true.

I'm working on a new bug report about problems in file conversion involving mbstring.internal_encoding, mb_internal_encoding(), default_charset, and internal_encoding. However, I'm still trying to understand exactly what does and does not work (and why), and it's taking longer than I thought.

I took this bug report to be about how to use zend.multibyte correctly. After looking through the documentation, I see that it refers to the mbstring.internal_encoding setting, not mb_internal_encoding(). Combined with what I said before about changing the encoding before the file is parsed, this revised code

<?php // first.php
@ini_set('mbstring.internal_encoding', 'windows-1251');
include 'second.php';
?>

<?php // second.php saved in UTF-8
declare(encoding = 'utf-8');
file_put_contents('declare_encoding.txt', 'сэлэм');
?>

should - and does - work correctly in PHP 7. (It should work in PHP 5.6 too but that series is no longer in active support.)
 [2017-06-18 23:10 UTC] nikic@php.net
IIRC there's also an issue where you cannot use internal_encoding, but have to use the deprecated mbstring.internal_encoding :/
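
Since several overlapping settings are mentioned in this thread, a small diagnostic can help to see what a given PHP installation is actually using before first.php includes second.php. This is only a sketch: the function name encoding_settings() is made up for illustration, and the list covers just the directives named in the comments above. ini_get() returns false for a directive that the running PHP build does not know.

```php
<?php
// Hypothetical helper: collect the encoding-related settings discussed here.
function encoding_settings()
{
    $names = array(
        'zend.multibyte',
        'zend.script_encoding',
        'internal_encoding',
        'mbstring.internal_encoding',
        'default_charset',
    );
    $settings = array();
    foreach ($names as $name) {
        $settings[$name] = ini_get($name); // false if the directive is unknown
    }
    return $settings;
}

foreach (encoding_settings() as $name => $value) {
    $shown = ($value === false) ? '(not available)' : var_export($value, true);
    printf("%-28s %s\n", $name, $shown);
}

// The runtime value reported by mbstring, if the extension is loaded.
if (function_exists('mb_internal_encoding')) {
    echo 'mb_internal_encoding(): ', mb_internal_encoding(), "\n";
}
```

Running it once before and once after the ini_set() call shows whether the setting took effect.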
 [2017-06-19 11:51 UTC] qdinar at gmail dot com
>@ini_set('mbstring.internal_encoding', 'windows-1251');

It works in PHP 5.6.30.
Thank you.
 