Request #72768 Add ENABLE_VIRTUAL_TERMINAL_PROCESSING flag for php.exe
Submitted: 2016-08-05 16:38 UTC Modified: 2016-10-28 18:53 UTC
Votes:13
Avg. Score:4.3 ± 1.1
Reproduced:11 of 12 (91.7%)
Same Version:8 (72.7%)
Same OS:10 (90.9%)
From: mlocati at gmail dot com Assigned: ab (profile)
Status: Closed Package: Output Control
PHP Version: Irrelevant OS: Windows 10
Private report: No CVE-ID: None
 [2016-08-05 16:38 UTC] mlocati at gmail dot com
Description:
------------
The standard console of newer versions of Windows 10 adds support for ANSI control codes.
However, every console executable has to opt in to that new feature explicitly by setting the ENABLE_VIRTUAL_TERMINAL_PROCESSING flag.

For instance, at http://imgur.com/a/heYHm you can see the output of the application compiled from the code at the bottom.

What about adding this flag to php.exe?



#include <windows.h>
#include <stdio.h>

#ifndef ENABLE_VIRTUAL_TERMINAL_PROCESSING
#define ENABLE_VIRTUAL_TERMINAL_PROCESSING 0x0004
#endif

int main(void) {
  const char* sampleText = "\033[101;93m Yellow text on red background \033[0m\n";
  HANDLE hStdout;
  DWORD handleMode;

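  /* first, print with the console mode untouched */
  fputs(sampleText, stdout);

  hStdout = GetStdHandle(STD_OUTPUT_HANDLE);
  /* GetConsoleMode fails when stdout is not a console (e.g. redirected) */
  if (GetConsoleMode(hStdout, &handleMode)) {
    handleMode |= ENABLE_VIRTUAL_TERMINAL_PROCESSING;
    SetConsoleMode(hStdout, handleMode);
  }
  /* then print again, after requesting VT processing */
  fputs(sampleText, stdout);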

  return 0;
}



Patches

0001-Start-adding-VT100-support-for-Windows-v3 (last revision 2016-08-26 16:36 UTC by mlocati at gmail dot com)
0001-Start-adding-VT100-support-for-Windows-v2 (last revision 2016-08-26 14:10 UTC by mlocati at gmail dot com)
0001-Start-adding-VT100-support-for-Windows.patch (last revision 2016-08-25 13:05 UTC by mlocati at gmail dot com)

History

 [2016-08-05 22:19 UTC] kalle@php.net
-Status: Open +Status: Assigned -Assigned To: +Assigned To: ab
 [2016-08-05 22:19 UTC] kalle@php.net
It makes sense to me to add this flag to the CLI, what do you think Anatol?
 [2016-08-08 06:29 UTC] mlocati at gmail dot com
Please note that the ENABLE_VIRTUAL_TERMINAL_PROCESSING flag does not affect the output of php.exe, but just the way the console interprets it.
For instance, if we assume that the sample C program in this report (#72768) is compiled as "test.exe", the content of the file "test.txt" created with "test.exe >test.txt 2>&1" is
<ESC>[101;93m Yellow text on red background <ESC>[0m
<ESC>[101;93m Yellow text on red background <ESC>[0m

(where <ESC> is the character with ascii code \033 (octal))
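
A minimal sketch of how the raw ESC bytes in such a redirected file could be made visible, in the same <ESC> notation used above. The function name show_escapes is hypothetical and not part of the report or of PHP; it only illustrates that the bytes are still present in the output when the console is not involved.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst, replacing every ESC byte
   (octal 033) with the literal text "<ESC>". Returns the number of
   characters written (without the terminating NUL), or -1 if dst is
   too small. */
int show_escapes(const char *src, char *dst, size_t dst_size) {
  static const char marker[] = "<ESC>";
  const size_t marker_len = sizeof(marker) - 1;
  size_t out = 0;

  for (; *src != '\0'; ++src) {
    if (*src == '\033') {
      /* keep one byte in reserve for the terminating NUL */
      if (out + marker_len >= dst_size) return -1;
      memcpy(dst + out, marker, marker_len);
      out += marker_len;
    } else {
      if (out + 1 >= dst_size) return -1;
      dst[out++] = *src;
    }
  }
  if (out >= dst_size) return -1;
  dst[out] = '\0';
  return (int)out;
}
```

Passing one line of test.txt through this helper would give the "<ESC>[101;93m Yellow text on red background <ESC>[0m" form quoted above.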
 [2016-08-10 11:34 UTC] jhdxr@php.net
It's good news that Windows finally supports this feature. If we add this flag to the CLI, I think we'd better add a constant indicating whether this feature is enabled, so userland can make its own choice.
 [2016-08-10 13:35 UTC] mlocati at gmail dot com
I just recompiled php.exe using the instructions described at http://git.php.net/?p=php-src.git;a=blob;f=sapi/cli/php_cli.c;h=dc92045ae7402ad16566a9024453820a26c69da8;hb=HEAD#l1174 (using the version tagged php-7.0.9)

With the current code, the output of
Release_TS\php.exe -r "echo ""\033[101;93m TEST \033[0m\n"";"
is
?[101;93m TEST ?[0m
(where ? is \033)

Adding the SetConsoleMode-related stuff described in this #72768, I get
 TEST
(with red background and yellow foreground)

So, I can confirm that the approach described here is valid.

Just one note: I called SetConsoleMode only for STD_OUTPUT_HANDLE, but STDERR seems to follow STDOUT:
Release_TS\php.exe -r "fwrite(STDERR, ""\033[101;93m TEST \033[0m\n"");"
gives the same colorized output as writing to STDOUT.
 [2016-08-10 13:38 UTC] mlocati at gmail dot com
Whoops, the link I mentioned is where I added my test code, the instructions I followed to compile php.exe are at https://wiki.php.net/internals/windows/stepbystepbuild
 [2016-08-10 13:50 UTC] cmb@php.net
> (where ? is \033)

The ESC character is likely being suppressed by Windows.
 [2016-08-10 14:06 UTC] mlocati at gmail dot com
In the Windows console it's represented by a left arrow, as shown here: http://imgur.com/a/heYHm (at least with my codepage)

By the way, pasting that string in the browser caused that char to be suppressed, that's why I wrote "?"
 [2016-08-10 14:30 UTC] cmb@php.net
Yes, you're right; that's not a Windows issue, but rather the ESC
seems to be substituted by the bug tracker.

However, I just tried your command line, and I already get the
colored output (Windows 10.0.10586 cmd.exe). :-/ (I haven't
noticed that before, because I usually run ansicon, where that
happens anyway).
 [2016-08-10 14:44 UTC] mlocati at gmail dot com
Yes, before the Anniversary Update, console apps didn't require that flag.

But with Windows 10.0.14393 we need to set ENABLE_VIRTUAL_TERMINAL_PROCESSING.
 [2016-08-10 19:40 UTC] kalle@php.net
-Assigned To: ab +Assigned To: kalle
 [2016-08-10 22:31 UTC] ab@php.net
@mlocati, have you checked the approach with 65001 and other multibyte codepages? Some good tests are something we'll need. I'd suggest integrating with the streams and not enabling it by default.

Btw I'm on 14393.51, and running your cmd line "echo ""\033[101;93m TEST \033[0m\n"";" shows me a yellow word TEST on a red background. So it doesn't look like something has changed in the latest version. It seems it's already enabled by default, or it's not the vt100.

Thanks.
 [2016-08-10 23:11 UTC] kalle@php.net
@ab, well afaik we do not have the ability to define console modes on Windows, I don't see why we could not enable it by default for those who may not have it for whatever reason on Windows 10+.

Unless you plan on exposing more of the Microsoft Console functions to userland along with the codepage improvements?
 [2016-08-11 00:39 UTC] ab@php.net
Forget my last comment about seeing it in 14393.51: cmd was started from ConEmu, which seems to have used its hooks. I might be wrong though, but it's strange so far. Also verified with build 10240: it looks like neither a C program nor PHP has vt100 support on pure cmd.exe. But cmd.exe is nice so far on 14393.51, as can be checked on pure cmd with http://conemu.github.io/en/AnsiEscapeCodes.html#ANSI_and_xterm_color_maps . AnsiColors16.ans seems to be well supported so far.

Kalle, it is a compatibility question. The UTF-8/long path topic doesn't cross with this one. It just forced the cmd.exe codepage to mimic the default_charset. That's why I ask about some tests with multibyte codepages and especially with 65001. There can be weird issues with some codepages, like e.g. #72555.

In the vt100 case, it is something quite new and not absolutely required. BC with older systems needs to be ensured, at least with lower Win10 builds such as 10240, from what I can say after the short research today. Various codepages, as mentioned, and other usage scenarios like file I/O, pipes, etc. need checking as well. Integrating with streams, e.g. as a stream wrapper or maybe via stream_context_set_option(), is a possibly safer way to go. Scripts utilizing VT100 functionality will be new development anyway, so that can be encapsulated well. A direct enablement will need all the check work as well. But I don't think exposing all of SetConsoleMode is really required in the core. This might be an idea for some extension, however.

Thanks.
 [2016-08-11 03:07 UTC] kalle@php.net
-Status: Assigned +Status: Analyzed -Assigned To: kalle +Assigned To:
 [2016-08-11 03:07 UTC] kalle@php.net
Anatol, that is a very valid point, and it is indeed perhaps better we encapsulate this functionality into its own, Windows specific, extension.

Gonna leave this open for now then
 [2016-08-11 09:39 UTC] mlocati at gmail dot com
I asked Microsoft for more info about the state of the ENABLE_VIRTUAL_TERMINAL_PROCESSING flag in cmd.exe (see https://wpdev.uservoice.com/forums/266908-command-prompt-console-bash-on-ubuntu-on-windo/suggestions/15617610--re-enable-enable-virtual-terminal-processing-by ).

They confirmed that the ANSI control codes (described here: https://msdn.microsoft.com/it-it/library/windows/desktop/mt638032(v=vs.85).aspx ) were enabled by mistake on Windows 10.0.10586, but that in later versions applications need to enable them explicitly (with SetConsoleMode).
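
The build-by-build behaviour reported in this thread can be summarised as a small lookup. This is only a sketch of what the comments here state (no support seen on build 10240, enabled by default on 10586, opt-in from 14393 on); the enum and function names are hypothetical, and builds not mentioned in the thread are classified by assumption.

```c
/* Hypothetical classification of Windows 10 builds, based only on the
   observations reported in this thread. */
typedef enum {
  VT_UNAVAILABLE,   /* e.g. build 10240: control codes are printed as is */
  VT_ON_BY_DEFAULT, /* build 10586: enabled without any flag */
  VT_NEEDS_OPT_IN   /* build 14393 and later: SetConsoleMode required */
} vt_status;

vt_status vt_status_for_build(unsigned long build) {
  if (build < 10586UL) return VT_UNAVAILABLE;
  if (build == 10586UL) return VT_ON_BY_DEFAULT;
  /* assumption: every build after 10586 behaves like 14393 */
  return VT_NEEDS_OPT_IN;
}
```

Real code should rather try SetConsoleMode and check its return value than rely on build numbers.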

About problems with UTF-8: there isn't any problem. No character representation contains the \x1B (octal \033) byte except for the ESC character used in the ANSI control codes.

Indeed, all the bytes whose most significant bit (MSB) is 0 (like \x1B == 00011011 in binary notation) are not part of any multibyte character representation: in UTF-8, all the bytes of a multibyte character have their MSB set to 1.
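
The MSB argument can be checked with a few lines of C. This is a sketch for UTF-8 only; the helper names are hypothetical, and it says nothing about double-byte encodings such as SJIS or BIG5.

```c
#include <stddef.h>

#define ESC_BYTE 0x1B /* ESC: octal 033, binary 00011011, MSB is 0 */

/* Returns 1 if the byte is part of a UTF-8 multibyte sequence
   (lead or continuation byte), i.e. if its most significant bit is set. */
int utf8_is_multibyte_byte(unsigned char b) {
  return (b & 0x80) != 0;
}

/* Counts ESC bytes in a buffer. Since ESC has its MSB cleared, every hit
   in valid UTF-8 text is a real ESC character and never a fragment of a
   multibyte character. */
size_t count_esc(const unsigned char *buf, size_t len) {
  size_t i, count = 0;
  for (i = 0; i < len; ++i) {
    if (buf[i] == ESC_BYTE) ++count;
  }
  return count;
}
```

For example, the two bytes of "é" in UTF-8 (0xC3 0xA9) both have the MSB set, so neither of them can be mistaken for ESC.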

About whether to enable this flag by default, I'd prefer a behavior as close as possible to what happens on Linux.
On a freshly installed Ubuntu, bash interprets the ANSI control codes.
For instance, if you run
php -r 'echo "\033[101;93m TEST \033[0m\n";'
- on a color-enabled terminal you'll see a colored " TEST "
- on a color-disabled terminal you'll see a non-colored " TEST " but without the control codes (i.e. you won't see "?[101;93m TEST ?[0m")

A final note: IMHO it would be great to be able to control this flag at runtime (with ini_set()?)
 [2016-08-11 16:07 UTC] ab@php.net
@mlocati, nice research so far! It's unlikely a solution with INI will be accepted to control the terminal support. Probably one more plus to go by streams.

The difference between bash and cmd in this case is that on older Windows the control sequences will be output as is; so far I've seen that on build 10240. If a shell is not aware that there's VT at all, it just won't strip the control sequences. On the other hand, output from old scripts might be unintentionally misinterpreted on the new systems. Win7 is going to be in use for a good 5 years yet, Win8 even longer. Even if VT isn't supported there, at least no breakage should happen. To make it the same as on Linux, PHP might have to take care of stripping the controls; there's probably no way around that.

It's already good that no UTF-8 issues are expected. We need to ensure others like SJIS, EUC-JP, GB2312, BIG5, etc. are fine. I've seen that some conflicting ranges are present in some older IBM codepages, but that's most likely not something one needs to care about. Besides the encoding issues, there might be font issues like mentioned before.

Other questions about what happens when redirecting output, or piping to another program, or reading from a pipe, need to be clarified and tested as well. Again, one point could be to say: just ignore the old stuff, or it needs to be handled in PHP. If VT is not enabled explicitly, likely some automatism will be needed to handle BC and questionable cases.

Having control over this feature in the script space might be handier for writing code compatible with older systems. For example, I/O can be wrapped with userspace code to disable or enable the control sequences depending on the term/system version. That could significantly simplify the C implementation. Otherwise, the C implementation should be smart enough to handle what is needed. But in any case, we'd need tests for all the points (codepage, I/O, etc.) that can be run on various system configurations. If someone goes for an implementation, I'd be interested in this as well and could provide reviews and testing. But not right now, as the current focus is on stabilizing the 7.1 features.

Thanks.
 [2016-08-18 15:10 UTC] mlocati at gmail dot com
@ab
I fully understand that it's really bad when PHP scripts output control codes on older systems (or when the ENABLE_VIRTUAL_TERMINAL_PROCESSING flag is not set for the current process).
That's why I suggested using ini_get/ini_set.

For instance, if the key used to control if the system has color support is 'win_vt100', a script could do something like this:

<?php
ini_set('win_vt100', true); // returns the old value on success, false on failure
if (ini_get('win_vt100')) {
    // print with ANSI control codes
} else {
    // print without ANSI control codes
}
?>

About potential problems with the multibyte encodings: I think it's cmd.exe that takes care of it. BTW, to solve any problem and to be backward compatible, vt100 could be disabled by default, and enabled only by the scripts that want it.
By using the ini_get/ini_set approach, people who want ANSI codes enabled by default could set its value to true in their php.ini file.


PS: I created a gist with sample C code that implements the following functions:
- check if the current console may have colors
- determine if the current console has colors
- enable/disable color support for the current console
You can find it here: https://gist.github.com/mlocati/21a9233ac83f7d3d7837535bc109b3b7
 [2016-08-19 08:27 UTC] mlocati at gmail dot com
Strictly related to this issue: https://bugs.php.net/bug.php?id=72896
 [2016-08-21 22:41 UTC] ab@php.net
@mlocati, amazing work, thanks! Your code can already be used as base framework to support the PHP integration.

The issue with a new ini is that there are already quite a few. That's the reason it's preferable not to introduce a new one, especially if it's solvable another way. Regarding streams, my thought was something like the piece of code below:

$fd = fopen("php://stdout", "w");
stream_context_set_option($fd, "stdio", "vt100_enabled", true);
fwrite($fd, "\033[101;93m Yellow text on red background \033[0m\n");
fclose($fd);

Same with php://stdin, etc. Any such stream is an isolated version of the original system descriptor, so it won't affect all the output. Of course, in fact it might have to be a singleton, as it's only one I/O device in the end. But, as a stream is buffered, the decision whether to output/read the control sequences can be deferred. It would of course be cool to be able to do the same with the std I/O constants, e.g. like

stream_context_set_option(STDOUT, "stdio", "vt100_enabled", true);

but I'm not sure it can work exactly that way; I need to double check. Still, it could be an additional userland stream function to deliver the terminal info or switch to the required mode. Maybe a function would be even more universal in that case, like stream_vt100_supported(STDOUT), etc.

After thinking a bit more, it looks like stripping the control sequences might be relatively easy. I should research more yet, but the sequences I've seen follow a particular pattern. Especially in the case of direct I/O, one could just skip over them. Even with a buffered stream, that might not be a big overhead. So we probably shouldn't completely abandon that option. Another nice thing in streams could be a stream filter, which could be used for this task.
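
As an illustration of the "particular pattern", here is a minimal sketch that strips CSI sequences (ESC, then "[", then parameter bytes 0x30-0x3F, intermediate bytes 0x20-0x2F and one final byte 0x40-0x7E) from a NUL-terminated string in place. The function name strip_csi is hypothetical, this is not code from any patch, and it assumes ASCII-compatible input such as UTF-8; it does not address the double-byte encoding concerns, nor other escape sequence types.

```c
#include <stddef.h>

/* Hypothetical helper: removes CSI sequences from s in place and
   returns the new length. Malformed or truncated sequences are left
   untouched. */
size_t strip_csi(char *s) {
  size_t r = 0; /* read position */
  size_t w = 0; /* write position */

  while (s[r] != '\0') {
    if (s[r] == '\033' && s[r + 1] == '[') {
      size_t p = r + 2;
      while (s[p] >= 0x30 && s[p] <= 0x3F) ++p; /* parameter bytes */
      while (s[p] >= 0x20 && s[p] <= 0x2F) ++p; /* intermediate bytes */
      if (s[p] >= 0x40 && s[p] <= 0x7E) {       /* final byte */
        r = p + 1; /* skip the whole sequence */
        continue;
      }
      /* not a complete CSI sequence: copy it through unchanged */
    }
    s[w++] = s[r++];
  }
  s[w] = '\0';
  return w;
}
```

Applied to "\033[101;93m TEST \033[0m\n", this leaves " TEST \n". The same loop could in principle sit inside a stream filter, but that integration is not shown here.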

The point with the mb encodings: I'd not write it off yet. We'll have to test and possibly make some adjustments. E.g. for xterm, there are several dedicated versions for Asian encodings; the question is just why.

One point with your latest test code: it would be worth checking whether it'd fail for a non-standard terminal. There are nice alternative terminals like ConEmu, which already provide the functionality even on lower Windows versions. I'm not sure yet how to solve this without checking any possible APIs. This would probably prevent the automation in some cases, but is probably an overhead ATM and can be checked at some later stage. It's ATM only about the standard cmd.exe, anyway.

It is already a very good start. If you have the mood and time, you could already begin the PHP patch. Maybe you have a better idea how to integrate it with PHP, please let me know. Writing the tests for the prospective usage, then matching with internals to correct/improve the actual implementation might be the next step.

Thanks.
 [2016-08-22 07:02 UTC] mlocati at gmail dot com
@ab I don't fully understand the point about stripping the control sequences...

Given that

1. PHP developers have a way to know if STDOUT/STDERR support control sequences (#72768)

2. PHP developers have a way to know if STDOUT/STDERR are not redirected to a file (#72896)

they may decide to send the control sequences or not.
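
A minimal C sketch of that decision, combining the two pieces of information. All names are hypothetical; fd_is_terminal stands in for the redirection check discussed in #72896, and the vt_supported value is assumed to come from whatever check this request ends up providing.

```c
#include <stddef.h>
#include <stdio.h>
#ifdef _WIN32
#include <io.h>
#define FD_ISATTY(fd) _isatty(fd)
#else
#include <unistd.h>
#define FD_ISATTY(fd) isatty(fd)
#endif

/* 1 if the descriptor refers to a terminal, 0 if it is redirected. */
int fd_is_terminal(int fd) {
  return FD_ISATTY(fd) != 0;
}

/* Send control sequences only when both conditions hold. */
int should_emit_ansi(int is_terminal, int vt_supported) {
  return is_terminal && vt_supported;
}

/* Formats one line into dst, with or without the color codes.
   Returns what snprintf returns. */
int format_line(char *dst, size_t n, const char *text, int use_ansi) {
  return use_ansi ? snprintf(dst, n, "\033[101;93m%s\033[0m\n", text)
                  : snprintf(dst, n, "%s\n", text);
}
```

A caller would pass should_emit_ansi(fd_is_terminal(1), vt_supported) as use_ansi for stdout, and remains free to force the control sequences by passing 1 directly.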

On the other hand, even if the PHP developers know that the output does not support control sequences, they may need to output them. I don't see a reason why they would want to do that, but I think developers should be as free as they want.

Furthermore, stripping out chars could be risky and requires a deep study.
 [2016-08-22 09:34 UTC] ab@php.net
@mlocati, I only mentioned the control sequence stripping as a result of further deliberation. Bash does it, as you mentioned. ASCII (compatible) would probably be easy to do, but I'm not sure about double byte and other mb encodings. So I mentioned it just to keep in mind that this option is possible. Otherwise, yeah, what we have discussed so far as an initial plan is not to strip control sequences automatically.

Thanks.
 [2016-08-25 13:09 UTC] mlocati at gmail dot com
I tried to add this new function, and since this is my first attempt to contribute to PHP I'm surely doing something wrong: the compilation fails with strange messages (redefinitions of #define, structs, functions).
 [2016-08-26 14:14 UTC] mlocati at gmail dot com
I finally managed to compile php (basically an include of php.h was missing) - see attached patch 0001-Start-adding-VT100-support-for-Windows-v2.

Just one thing remains to be done: how to get the standard Windows handle (eg STD_INPUT_HANDLE/STD_OUTPUT_HANDLE/STD_ERROR_HANDLE) starting from a php_stream?
I thought it was possible by inspecting stream->orig_path, but it's not the case.

Any hint?
 [2016-08-26 16:38 UTC] mlocati at gmail dot com
I changed the stream_vt100_support function to accept strings ('php://stdout', 'php://stderr') instead of stream objects.

This is a bit of a workaround, but I really don't know how to determine the standard stream (stdin/stdout/stderr) from stream objects.

See patch 0001-Start-adding-VT100-support-for-Windows-v3
 [2016-08-28 18:12 UTC] mlocati at gmail dot com
What about continuing this discussion on a new pull request at https://github.com/php/php-src ?
 [2016-08-29 14:55 UTC] ab@php.net
@mlocati, thanks for all the work so far.

I had a quick look over your latest patch, a couple of comments already.

Please check the coding style doc for function naming conventions, etc.
http://git.php.net/?p=php-src.git;a=blob;f=CODING_STANDARDS;h=5cf70c92b5f5ab06977629ba6fff87255cc80116;hb=HEAD . Particularly, in most cases the internal APIs should be prefixed with php_*, and Windows specific ones with php_win32_*. Also the underscore is used for separation, etc. Please see other sources there.

Also the following regarding the code:

- the Unicode APIs have to be used where it matters in 7.1+, e.g. GetFinalPathNameByHandleW. Please check the corresponding helper routines in win32/ioutil.h.
- usually we don't use the driver routines, Rtl*, etc. Regarding getting the version, there's quite some functionality already in the core, please check EG(windows_version_info). It should suffice as till now the only case is the usage after MINIT is bypassed.
- please don't use static vars in functions unless it's thread safe
- for the streams, particularly main/streams/plain_wrapper.c would be relevant for STDIO. Taking some stream and stepping through it in the debugger might help for better understanding. Basically, a stream resource needs to be passed, as the fd might be duped but still point to a vt100 term.
- tests are required :)

Indeed, it might be handier to discuss the patch in a github PR, which you already can attach to this ticket.

Thanks.
 [2016-08-30 08:24 UTC] mlocati at gmail dot com
Here's the PR on GitHub: https://github.com/php/php-src/pull/2103
 [2016-10-28 18:53 UTC] ab@php.net
-Status: Analyzed +Status: Closed -Assigned To: +Assigned To: ab
 [2016-10-28 18:53 UTC] ab@php.net
The PR is merged into master.

Thanks.
 
PHP Copyright © 2001-2017 The PHP Group
All rights reserved.
Last updated: Sun Nov 19 01:31:42 2017 UTC