Bug #77993 Wrong parse error for invalid hex literal on Windows
Submitted: 2019-05-08 18:45 UTC
Modified: 2019-05-08 22:35 UTC
From: theodorejb at outlook dot com
Assigned:
Status: Closed
Package: Scripting Engine problem
PHP Version: 7.3.5
OS: Windows 10
Private report: No
CVE-ID: None
 [2019-05-08 18:45 UTC] theodorejb at outlook dot com
Description:
------------
If a hex literal has an underscore (or other invalid character) between the "0x" and the digits, the error on Windows doesn't match the expected error. The error is correct on Linux, but not on the Windows version of PHP.

Strangely, Windows produces the expected error (matching Linux) for binary literals, but not hex literals.

Test script:
---------------
<?php
0x_10;

Expected result:
----------------
Parse error: syntax error, unexpected 'x_10' (T_STRING) in %s on line %d

Actual result:
--------------
Parse error: Invalid numeric literal in %s on line %d

History

 [2019-05-08 22:35 UTC] cmb@php.net
-Status: Open
+Status: Analyzed
-Package: *Compile Issues
+Package: Scripting Engine problem
 [2019-05-08 22:35 UTC] cmb@php.net
The leading "0" is generally recognized as LNUM[1].  On 64-bit
Windows, _strtoi64()[2] is used to parse the LNUM, while on 64-bit
POSIX systems strtoll()[3] is used.  strtoll() accepts the "0" and
sets endptr to the "x", while _strtoi64() doesn't accept the "0"
and leaves endptr at the "0", i.e. it consumes nothing.  Assuming
that the "0x" case is the only potential issue, we could likely
work around it with something like:


 Zend/zend_language_scanner.l | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Zend/zend_language_scanner.l b/Zend/zend_language_scanner.l
index 837df416e2..6cfc5804b0 100644
--- a/Zend/zend_language_scanner.l
+++ b/Zend/zend_language_scanner.l
@@ -1649,6 +1649,9 @@ NEWLINE ("\r"|"\n"|"\r\n")
 	if (yyleng < MAX_LENGTH_OF_LONG - 1) { /* Won't overflow */
 		errno = 0;
 		ZVAL_LONG(zendlval, ZEND_STRTOL(yytext, &end, 0));
+		if (end == yytext) {
+			end++;
+		}
 		/* This isn't an assert, we need to ensure 019 isn't valid octal
 		 * Because the lexing itself doesn't do that for us
 		 */
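
The difference in endptr handling described above can be sketched outside the scanner. This is only an illustration: parse_posix() wraps the real strtoll(), while parse_simulated_win() is a hypothetical stand-in that imitates the behavior reported here for _strtoi64() (it is not the actual CRT function), and consumed() is a hypothetical helper that optionally applies the same "end == start" fallback as the patch above.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

typedef long long (*parse_fn)(const char *, char **, int);

/* Thin wrapper around the real strtoll(). */
long long parse_posix(const char *s, char **end, int base)
{
    return strtoll(s, end, base);
}

/* ASSUMPTION: hypothetical stand-in that mimics the behavior reported
 * in this bug for _strtoi64(): for "0x" followed by a non-hex-digit,
 * nothing is consumed and *end is left at the start of the input. */
long long parse_simulated_win(const char *s, char **end, int base)
{
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X')
            && !isxdigit((unsigned char)s[2])) {
        *end = (char *)s;
        return 0;
    }
    return strtoll(s, end, base);
}

/* Number of characters treated as consumed. With apply_workaround set,
 * an empty parse is advanced by one character, as in the patch. */
size_t consumed(parse_fn parse, const char *s, int apply_workaround)
{
    char *end;
    (void)parse(s, &end, 0);
    if (apply_workaround && end == s) {
        end++;
    }
    return (size_t)(end - s);
}
```

With the fallback enabled, both parsers report the same number of consumed characters for "0x_10", which is what lets the scanner's later checks behave identically on both platforms.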


However, a cleaner solution would probably be to actually
zero-terminate the recognized LNUM before parsing it, e.g.


 Zend/zend_language_scanner.l | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Zend/zend_language_scanner.l b/Zend/zend_language_scanner.l
index 837df416e2..bc72cb7545 100644
--- a/Zend/zend_language_scanner.l
+++ b/Zend/zend_language_scanner.l
@@ -1647,8 +1647,11 @@ NEWLINE ("\r"|"\n"|"\r\n")
 <ST_IN_SCRIPTING>{LNUM} {
 	char *end;
 	if (yyleng < MAX_LENGTH_OF_LONG - 1) { /* Won't overflow */
+		char save = yytext[yyleng];
+		yytext[yyleng] = '\0';
 		errno = 0;
 		ZVAL_LONG(zendlval, ZEND_STRTOL(yytext, &end, 0));
+		yytext[yyleng] = save;
 		/* This isn't an assert, we need to ensure 019 isn't valid octal
 		 * Because the lexing itself doesn't do that for us
 		 */
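
As a standalone sketch of the zero-termination idea: the hypothetical helper parse_prefix() below temporarily terminates a writable buffer after the first len bytes, parses only that prefix with strtoll(), and then restores the saved byte. The names are made up for illustration, and the sketch assumes buf[len] is a valid, writable byte (as yytext[yyleng] is assumed to be in the patch).

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse only the first len bytes of buf as a base-0 integer.
 * ASSUMPTION: buf is writable and buf[len] is a valid byte. */
long long parse_prefix(char *buf, size_t len, size_t *consumed)
{
    char *end;
    char save = buf[len];   /* remember the byte after the token */
    long long value;

    buf[len] = '\0';        /* the parser now sees only the token */
    errno = 0;
    value = strtoll(buf, &end, 0);
    buf[len] = save;        /* restore the original input */

    *consumed = (size_t)(end - buf);
    return value;
}
```

Because the parser never sees the characters after the token, its result no longer depends on how a given C library treats a dangling "0x" prefix.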


[1] <https://github.com/php/php-src/blob/php-7.3.5/Zend/zend_language_scanner.l#L1246>
[2] <https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/strtoi64-wcstoi64-strtoi64-l-wcstoi64-l?view=vs-2019>
[3] <http://pubs.opengroup.org/onlinepubs/9699919799/>
 [2019-05-09 19:57 UTC]
The following pull request has been associated:

Patch Name: Fix #77993: Wrong parse error for invalid hex literal on Windows
On GitHub:  https://github.com/php/php-src/pull/4138
Patch:      https://github.com/php/php-src/pull/4138.patch
 [2019-05-13 09:07 UTC] nikic@php.net
Automatic comment on behalf of theodorejb@outlook.com
Revision: http://git.php.net/?p=php-src.git;a=commit;h=b6b15fc65cc7898bc1ea992ec607ddb7f94e8eb8
Log: Fix #77993: Wrong parse error for invalid hex literal on Windows
 [2019-05-13 09:07 UTC] nikic@php.net
-Status: Analyzed
+Status: Closed
 