Bug #66496 conversion of UTF-8 strings containing language specific chars is wrong
Submitted: 2014-01-16 14:43 UTC Modified: 2014-04-29 11:51 UTC
From: care at novadys dot de Assigned:
Status: Duplicate Package: COM related
PHP Version: 5.5.8 OS: Windows
Private report: No CVE-ID: None
 [2014-01-16 14:43 UTC] care at novadys dot de
Passing a UTF-8 string as input to a COM function produces a wrong string on the COM side. If the input string contains n characters that are encoded with 2 bytes each, the resulting string is n characters too long.

Example: "I want to Dusseldorf and Koln" is handled correctly, but
"I want to Düsseldorf and Köln" results in the COM function being called with the string:
"I want to Düsseldorf and Köln\0\4"



PHP_COM_DOTNET_API void php_com_variant_from_zval(VARIANT *v, zval *z, int codepage TSRMLS_DC)

V_VT(v) = VT_BSTR;
olestring = php_com_string_to_olestring(Z_STRVAL_P(z), Z_STRLEN_P(z), codepage TSRMLS_CC);
here is the problem:

V_BSTR(v) = SysAllocStringByteLen((char*)olestring, Z_STRLEN_P(z) * sizeof(OLECHAR));

When the input string is UTF-8 encoded, Z_STRLEN_P(z) counts 2 bytes for each "special" char, so the length of the input string is the count of all chars plus the count of "special" chars. After conversion to olestring, Z_STRLEN_P(z) is the wrong length because the olestring is shorter. So SysAllocStringByteLen allocates more memory than the length of olestring requires and copies that many bytes, reading past the end of olestring. As a result, those strings always show a '\0' for the first special char plus n-1 random chars for the remaining special chars.
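To make the length mismatch concrete, here is a minimal, portable sketch that compares the UTF-8 byte count of the example string with the number of UTF-16 code units it converts to. The helper name utf16_units_for_utf8 is hypothetical and is not part of the PHP source; it assumes well-formed UTF-8 input.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only (not from the PHP source).
 * Returns the number of UTF-16 code units needed to represent a
 * well-formed UTF-8 string of `len` bytes. */
static size_t utf16_units_for_utf8(const char *s, size_t len)
{
    size_t units = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];
        if ((c & 0xC0) == 0x80) {
            continue; /* continuation byte: already counted with its lead byte */
        }
        /* Lead bytes >= 0xF0 start a 4-byte sequence, which needs a
         * surrogate pair (2 code units); everything else needs 1. */
        units += (c >= 0xF0) ? 2 : 1;
    }
    return units;
}
```

For "I want to Düsseldorf and Köln" encoded as UTF-8, strlen() reports 31 bytes while the helper reports 29 code units. The difference of 2 matches the two extra characters seen at the end of the string in the report above.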


 [2014-01-16 20:28 UTC]
-Status: Open +Status: Feedback
 [2014-01-16 20:28 UTC]
Would it be possible for you to post a repro snippet? Thanks.
 [2014-01-17 14:56 UTC] care at novadys dot de
-Status: Feedback +Status: Open
 [2014-01-17 14:56 UTC] care at novadys dot de
It works correctly when replacing the line

V_BSTR(v) = SysAllocStringByteLen((char*)olestring, Z_STRLEN_P(z) * sizeof(OLECHAR));

with

V_BSTR(v) = SysAllocStringByteLen((char*)olestring, wcslen(olestring) * sizeof(OLECHAR));
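The difference between the two lines can be sketched outside of Windows with a stand-in type. In this sketch, olechar_t is an assumed 16-bit substitute for OLECHAR, olestr_len plays the role of wcslen, and the two byte_len_* helpers are hypothetical names that mirror the two length expressions; none of them come from the PHP source.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: a 16-bit stand-in for OLECHAR so the sketch builds anywhere. */
typedef uint16_t olechar_t;

/* Stand-in for wcslen(): counts code units before the terminating 0. */
static size_t olestr_len(const olechar_t *s)
{
    size_t n = 0;
    while (s[n] != 0) {
        n++;
    }
    return n;
}

/* Mirrors the original expression: byte length derived from the
 * UTF-8 input length, i.e. Z_STRLEN_P(z) * sizeof(OLECHAR). */
static size_t byte_len_from_input(size_t utf8_len)
{
    return utf8_len * sizeof(olechar_t);
}

/* Mirrors the proposed expression: byte length derived from the
 * converted string, i.e. wcslen(olestring) * sizeof(OLECHAR). */
static size_t byte_len_from_converted(const olechar_t *converted)
{
    return olestr_len(converted) * sizeof(olechar_t);
}
```

For "Köln" (5 bytes in UTF-8, 4 code units after conversion), byte_len_from_input(5) gives 10 while byte_len_from_converted gives 8, which is the actual size of the converted data. One caveat of a wcslen-based length is that it stops at the first NUL, so a string with embedded NUL characters would be cut short.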
 [2014-01-17 19:52 UTC]
-Status: Open +Status: Feedback
 [2014-01-17 19:52 UTC]
We need a PHP code snippet that reproduces the problem so that we can fix it and write a test.

 [2014-04-29 11:51 UTC]
-Status: Feedback +Status: Duplicate
 [2014-04-29 11:51 UTC]
This is fixed now as it's the same as bug #66431, please check.
