Bug #66496 conversion of UTF-8 strings containing language specific chars is wrong
Submitted: 2014-01-16 14:43 UTC Modified: 2014-04-29 11:51 UTC
From: care at novadys dot de Assigned:
Status: Duplicate Package: COM related
PHP Version: 5.5.8 OS: Windows
Private report: No CVE-ID: None
 [2014-01-16 14:43 UTC] care at novadys dot de
Description:
------------
Passing a UTF-8 string as an argument to a COM function produces a wrong string on the COM side. If the input string contains n characters that are encoded with 2 bytes each, the resulting string is n characters too long.

Example: "I want to Dusseldorf and Koln" is handled correctly.
"I want to Düsseldorf and Köln" calls the COM function with the string:
"I want to Düsseldorf and Köln\0\4"

reason:

ext/com_dotnet/com_variant.c

...
PHP_COM_DOTNET_API void php_com_variant_from_zval(VARIANT *v, zval *z, int codepage TSRMLS_DC)

...

case IS_STRING:
    V_VT(v) = VT_BSTR;
    olestring = php_com_string_to_olestring(Z_STRVAL_P(z), Z_STRLEN_P(z), codepage TSRMLS_CC);

here is the problem:

V_BSTR(v) = SysAllocStringByteLen((char*)olestring, Z_STRLEN_P(z) * sizeof(OLECHAR));

When the input string is UTF-8 encoded, Z_STRLEN_P(z) counts 2 bytes for each "special" character, so the input length is the number of characters plus the number of "special" characters. After conversion to olestring, Z_STRLEN_P(z) is no longer the right length, because the olestring contains fewer characters than the input has bytes. SysAllocStringByteLen therefore allocates too much memory and records a string length that is longer than the actual olestring. As a result, such strings always end with a '\0' for the first special character plus n-1 random characters for the remaining special characters.
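
To illustrate the mismatch, here is a minimal sketch in portable C. It assumes well-formed UTF-8 input and performs no validation. The function names utf16_units_for_utf8 and surplus_units are hypothetical and are not part of PHP or the COM API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the UTF-16 code units needed to represent a UTF-8 byte sequence.
 * Assumption: the input is well-formed UTF-8 (nothing is validated here). */
size_t utf16_units_for_utf8(const char *s, size_t byte_len)
{
    size_t units = 0;
    size_t i;

    assert(s != NULL);
    for (i = 0; i < byte_len; i++) {
        unsigned char c = (unsigned char)s[i];

        if ((c & 0xC0) == 0x80) {
            continue; /* continuation byte, belongs to an earlier lead byte */
        }
        /* A 4-byte sequence becomes a surrogate pair (2 units), all others 1. */
        units += (c >= 0xF0) ? 2 : 1;
    }
    return units;
}

/* Number of surplus 16-bit slots that result when the UTF-8 byte count
 * is used as if it were the converted string's character count. */
size_t surplus_units(const char *s)
{
    size_t byte_len = strlen(s);

    return byte_len - utf16_units_for_utf8(s, byte_len);
}
```

For the example string from this report (31 bytes in UTF-8, 29 characters), surplus_units yields 2, which matches the two trailing characters shown above. For the pure ASCII variant it yields 0.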





 [2014-01-16 20:28 UTC] ab@php.net
-Status: Open +Status: Feedback
 [2014-01-16 20:28 UTC] ab@php.net
Would it be possible for you to post a repro snippet? Thanks.
 [2014-01-17 14:56 UTC] care at novadys dot de
-Status: Feedback +Status: Open
 [2014-01-17 14:56 UTC] care at novadys dot de
It works correctly when replacing the line

V_BSTR(v) = SysAllocStringByteLen((char*)olestring, Z_STRLEN_P(z) * sizeof(OLECHAR));

with

V_BSTR(v) = SysAllocStringByteLen((char*)olestring, wcslen(olestring) * sizeof(OLECHAR));
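
The idea behind this change is to derive the byte count from the converted string and not from the UTF-8 input. Here is a portable sketch of that idea, assuming well-formed UTF-8. The type olechar_t is a hypothetical 16-bit stand-in for OLECHAR, and all function names are illustrative only. None of this is PHP's actual conversion code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint16_t olechar_t; /* hypothetical stand-in for the 16-bit OLECHAR */

/* Decode well-formed UTF-8 into NUL-terminated UTF-16.
 * out must have room for byte_len + 1 units. Returns units written,
 * not counting the terminator. */
size_t utf8_to_utf16(const char *in, size_t byte_len, olechar_t *out)
{
    size_t n = 0, i = 0;

    assert(in != NULL && out != NULL);
    while (i < byte_len) {
        unsigned char c = (unsigned char)in[i++];
        uint32_t cp;
        size_t extra;

        if (c < 0x80)      { cp = c;        extra = 0; }
        else if (c < 0xE0) { cp = c & 0x1F; extra = 1; }
        else if (c < 0xF0) { cp = c & 0x0F; extra = 2; }
        else               { cp = c & 0x07; extra = 3; }

        /* Fold in the payload bits of each continuation byte. */
        while (extra-- > 0 && i < byte_len) {
            cp = (cp << 6) | ((unsigned char)in[i++] & 0x3F);
        }
        if (cp >= 0x10000) { /* outside the BMP: emit a surrogate pair */
            cp -= 0x10000;
            out[n++] = (olechar_t)(0xD800 | (cp >> 10));
            out[n++] = (olechar_t)(0xDC00 | (cp & 0x3FF));
        } else {
            out[n++] = (olechar_t)cp;
        }
    }
    out[n] = 0;
    return n;
}

/* Analogue of wcslen for the stand-in type: units before the first NUL. */
size_t olestr_len(const olechar_t *s)
{
    size_t n = 0;

    while (s[n] != 0) {
        n++;
    }
    return n;
}

/* Byte length to allocate, taken from the converted string.
 * Returns 0 if the temporary buffer cannot be allocated. */
size_t bstr_byte_len_for(const char *utf8, size_t byte_len)
{
    olechar_t *buf = malloc((byte_len + 1) * sizeof *buf);
    size_t bytes;

    if (buf == NULL) {
        return 0;
    }
    utf8_to_utf16(utf8, byte_len, buf);
    bytes = olestr_len(buf) * sizeof(olechar_t);
    free(buf);
    return bytes;
}
```

For the 31-byte example string this gives 58 bytes (29 units of 2 bytes), where the original expression would request 62. One caveat of any wcslen-style measurement is that it stops at the first NUL, so a string with an embedded NUL character would be cut off there.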
 [2014-01-17 19:52 UTC] ab@php.net
-Status: Open +Status: Feedback
 [2014-01-17 19:52 UTC] ab@php.net
We need a PHP code snippet that reproduces this, so that we can fix it and write a test.

Thanks.
 [2014-04-29 11:51 UTC] ab@php.net
-Status: Feedback +Status: Duplicate
 [2014-04-29 11:51 UTC] ab@php.net
This is fixed now, as it is the same as bug #66431. Please check.

Thanks.
 