PHP :: Bug #66496 :: conversion of UTF-8 strings containing language specific chars is wrong

Bug #66496	conversion of UTF-8 strings containing language specific chars is wrong
Submitted:	2014-01-16 14:43 UTC	Modified:	2014-04-29 11:51 UTC
From:	care at novadys dot de	Assigned:
Status:	Duplicate	Package:	COM related
PHP Version:	5.5.8	OS:	Windows
Private report:	No	CVE-ID:	None

[2014-01-16 14:43 UTC] care at novadys dot de

Description:
------------
Passing a UTF-8 string to a COM function produces a wrong string on the COM side. If the input string contains n characters that are each encoded with 2 bytes, the resulting string is n characters too long.

Example: "I want to Dusseldorf and Koln" is handled correctly, but
"I want to Düsseldorf and Köln" reaches the COM function as the string:
"I want to Düsseldorf and Köln\0\4"

reason:

ext/com_dotnet/com_variant.c

...
PHP_COM_DOTNET_API void php_com_variant_from_zval(VARIANT *v, zval *z, int codepage TSRMLS_DC)

...

case IS_STRING:
	V_VT(v) = VT_BSTR;
	olestring = php_com_string_to_olestring(Z_STRVAL_P(z), Z_STRLEN_P(z), codepage TSRMLS_CC);
                        
here is the problem:

V_BSTR(v) = SysAllocStringByteLen((char*)olestring, Z_STRLEN_P(z) * sizeof(OLECHAR));

When the input string is UTF-8 encoded, Z_STRLEN_P(z) counts 2 bytes for each "special" char, so the length of the input string is the count of all chars plus the count of "special" chars. After conversion to olestring, Z_STRLEN_P(z) is no longer the right length, because the olestring contains fewer characters. SysAllocStringByteLen therefore allocates too much memory and copies past the end of olestring. As a result, the affected strings always end with a '\0' for the first special char, followed by n-1 random chars for the remaining special chars.
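
The mismatch described above can be reproduced without COM. The following is a minimal, portable sketch and not the PHP source: utf16_units and surplus_units are hypothetical helpers introduced only for illustration. It assumes well-formed UTF-8 input and compares the byte count (what Z_STRLEN_P(z) reports) with the number of 16-bit units the converted string needs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of PHP): number of UTF-16 code units
 * needed for a well-formed UTF-8 string of `len` bytes.
 * Input validation is deliberately omitted in this sketch. */
static size_t utf16_units(const char *s, size_t len)
{
    size_t units = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];
        if ((c & 0xC0) == 0x80) {
            continue; /* continuation byte: already counted via its lead byte */
        }
        /* 4-byte sequences become a surrogate pair, everything else one unit */
        units += (c >= 0xF0) ? 2 : 1;
    }
    return units;
}

/* Hypothetical helper: how many surplus 16-bit slots a buffer contains
 * when it is sized from the UTF-8 byte count instead of the unit count. */
static size_t surplus_units(const char *s)
{
    size_t bytes = strlen(s);
    return bytes - utf16_units(s, bytes);
}
```

For the string from the report, written as "I want to D\xc3\xbcsseldorf and K\xc3\xb6ln", strlen gives 31 bytes while utf16_units gives 29, so surplus_units returns 2. That matches the two extra characters ("\0\4") seen on the COM side.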

[2014-01-16 20:28 UTC] ab@php.net

-Status: Open +Status: Feedback

[2014-01-16 20:28 UTC] ab@php.net

Would it be possible for you to post a repro snippet? Thanks.

[2014-01-17 14:56 UTC] care at novadys dot de

-Status: Feedback +Status: Open

[2014-01-17 14:56 UTC] care at novadys dot de

It works correctly when replacing the line

V_BSTR(v) = SysAllocStringByteLen((char*)olestring, Z_STRLEN_P(z) * sizeof(OLECHAR));

with

V_BSTR(v) = SysAllocStringByteLen((char*)olestring, wcslen(olestring) * sizeof(OLECHAR));
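
The difference between the two length computations can be sketched in portable C. Everything here is an assumption made for illustration: olechar_t stands in for the 16-bit OLECHAR, ole_len stands in for wcslen, and to_olestring is a simplified converter that handles only 1-byte and 2-byte UTF-8 sequences. None of it is the actual php_com_string_to_olestring implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint16_t olechar_t; /* stand-in for the 16-bit OLECHAR on Windows */

/* Stand-in for wcslen(): count 16-bit units up to the first NUL. */
static size_t ole_len(const olechar_t *s)
{
    size_t n = 0;
    while (s[n] != 0) {
        n++;
    }
    return n;
}

/* Simplified converter: 1- and 2-byte UTF-8 sequences only (enough for
 * "ü" and "ö"). Returns a NUL-terminated buffer the caller must free,
 * or NULL if allocation fails. */
static olechar_t *to_olestring(const char *s, size_t len)
{
    olechar_t *out = calloc(len + 1, sizeof(olechar_t));
    size_t o = 0;
    if (out == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];
        if (c < 0x80) {
            out[o++] = c;
        } else if ((c & 0xE0) == 0xC0 && i + 1 < len) {
            unsigned char next = (unsigned char)s[i + 1];
            out[o++] = (olechar_t)(((c & 0x1F) << 6) | (next & 0x3F));
            i++; /* skip the continuation byte just consumed */
        } else {
            out[o++] = 0xFFFD; /* unsupported in this sketch: replace */
        }
    }
    out[o] = 0;
    return out;
}

/* Byte length as in the original line: derived from the input byte count. */
static size_t byte_len_from_input(size_t input_len)
{
    return input_len * sizeof(olechar_t);
}

/* Byte length as in the proposed replacement: derived from the converted string. */
static size_t byte_len_from_converted(const olechar_t *ole)
{
    return ole_len(ole) * sizeof(olechar_t);
}
```

For the 31-byte example string, byte_len_from_input yields 62 while byte_len_from_converted yields 58, the size of the 29 converted units. One caveat of any wcslen-style length: it stops at the first NUL, so an input with embedded NUL bytes would be cut short at that point.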

[2014-01-17 19:52 UTC] ab@php.net

-Status: Open +Status: Feedback

[2014-01-17 19:52 UTC] ab@php.net

We need a PHP code snippet in order to fix this and write a test.

Thanks.

[2014-04-29 11:51 UTC] ab@php.net

-Status: Feedback +Status: Duplicate

[2014-04-29 11:51 UTC] ab@php.net

This is now fixed, as it is the same issue as bug #66431. Please check.

Thanks.
