[2007-02-28 11:38 UTC] fjortiz at comunet dot es
Description:
------------
When a UTF-8 encoded multibyte string (Russian, for example) is converted to a COM string, the wrong number of bytes is allocated, so the COM string is created with junk bytes at the end.

For example, when I pass a 5-letter Russian word to my COM-based ADODB driver, the destination receives a 10-character (5*2) string: the first 5 characters are the correct Russian letters and the remaining 5 are junk.

This was also reported for 4.4.2 in bug #37899.

Actual result:
--------------
This is solved by patching two files. In the diffs below, the "<" lines are the patched code and the ">" lines are the code being replaced.

\ext\com_dotnet\com_variant.c, function php_com_variant_from_zval, line 156:

156,157c156
< V_BSTR(v) = SysAllocString((char*)olestring);
---
> V_BSTR(v) = SysAllocStringByteLen((char*)olestring, Z_STRLEN_P(z) * sizeof(OLECHAR));

\ext\com_dotnet\com_olechar.c, function php_com_string_to_olestring:

37d36
< uint unicode_strlen;
39,40c38,44
< unicode_strlen = MultiByteToWideChar(codepage, (codepage == CP_UTF8 ?
< 0 : MB_PRECOMPOSED | MB_ERR_INVALID_CHARS), string, -1, NULL, 0);
---
> if (string_len == -1) {
> /* determine required length for the buffer (includes NUL terminator) */
> string_len = MultiByteToWideChar(codepage, flags, string, -1, NULL, 0);
> } else {
> /* allow room for NUL terminator */
> string_len++;
> }
42,44c46,48
< if (unicode_strlen > 0) {
< olestring = (OLECHAR*)safe_emalloc(sizeof(OLECHAR), unicode_strlen, 0);
< ok = MultiByteToWideChar(codepage, flags, string, -1, olestring, unicode_strlen);
---
> if (strlen > 0) {
> olestring = (OLECHAR*)safe_emalloc(sizeof(OLECHAR), string_len, 0);
> ok = MultiByteToWideChar(codepage, flags, string, string_len, olestring, string_len);
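The arithmetic behind the junk characters can be reproduced without the Windows API. What follows is only a portable sketch, not the PHP source: decode_utf8, utf8_to_utf16 and trailing_units_if_sized_by_bytes are hypothetical names, uint16_t stands in for OLECHAR, and the input is assumed to be well-formed UTF-8. It follows the same "count first, then convert" pattern as the two MultiByteToWideChar calls in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode one UTF-8 sequence at s into *cp and return
 * the number of bytes consumed. Input is assumed to be well-formed. */
static size_t decode_utf8(const unsigned char *s, uint32_t *cp)
{
    if (s[0] < 0x80) {
        *cp = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0) {
        *cp = ((uint32_t)(s[0] & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
        return 2;
    }
    if ((s[0] & 0xF0) == 0xE0) {
        *cp = ((uint32_t)(s[0] & 0x0F) << 12)
            | ((uint32_t)(s[1] & 0x3F) << 6)
            | (uint32_t)(s[2] & 0x3F);
        return 3;
    }
    *cp = ((uint32_t)(s[0] & 0x07) << 18)
        | ((uint32_t)(s[1] & 0x3F) << 12)
        | ((uint32_t)(s[2] & 0x3F) << 6)
        | (uint32_t)(s[3] & 0x3F);
    return 4;
}

/* Convert NUL-terminated UTF-8 to UTF-16. With out == NULL nothing is
 * written and only the length is computed. Returns the number of 16-bit
 * units excluding the terminator; a non-NULL out must have room for that
 * many units plus one. */
size_t utf8_to_utf16(const char *in, uint16_t *out)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t units = 0;

    assert(in != NULL);
    while (*s != 0) {
        uint32_t cp;
        s += decode_utf8(s, &cp);
        if (cp >= 0x10000) {
            /* outside the BMP: encode as a surrogate pair */
            cp -= 0x10000;
            if (out != NULL) {
                out[units] = (uint16_t)(0xD800 + (cp >> 10));
                out[units + 1] = (uint16_t)(0xDC00 + (cp & 0x3FF));
            }
            units += 2;
        } else {
            if (out != NULL) {
                out[units] = (uint16_t)cp;
            }
            units += 1;
        }
    }
    if (out != NULL) {
        out[units] = 0;
    }
    return units;
}

/* Units that hold no converted text when the string length is taken from
 * the UTF-8 byte count instead of the converted length. */
size_t trailing_units_if_sized_by_bytes(const char *in)
{
    return strlen(in) - utf8_to_utf16(in, NULL);
}
```

For an illustrative 5-letter Russian word such as "слово" (10 bytes in UTF-8), utf8_to_utf16 reports 5 units, while a length derived from the byte count gives 10 units, leaving 5 trailing units that were never filled with text. That matches the 5 correct plus 5 junk characters described above.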
Copyright © 2001-2025 The PHP Group. All rights reserved.
Last updated: Thu Oct 30 02:00:01 2025 UTC
I found this bug too. My code looks like this:

$word = new COM("Word.Application", null, CP_UTF8);
$word->Visible = true;
$doc = $word->Documents->Add();
$word->Selection->TypeText( "UTF8 text here, russian in my case" );

In Word, "Normal text " + garbage appears. Can this be fixed in CVS? I have to compile PHP on win32 myself ;)

I see the same problem with ADODB.Connection and its parameters when I use:

// init connection to DB
$dbc = new COM('ADODB.Connection',null,CP_UTF8);
$dbc->Open("PROVIDER=MSDASQL;Driver={SQL Server};Server=192.168.210.1;Database=test;UID=test;PWD=test");

// create command
$oCmd = new COM('ADODB.Command',null,CP_UTF8);
$oCmd->ActiveConnection = $dbc;

// Table test1 has one column c1 of type nvarchar(200)
$oCmd->CommandText = "INSERT INTO test1(c1) VALUES (?)";
$oCmd->CommandType = 1;

// Some UTF-8 string (length in characters is 15, 24 in bytes)
$val='ABCříšúěďáéóXYZ';
$len=strlen($val);
$p=$oCmd->CreateParameter('name',202,1,$len,$val);
$oCmd->Parameters->Append($p);
$oCmd->Execute();

// ADODB sends nvarchar(24) to the DB, which is the length of the string
// in bytes, not characters, but the data has 15 characters
// in UCS-2. So the database holds the correct string, followed by
// some garbage after the end of the string.
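The 15 versus 24 mismatch in the last comment is easy to check by hand. The sketch below is an illustration only, with hypothetical function names (utf8_char_count, size_overstatement). It counts characters by skipping UTF-8 continuation bytes and compares the result with the byte length returned by strlen. It assumes well-formed input, and the character count equals the UCS-2 length only for text inside the Basic Multilingual Plane, which holds for the string above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count characters (code points) in NUL-terminated UTF-8 by skipping
 * continuation bytes, which always have the bit pattern 10xxxxxx. */
size_t utf8_char_count(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            n++;
        }
    }
    return n;
}

/* How far a byte-based size overstates the character count. */
size_t size_overstatement(const char *s)
{
    return strlen(s) - utf8_char_count(s);
}
```

For 'ABCříšúěďáéóXYZ' this gives 15 characters against 24 bytes, an overstatement of 9. In the PHP snippet above, the analogous character count would presumably come from mb_strlen($val, 'UTF-8') (this needs the mbstring extension) rather than strlen($val). That only changes the size passed to CreateParameter and does not address the string conversion in the COM extension itself.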