Bug #37899 [PATCH] php_char_to_OLECHAR copies junk bytes
Submitted: 2006-06-23 05:09 UTC
Modified: 2020-02-03 08:55 UTC
Votes: 15
Avg. Score: 4.1 ± 1.0
Reproduced: 8 of 8 (100.0%)
Same Version: 3 (37.5%)
Same OS: 6 (75.0%)
From: okumurya at gmail dot com
Assigned: cmb
Status: Duplicate
Package: COM related
PHP Version: 5.2.3
OS: Windows 2000 Professional
Private report: No
CVE-ID: None
 [2006-06-23 05:09 UTC] okumurya at gmail dot com
Description:
------------
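For UTF-8 input containing multibyte characters, 'Z_STRLEN_P(pval_arg) * sizeof(OLECHAR)' is larger than the byte length of the converted wide-character string.
So SysAllocStringByteLen copies junk data from beyond the end of the converted text.

The following is a patch.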

--- conversion.c.orig	2006-06-23 11:28:06.496027200 +0900
+++ conversion.c	2006-06-23 14:01:42.838476800 +0900
@@ -247,8 +247,9 @@
 
 			case VT_BSTR:
 				convert_to_string_ex(&pval_arg);
+                  
 				unicode_str = php_char_to_OLECHAR(Z_STRVAL_P(pval_arg), Z_STRLEN_P(pval_arg), codepage TSRMLS_CC);
-				V_BSTR(var_arg) = SysAllocStringByteLen((char *) unicode_str, Z_STRLEN_P(pval_arg) * sizeof(OLECHAR));
+				V_BSTR(var_arg) = SysAllocString(unicode_str);
 				efree(unicode_str);
 				break;
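
To illustrate the size mismatch described above, here is a minimal, portable sketch. It is not PHP source code: the helper names are hypothetical, uint16_t stands in for OLECHAR (which is 16 bits wide on Windows), and the input is assumed to be NUL-terminated UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of PHP): number of UTF-16 code units needed
 * for a NUL-terminated UTF-8 string, terminator not counted. */
static size_t utf16_units_for_utf8(const char *s)
{
    size_t units = 0;
    const unsigned char *p;

    assert(s != NULL);
    for (p = (const unsigned char *) s; *p != 0; p++) {
        if ((*p & 0xC0) == 0x80) {
            continue;                   /* continuation byte, already counted */
        }
        units += (*p >= 0xF0) ? 2 : 1;  /* 4-byte sequences need a surrogate pair */
    }
    return units;
}

/* Byte count computed the way the unpatched line does:
 * UTF-8 byte length times the size of one wide character. */
static size_t buggy_byte_len(const char *s)
{
    return strlen(s) * sizeof(uint16_t);
}

/* Byte count of the text that the conversion actually produces. */
static size_t actual_byte_len(const char *s)
{
    return utf16_units_for_utf8(s) * sizeof(uint16_t);
}
```

For the five Hiragana characters A I U E O encoded as UTF-8 (15 bytes), buggy_byte_len gives 30 while actual_byte_len gives 10, so a copy of 30 bytes would read 20 bytes past the converted characters. For pure ASCII input the two values are equal, which is presumably why the problem only shows up with multibyte text.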


 [2006-07-03 07:31 UTC] okumurya at hotmail dot com
I misunderstood.
The revised patch follows.

--- ext\com\conversion.c.orig	2006-01-01 22:46:50.000000000 +0900
+++ ext\com\conversion.c	2006-06-23 14:01:42.838476800 +0900
@@ -247,8 +247,9 @@
 
 			case VT_BSTR:
 				convert_to_string_ex(&pval_arg);
+                  
 				unicode_str = php_char_to_OLECHAR(Z_STRVAL_P(pval_arg), Z_STRLEN_P(pval_arg), codepage TSRMLS_CC);
-				V_BSTR(var_arg) = SysAllocStringByteLen((char *) unicode_str, Z_STRLEN_P(pval_arg) * sizeof(OLECHAR));
+				V_BSTR(var_arg) = SysAllocString(unicode_str);
 				efree(unicode_str);
 				break;
 
@@ -787,20 +788,15 @@
 {
 	BOOL error = FALSE;
 	OLECHAR *unicode_str;
+	uint unicode_strlen;
 
-	if (strlen == -1) {
-		/* request needed buffersize */
-		strlen = MultiByteToWideChar(codepage, (codepage == CP_UTF8 ? 0 : MB_PRECOMPOSED | MB_ERR_INVALID_CHARS), C_str, -1, NULL, 0);
-	} else {
-		/* \0 terminator */
-		strlen++;
-	}
+	unicode_strlen = MultiByteToWideChar(codepage, (codepage == CP_UTF8 ? 0 : MB_PRECOMPOSED | MB_ERR_INVALID_CHARS), C_str, -1, NULL, 0);
 
-	if (strlen >= 0) {
-		unicode_str = (OLECHAR *) emalloc(sizeof(OLECHAR) * strlen);
+	if (unicode_strlen > 0) {
+		unicode_str = (OLECHAR *) emalloc(sizeof(OLECHAR) * unicode_strlen);
 
 		/* convert string */
-		error = !MultiByteToWideChar(codepage, (codepage == CP_UTF8 ? 0 : MB_PRECOMPOSED | MB_ERR_INVALID_CHARS), C_str, strlen, unicode_str, strlen);
+		error = !MultiByteToWideChar(codepage, (codepage == CP_UTF8 ? 0 : MB_PRECOMPOSED | MB_ERR_INVALID_CHARS), C_str, -1, unicode_str, unicode_strlen);
 	} else {
 		/* return a zero-length string */
 		unicode_str = (OLECHAR *) emalloc(sizeof(OLECHAR));
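
The revised patch sizes the buffer by first asking the converter how many wide characters it will produce, and only then allocating and converting. Below is a portable sketch of that two-pass pattern. It does not call the Win32 API: utf8_to_utf16 is a hypothetical stand-in for MultiByteToWideChar with CP_UTF8 and a source length of -1, uint16_t stands in for OLECHAR, and malloc stands in for emalloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the converter. With dst == NULL it only reports
 * the number of UTF-16 units required, terminator included. Returns 0 if dst
 * is too small. Assumes NUL-terminated, well-formed UTF-8. */
static size_t utf8_to_utf16(const char *src, uint16_t *dst, size_t dst_units)
{
    const unsigned char *p = (const unsigned char *) src;
    size_t n = 0;

    for (;;) {
        uint32_t cp = *p++;
        int extra = 0;
        int is_nul;
        size_t need;

        if (cp >= 0xF0)      { cp &= 0x07; extra = 3; }
        else if (cp >= 0xE0) { cp &= 0x0F; extra = 2; }
        else if (cp >= 0xC0) { cp &= 0x1F; extra = 1; }
        /* Stops at the terminator even if a sequence is truncated. */
        while (extra-- > 0 && (*p & 0xC0) == 0x80) {
            cp = (cp << 6) | (uint32_t) (*p++ & 0x3F);
        }
        is_nul = (cp == 0);
        need = (cp >= 0x10000) ? 2 : 1;
        if (dst != NULL) {
            if (n + need > dst_units) {
                return 0;
            }
            if (need == 2) {            /* encode as a surrogate pair */
                cp -= 0x10000;
                dst[n]     = (uint16_t) (0xD800 | (cp >> 10));
                dst[n + 1] = (uint16_t) (0xDC00 | (cp & 0x3FF));
            } else {
                dst[n] = (uint16_t) cp;
            }
        }
        n += need;
        if (is_nul) {
            return n;
        }
    }
}

/* Two-pass pattern: query the size, allocate exactly that much, convert. */
static uint16_t *utf8_to_utf16_alloc(const char *src, size_t *out_units)
{
    size_t units = utf8_to_utf16(src, NULL, 0);       /* pass 1: size only */
    uint16_t *buf = malloc(units * sizeof *buf);
    size_t written;

    if (buf == NULL) {
        return NULL;
    }
    written = utf8_to_utf16(src, buf, units);          /* pass 2: convert */
    assert(written == units);
    if (out_units != NULL) {
        *out_units = units;
    }
    return buf;
}
```

Because the size comes from the converter itself, the buffer holds exactly the converted text plus its terminator, which is what makes a plain SysAllocString call on the result safe. One trade-off of relying on the terminator, as the patch does by passing -1, is that conversion stops at the first NUL byte, so strings with embedded NULs would be cut short.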
 [2007-02-26 13:03 UTC] fjortiz at comunet dot es
Any updates on this issue? We tracked down a problem we had with ADODBCommand COM objects and Russian UTF-8 strings, and found that this bug was causing it.

Mr. Okumurya, one question, please: did your last patch work for you?

Thanks in advance.
 [2007-03-05 22:43 UTC] okumurya at hotmail dot com
Hello fjortiz

> did your last patch work for you?

Yes, my patch works.
 [2007-04-02 20:29 UTC] stas@php.net
Could you provide an example that reproduces the problem? Preferably one using a COM component that is available on a standard Windows/Office machine or is easily installable.
 [2007-07-11 12:49 UTC] jani@php.net
And does this problem exist in PHP 5.2.3? (see also previous comment by Stas)
 [2007-08-12 07:04 UTC] okumurya at gmail dot com
I tested the following code on PHP 4.4.7, and the problem still occurred.

<?php
/*
 * via: http://ml.php.gr.jp/pipermail/php-users/2007-April/032463.html
 */
ini_set('mbstring.language', 'Japanese');
ini_set('mbstring.internal_encoding', 'UTF-8');

class setWord {
	var $Word;
	var $doc;
 
	function setWord(){
		$this->Word = new COM('Word.Application');
		$this->Word->Visible = false;
		$this->Word->DisplayAlerts = 0;
	}
 
	function documentOpen(){
		$this->doc = $this->Word->Documents->Add();
		$this->doc->Activate;
	}
 
	function setParam($param){
		$this->Word->Selection->TypeText($param);
	}
 
	function SaveAs(){
		$this->Word->Documents[1]->SaveAs("C:/TMP/hoge.doc");
	}
 
	function unsetObj(){
		$this->Word->Documents[1]->Close();
		$this->Word->Quit();
		$this->Word = null;
	}
 
	function returnWord($param){
		$this->documentOpen();
		$this->setParam($param);
		$this->SaveAs();
		$this->unsetObj();
	}
}

$cls = new setWord();
$cls->returnWord("\x82\xA0\x82\xA2\x82\xA4\x82\xA6\x82\xA8"); // Japanese Hiragana A I U E O
?>
 [2007-08-12 07:14 UTC] okumurya at gmail dot com
I tested on PHP 5.2.3, and the problem still occurred.
 [2007-08-24 12:16 UTC] jani@php.net
Assigned to the maintainer.
 [2008-10-21 20:20 UTC] vincent at eal dot com
I also have the same problem with PHP 5.2.3.
 [2012-04-06 11:10 UTC] potyomkine at gmail dot com
I also have the same problem with PHP 5.2.17.
 [2017-10-24 03:32 UTC] kalle@php.net
-Status: Assigned
+Status: Open
-Assigned To: wharmby
+Assigned To:
 [2020-02-03 08:55 UTC] cmb@php.net
-Status: Open
+Status: Duplicate
-Assigned To:
+Assigned To: cmb
 [2020-02-03 08:55 UTC] cmb@php.net
This appears to be a duplicate of bug #66431, which is fixed as of
PHP 5.4.29 and 5.5.13, respectively.
 