Bug #37899 [PATCH] php_char_to_OLECHAR copies junk bytes
Submitted: 2006-06-23 05:09 UTC Modified: 2020-02-03 08:55 UTC
Votes:15
Avg. Score:4.1 ± 1.0
Reproduced:8 of 8 (100.0%)
Same Version:3 (37.5%)
Same OS:6 (75.0%)
From: okumurya at gmail dot com Assigned: cmb (profile)
Status: Duplicate Package: COM related
PHP Version: 5.2.3 OS: Windows 2000 Professional
Private report: No CVE-ID: None
 [2006-06-23 05:09 UTC] okumurya at gmail dot com
Description:
------------
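With UTF-8 input, 'Z_STRLEN_P(pval_arg) * sizeof(OLECHAR)' is larger than the byte size of the converted wide-character string, because several UTF-8 bytes can map to a single OLECHAR.
So SysAllocStringByteLen copies junk bytes from beyond the end of the converted string.

The following is a patch.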

--- conversion.c.orig	2006-06-23 11:28:06.496027200 +0900
+++ conversion.c	2006-06-23 14:01:42.838476800 +0900
@@ -247,8 +247,9 @@
 
 			case VT_BSTR:
 				convert_to_string_ex(&pval_arg);
+                  
 				unicode_str = php_char_to_OLECHAR(Z_STRVAL_P(pval_arg), Z_STRLEN_P(pval_arg), codepage TSRMLS_CC);
-				V_BSTR(var_arg) = SysAllocStringByteLen((char *) unicode_str, Z_STRLEN_P(pval_arg) * sizeof(OLECHAR));
+				V_BSTR(var_arg) = SysAllocString(unicode_str);
 				efree(unicode_str);
 				break;
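
To illustrate the length mismatch described above, here is a minimal, portable sketch. It does not call the Windows API: `utf16_units` and `junk_bytes` are hypothetical helpers written for this illustration, `uint16_t` stands in for OLECHAR, and the input is assumed to be well-formed, NUL-terminated UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of PHP or the Windows API): number of UTF-16
 * code units, excluding the terminator, needed for a well-formed,
 * NUL-terminated UTF-8 string. */
static size_t utf16_units(const char *utf8)
{
    size_t units = 0;

    assert(utf8 != NULL);
    for (const unsigned char *p = (const unsigned char *)utf8; *p != 0; p++) {
        if ((*p & 0xC0) == 0x80)
            continue;                  /* continuation byte: no new code unit */
        units += (*p >= 0xF0) ? 2 : 1; /* 4-byte sequences need a surrogate pair */
    }
    return units;
}

/* Hypothetical helper: how many bytes beyond the converted text are covered
 * by a request of strlen(utf8) * sizeof(OLECHAR), with uint16_t standing in
 * for OLECHAR. */
static size_t junk_bytes(const char *utf8)
{
    size_t requested = strlen(utf8) * sizeof(uint16_t);
    size_t converted = utf16_units(utf8) * sizeof(uint16_t);

    return requested - converted;
}
```

For the two hiragana A and I (six bytes in UTF-8), the old expression requests 12 bytes while the converted text occupies only 4, so 8 bytes come from beyond the converted text. For pure ASCII input the two sizes agree, which may explain why the problem only shows up with multibyte strings.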


 [2006-07-03 07:31 UTC] okumurya at hotmail dot com
I misunderstood.
The revised patch follows.

--- ext\com\conversion.c.orig	2006-01-01 22:46:50.000000000 +0900
+++ ext\com\conversion.c	2006-06-23 14:01:42.838476800 +0900
@@ -247,8 +247,9 @@
 
 			case VT_BSTR:
 				convert_to_string_ex(&pval_arg);
+                  
 				unicode_str = php_char_to_OLECHAR(Z_STRVAL_P(pval_arg), Z_STRLEN_P(pval_arg), codepage TSRMLS_CC);
-				V_BSTR(var_arg) = SysAllocStringByteLen((char *) unicode_str, Z_STRLEN_P(pval_arg) * sizeof(OLECHAR));
+				V_BSTR(var_arg) = SysAllocString(unicode_str);
 				efree(unicode_str);
 				break;
 
@@ -787,20 +788,15 @@
 {
 	BOOL error = FALSE;
 	OLECHAR *unicode_str;
+	uint unicode_strlen;
 
-	if (strlen == -1) {
-		/* request needed buffersize */
-		strlen = MultiByteToWideChar(codepage, (codepage == CP_UTF8 ? 0 : MB_PRECOMPOSED | MB_ERR_INVALID_CHARS), C_str, -1, NULL, 0);
-	} else {
-		/* \0 terminator */
-		strlen++;
-	}
+	unicode_strlen = MultiByteToWideChar(codepage, (codepage == CP_UTF8 ? 0 : MB_PRECOMPOSED | MB_ERR_INVALID_CHARS), C_str, -1, NULL, 0);
 
-	if (strlen >= 0) {
-		unicode_str = (OLECHAR *) emalloc(sizeof(OLECHAR) * strlen);
+	if (unicode_strlen > 0) {
+		unicode_str = (OLECHAR *) emalloc(sizeof(OLECHAR) * unicode_strlen);
 
 		/* convert string */
-		error = !MultiByteToWideChar(codepage, (codepage == CP_UTF8 ? 0 : MB_PRECOMPOSED | MB_ERR_INVALID_CHARS), C_str, strlen, unicode_str, strlen);
+		error = !MultiByteToWideChar(codepage, (codepage == CP_UTF8 ? 0 : MB_PRECOMPOSED | MB_ERR_INVALID_CHARS), C_str, -1, unicode_str, unicode_strlen);
 	} else {
 		/* return a zero-length string */
 		unicode_str = (OLECHAR *) emalloc(sizeof(OLECHAR));
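
The revised patch sizes the buffer from the converter's own answer rather than from the input byte count: one call measures, the next converts. A portable sketch of that measure-then-convert pattern follows. It is not the patch itself: `utf8_to_utf16` is a hypothetical hand-written stand-in for MultiByteToWideChar called with CP_UTF8 and a source length of -1, `malloc` replaces `emalloc`, `uint16_t` replaces OLECHAR, and the input is assumed to be well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the converter: turns well-formed, NUL-terminated
 * UTF-8 into UTF-16 and returns the number of code units produced, including
 * the terminator. With cap == 0 it only measures and never touches out.
 * Returns 0 if out is too small. */
static size_t utf8_to_utf16(const char *src, uint16_t *out, size_t cap)
{
    const unsigned char *p = (const unsigned char *)src;
    size_t n = 0;

    assert(src != NULL);
    for (;;) {
        uint32_t cp = *p++;
        int extra = cp < 0x80 ? 0 : cp < 0xE0 ? 1 : cp < 0xF0 ? 2 : 3;
        size_t need;

        if (extra > 0)
            cp &= 0x3Fu >> extra;            /* keep the payload bits of the lead byte */
        while (extra-- > 0)
            cp = (cp << 6) | (*p++ & 0x3Fu); /* append 6 bits per continuation byte */

        need = cp >= 0x10000 ? 2 : 1;
        if (cap != 0) {
            if (n + need > cap)
                return 0;
            if (need == 2) {                 /* encode as a surrogate pair */
                out[n]     = (uint16_t)(0xD800 + ((cp - 0x10000) >> 10));
                out[n + 1] = (uint16_t)(0xDC00 + ((cp - 0x10000) & 0x3FF));
            } else {
                out[n] = (uint16_t)cp;
            }
        }
        n += need;
        if (cp == 0)
            return n;                        /* terminator counted and written */
    }
}

/* Measure first, then allocate exactly that much, then convert.
 * The caller frees the result; *units receives the length including the
 * terminator. */
static uint16_t *utf8_to_utf16_alloc(const char *src, size_t *units)
{
    size_t need = utf8_to_utf16(src, NULL, 0);
    uint16_t *buf = malloc(need * sizeof *buf);

    if (buf == NULL)
        return NULL;
    *units = utf8_to_utf16(src, buf, need);
    return buf;
}
```

Because the buffer length comes from the measuring pass, every code unit in it is written by the conversion pass, and a terminator-based allocator such as SysAllocString then has nothing uninitialized to copy.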
 [2007-02-26 13:03 UTC] fjortiz at comunet dot es
Any updates on this issue? We tracked down a problem we had with ADODBCommand COM objects and Russian UTF-8 strings, and found that this bug was causing it.

Mr. Okumurya, one question, please: did your last patch work for you?

Thanks in advance.
 [2007-03-05 22:43 UTC] okumurya at hotmail dot com
Hello fjortiz

> did your last patch work for you?

Yes, my patch works.
 [2007-04-02 20:29 UTC] stas@php.net
Could you provide an example that reproduces the problem? Preferably with a COM object that is available on a standard Windows/Office machine or is easily installable.
 [2007-07-11 12:49 UTC] jani@php.net
And does this problem exist in PHP 5.2.3? (see also previous comment by Stas)
 [2007-08-12 07:04 UTC] okumurya at gmail dot com
I tested the following code on PHP 4.4.7, and the problem still occurred.

<?php
/*
 * via: http://ml.php.gr.jp/pipermail/php-users/2007-April/032463.html
 */
ini_set('mbstring.language', 'Japanese');
ini_set('mbstring.internal_encoding', 'UTF-8');

class setWord {
	var $Word;
	var $doc;
 
	function setWord(){
		$this->Word = new COM('Word.Application');
		$this->Word->Visible = false;
		$this->Word->DisplayAlerts = 0;
	}
 
	function documentOpen(){
		$this->doc = $this->Word->Documents->Add();
		$this->doc->Activate;
	}
 
	function setParam($param){
		$this->Word->Selection->TypeText($param);
	}
 
	function SaveAs(){
		$this->Word->Documents[1]->SaveAs("C:/TMP/hoge.doc");
	}
 
	function unsetObj(){
		$this->Word->Documents[1]->Close();
		$this->Word->Quit();
		$this->Word = null;
	}
 
	function returnWord($param){
		$this->documentOpen();
		$this->setParam($param);
		$this->SaveAs();
		$this->unsetObj();
	}
}

$cls = new setWord();
$cls->returnWord("\x82\xA0\x82\xA2\x82\xA4\x82\xA6\x82\xA8"); // Japanese hiragana A I U E O (these bytes are the Shift_JIS encoding)
?>
 [2007-08-12 07:14 UTC] okumurya at gmail dot com
I tested on PHP 5.2.3, and the problem still occurred.
 [2007-08-24 12:16 UTC] jani@php.net
Assigned to the maintainer.
 [2008-10-21 20:20 UTC] vincent at eal dot com
I also have the same problem with PHP 5.2.3.
 [2012-04-06 11:10 UTC] potyomkine at gmail dot com
I also have the same problem with PHP 5.2.17.
 [2017-10-24 03:32 UTC] kalle@php.net
-Status: Assigned
+Status: Open
-Assigned To: wharmby
+Assigned To:
 [2020-02-03 08:55 UTC] cmb@php.net
-Status: Open
+Status: Duplicate
-Assigned To:
+Assigned To: cmb
 [2020-02-03 08:55 UTC] cmb@php.net
This appears to be a duplicate of bug #66431, which is fixed as of
PHP 5.4.29 and 5.5.13, respectively.
 
PHP Copyright © 2001-2025 The PHP Group
All rights reserved.
Last updated: Thu Jan 30 07:01:31 2025 UTC