Bug #64008 PDO Informix returns corrupted strings when character conversion is active
Submitted: 2013-01-16 16:55 UTC Modified: 2016-02-07 15:43 UTC
Votes:3
Avg. Score:4.7 ± 0.5
Reproduced:2 of 2 (100.0%)
Same Version:0 (0.0%)
Same OS:0 (0.0%)
From: andreas dot streichardt at 100days dot de Assigned: vnkbabu (profile)
Status: Assigned Package: PDO_INFORMIX (PECL)
PHP Version: 5.4.10 OS: OpenSuse 12.2
Private report: No CVE-ID: None
 [2013-01-16 16:55 UTC] andreas dot streichardt at 100days dot de
Description:
------------
We got an old database (SE) with DB_LOCALE=de_DE.cp1252

We are redesigning the frontend as a UTF-8 web application, and since the client 
development kit supports automatic character set conversion, we tried setting 
CLIENT_LOCALE to de_DE.utf8.

Using these settings, conversion is taking place but the affected strings seem 
to contain corrupted memory!

The affected field in the example is a CHAR(40). The reported length of 120 bytes 
suggests that the driver sizes the string for 40 characters at 3 bytes each, the 
maximum for a BMP character in UTF-8.

The problem persists if we switch to a newer Informix online server so the bug 
seems to be PDO-informix related or even related to the CSDK.

CSDK used: 

/usr/lib/informix/bin/ifx_getversion clientsdk
IBM/Informix-Client SDK Version 3.70.UC5DE
Copyright (C) 1991-2012 IBM

PDO_Informix: 1.3.0

The problem appears when using prepare() as well as when using query().

Test script:
---------------
$connection = new PDO('informix:host=172.31.5.122;service=1523;database=/u2/rel6.1.1/test.fir/test;server=se_tis_risc_neu;protocol=sesoctcp;EnableScrollableCursors=1;DB_LOCALE=de_DE.cp1252;CLIENT_LOCALE=de_DE.utf8', 'develop', 'develop');
$stmt = $connection->query("SELECT kd_nr, kd_ans1 FROM kunden WHERE kd_nr=18259");

foreach ($stmt as $row) {
    var_dump($row);
}



Expected result:
----------------
The result should not contain any corrupted memory.

Actual result:
--------------
array(4) {
  'KD_NR' =>
  string(5) "18259"
  [0] =>
  string(5) "18259"
  'KD_ANS1' =>
  string(120) "Jörn Möllemann                          
\000\000oc=\000\000\0005\000\000\000SELECT kd_nr, kd_ans1 FROM kunden WHERE 
kd_nr=18259\000-\000\000\000=\000\000\000�|a\020\005\000\000\000"
  [1] =>
  string(120) "Jörn Möllemann                          
\000\000oc=\000\000\0005\000\000\000SELECT kd_nr, kd_ans1 FROM kunden WHERE 
kd_nr=18259\000-\000\000\000=\000\000\000�|a\020\005\000\000\000"
}

Note that the name has been converted properly (it is valid utf8) but the string 
is corrupt from character 40 (=field length) on.

 [2013-02-18 15:38 UTC] logobinder at o2 dot pl
Hello. I have a similar issue. 
My DB locale is pl_PL.912 but I want the data in pl_PL.utf8. 

Rows that contain special characters are corrupted as you described.  

PDO_INFORMIX - 1.2.6 
PHP - 5.3.14 (cli) 
/opt/informix/bin/ifx_getversion clientsdk
IBM/Informix-Client SDK Version 3.70.UC5DE
Copyright (C) 1991-2012 IBM
 [2013-02-26 14:36 UTC] javier dot sagrera at uk dot ibm dot com
The reason for the "corruption" is the missing null terminator.
The conversion between codesets generates a -21002 "Codeset conversion output buffer too small." error because the buffer PHP allocates for the SQLBindCol is not large enough to hold the converted data.

In the case of a CHAR(40) with two multibyte characters, as in the example, the UTF-8 string will be 42 bytes long (because the trailing spaces are part of the string). 
Doing a trim() or using VARCHAR works around the problem.
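
The 42-byte figure can be checked with a small sketch. This is only an illustration of the arithmetic, not driver code: the function names are made up here, and the conversion is simplified by counting every byte at or above 0x80 as two UTF-8 bytes, which holds for the umlauts in the example but not for every cp1252 character.

```c
#include <stddef.h>
#include <string.h>

/* Bytes needed to store a single-byte (cp1252-style) string as UTF-8.
 * Simplification (assumption): every byte >= 0x80 is counted as 2 UTF-8
 * bytes. That is right for the umlauts used here; a few cp1252 bytes in
 * 0x80-0x9F (the euro sign, for instance) would need 3. */
size_t utf8_bytes_needed(const unsigned char *s, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += (s[i] < 0x80) ? 1 : 2;
    return total;
}

/* Length of the field once trailing blanks are removed. */
size_t rtrim_len(const unsigned char *s, size_t n)
{
    while (n > 0 && s[n - 1] == ' ')
        n--;
    return n;
}

/* Build the CHAR(40) value from the report: the name in cp1252
 * (0xF6 is o-umlaut), blank-padded to the column width. */
void make_sample_field(unsigned char field[40])
{
    static const unsigned char name[] = "J\xF6rn M\xF6llemann";
    memset(field, ' ', 40);
    memcpy(field, name, sizeof name - 1);
}
```

For the padded field, utf8_bytes_needed() gives 42, two more than the 40 bytes of the column. Once the trailing blanks are dropped, the same value needs only 16 bytes, which is why trimming hides the problem for this row.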

Another quick workaround could be changing the code inside "informix_statement.c" in the function stmt_bind_column() from:

	case SQL_NUMERIC:
	default:
		in_length = col_res->data_size + in_length;
		col_res->data.str_val = (char *) emalloc(in_length+1);
		check_stmt_allocation(col_res->data.str_val,
				"stmt_bind_column",
				"Unable to allocate column buffer");
		col_res->data.str_val[in_length] = '\0';
		...

to:

	case SQL_NUMERIC:
	default:
		/* Ensure there is enough space for UTF-8 data */
		in_length = (col_res->data_size + in_length) * 2;
		col_res->data.str_val = (char *) emalloc(in_length+1);
		check_stmt_allocation(col_res->data.str_val,
				"stmt_bind_column",
				"Unable to allocate column buffer");
		col_res->data.str_val[in_length] = '\0';
		...
 [2013-02-26 16:57 UTC] andreas dot streichardt at 100days dot de
Hi javier,

assuming that Informix supports UTF-16 internally, shouldn't the multiplier be 
3 (2 is sufficient for German umlauts but not for Chinese, for example)? We are 
currently testing several use cases and will notify you through the IBM ticket 
system once we are done. First tests seem promising :)
 [2013-03-07 11:07 UTC] javier dot sagrera at uk dot ibm dot com
Yes, *3 will be a lot safer (I was thinking of de_DE.cp1252).
Still this is just a quick workaround. Ideally the PDO driver should check that there was a truncation error during the fetch, reallocate the buffer for the bind and redo the fetch.
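
A rough sketch of that "detect truncation, reallocate, fetch again" idea follows. It does not use the real CSDK or PDO_INFORMIX calls: fetch_converted() is a hypothetical stand-in that converts Latin-1 bytes to UTF-8 and reports the size it needs when the buffer is too small, in place of the -21002 error the real conversion raises.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a fetch that converts into a bound buffer.
 * Converts Latin-1 bytes to NUL-terminated UTF-8. Returns 0 on success;
 * returns -1 and stores the required size in *needed when dst is too small. */
int fetch_converted(const unsigned char *src, size_t n,
                    char *dst, size_t dst_cap, size_t *needed)
{
    size_t need = 1;                       /* room for the terminating NUL */
    for (size_t i = 0; i < n; i++)
        need += (src[i] < 0x80) ? 1 : 2;
    *needed = need;
    if (need > dst_cap)
        return -1;                         /* "output buffer too small" */

    size_t o = 0;
    for (size_t i = 0; i < n; i++) {
        if (src[i] < 0x80) {
            dst[o++] = (char)src[i];
        } else {                           /* two-byte UTF-8 sequence */
            dst[o++] = (char)(0xC0 | (src[i] >> 6));
            dst[o++] = (char)(0x80 | (src[i] & 0x3F));
        }
    }
    dst[o] = '\0';
    return 0;
}

/* Fetch with retry: start with initial_cap bytes, grow the buffer to the
 * reported size if the conversion does not fit, then fetch again.
 * The caller frees the result; *retries counts the reallocations. */
char *fetch_with_retry(const unsigned char *src, size_t n,
                       size_t initial_cap, int *retries)
{
    size_t cap = initial_cap ? initial_cap : 1;
    size_t needed = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    *retries = 0;
    while (fetch_converted(src, n, buf, cap, &needed) != 0) {
        char *bigger = realloc(buf, needed);
        if (bigger == NULL) {
            free(buf);
            return NULL;
        }
        buf = bigger;
        cap = needed;
        (*retries)++;
    }
    return buf;
}
```

With this approach the initial buffer can stay at the column size, and the extra memory is only spent on rows that actually contain multibyte characters. Whether the real driver can learn the required size from the failed fetch is an assumption of this sketch.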
 [2014-02-14 10:01 UTC] rahulpriyadarshi@php.net
-Assigned To: +Assigned To: rahulpriyadarshi
 [2014-02-14 10:01 UTC] rahulpriyadarshi@php.net
I am going to use a buffer-size factor of 4, since a multibyte character has a maximum size of 4 bytes. The same is also suggested at http://publib.boulder.ibm.com/infocenter/idshelp/v111/topic/com.ibm.glsug.doc/sii06974601.htm . 

I am working on it and will shortly provide a patch.
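
The factors 2, 3 and 4 discussed in this thread correspond to how many bytes UTF-8 needs per code point. The sketch below illustrates that and the resulting worst-case buffer size; the function names are illustrative only and are not taken from the committed patch.

```c
#include <stddef.h>

/* Number of bytes UTF-8 uses for one Unicode code point
 * (0 if the value is beyond U+10FFFF; surrogates are not special-cased). */
size_t utf8_encoded_len(unsigned long cp)
{
    if (cp < 0x80)      return 1;   /* ASCII */
    if (cp < 0x800)     return 2;   /* e.g. German umlauts */
    if (cp < 0x10000)   return 3;   /* rest of the BMP, e.g. most Chinese */
    if (cp <= 0x10FFFFUL) return 4; /* supplementary planes */
    return 0;
}

/* Worst-case bind buffer size for a column of column_chars characters,
 * using the given bytes-per-character factor, plus one byte for the NUL. */
size_t bind_buffer_size(size_t column_chars, size_t factor)
{
    return column_chars * factor + 1;
}
```

For the CHAR(40) column in the report, a factor of 4 gives 161 bytes and a factor of 3 gives 121 bytes, compared with the 41 bytes that were too small for the converted value.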
 [2014-02-20 11:34 UTC] rahulpriyadarshi@php.net
I have committed the changes required for this issue to the svn branch. It is working fine in my environment; please give this patch a try and let me know how it works for you.
 [2016-02-07 15:43 UTC] rahulpriyadarshi@php.net
-Assigned To: rahulpriyadarshi +Assigned To: vnkbabu
 [2017-10-05 14:56 UTC] stephane dot gerber at unil dot ch
The solution is working but the fix has not been included in 
https://pecl.php.net/package/PDO_INFORMIX latest release 1.3.3
 [2018-11-28 10:44 UTC] as at nsi dot de
I have the same problem. Please fix it in the next release.
 [2022-01-27 17:15 UTC] laule at lunis-it dot de
seems to be fixed in 1.3.6 (works for me)
 
Last updated: Thu Nov 14 17:01:30 2024 UTC