Bug #18169 Driver cannot deliver UCS-2 unicode to SQL Server
Submitted: 2002-07-04 18:10 UTC Modified: 2003-06-29 21:33 UTC
Votes:69
Avg. Score:4.7 ± 0.6
Reproduced:59 of 60 (98.3%)
Same Version:10 (16.9%)
Same OS:33 (55.9%)
From: joesterg at hotmail dot com Assigned:
Status: No Feedback Package: MSSQL related
PHP Version: 4.1.2 OS: Windows 2000 Server
Private report: No CVE-ID: None
 [2002-07-04 18:10 UTC] joesterg at hotmail dot com
I have a problem converting UTF-8 (web character encoding) to UCS2 (Microsoft Windows character encoding) using PHP, and storing this in the Microsoft SQL Server 2000 database.

My setup is:
Windows 2000 Server, with Apache 1.3.24/PHP 4.1.1 and Microsoft SQL Server 2000

Now, as a result of Microsoft's Q232580, I have to convert between UTF-8 and UCS-2. For this, I thought I would use the Multibyte String functions.
However, this does not seem to work.

I am absolutely sure, that I input UTF-8 encoded data into my string, and then I do:
$ucs2string=mb_convert_encoding($string,"UCS2","UTF-8");
$sqlStmt="insert into testtbl (tekst) values(N'".($ucs2string)."')";
$rs=$DBCon->Execute($sqlStmt);

When I look in the database, what is stored does not resemble the input at all (most of the time I see Japanese/Chinese characters?!). Furthermore, the insert sometimes fails with an error and consequently stores nothing.

To me, it seems like either one of these (or both) are flawed:
1. The Multibyte String encoding function does not work properly (i.e. encoding from UTF-8 to UCS-2 does not happen correctly).
2. The PHP MSSQL driver does not handle unicode data properly, even though the target column in the database is specified as Unicode and N is prepended to the string before insert.

This leads me to use ADO (as in the example above), storing UTF-8 encoded data in SQL Server. This is a very short-term solution, as the data is not sortable in the database (some of it looks like garbage because of the missing encoding).

 [2002-07-05 04:14 UTC] yohgaki@php.net
Wide char encodings (UCS2/UCS4/UTF16/UTF32) don't work well with current PHP. I guess the SQL Server module is using strlen() or the like, which cannot be used with wide chars...

Fixing this is not simple at all. 

 [2002-07-05 04:21 UTC] joesterg at hotmail dot com
You are probably right. However, Unicode is central to making world-wide web applications, and all major database vendors have this possibility.
I find it to be a hindrance to wider deployment of large-scale, worldwide php applications.

Does anyone know if it is only the MSSQL module? Are there any plans to look into this issue?

What are the future directions for PHP and Unicode support?
 [2002-07-05 05:34 UTC] markonen@php.net
PHP's mssql extension uses the Microsoft SQL Server's C 
API, the "DB-Library for C". Specifically, SQL queries are 
sent to the server using the dbcmd() function. This 
function is not binary safe, so inserting UCS2 text or 
images or any binary data is likely to fail.

The DB-Library for C has separate, binary-safe APIs for 
entering text and images, but they are complicated and 
difficult to seamlessly integrate to the current mssql 
extension. Look up the documentation for dbwritetext() if 
you feel like implementing this change.

UTF-8 and UTF-7 are, IIRC, the only Unicode encodings that 
are guaranteed not to include null characters. They are, 
therefore, the only encodings that can be reliably used 
with PHP's mssql extension at this time.
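
The null-byte point above can be illustrated without a database. This is a minimal sketch, assuming the iconv extension is available; the sample text and variable names are illustrative only, and the explode() call merely imitates how a NUL-terminated C API would read the buffer.

```php
<?php
// Sample ASCII text, chosen only for illustration.
$utf8 = 'abc';
$ucs2 = iconv('UTF-8', 'UCS-2LE', $utf8);

echo bin2hex($utf8), "\n";  // 616263
echo bin2hex($ucs2), "\n";  // 610062006300

// Imitate a C-string consumer: everything after the first NUL byte is lost.
$seenByCApi = explode("\0", $ucs2, 2)[0];
echo strlen($ucs2), ' bytes sent, ', strlen($seenByCApi), " byte(s) seen\n";
```

Every ASCII character gains a zero byte in UCS-2LE, so a query containing such a string would be cut off after its first character by any function that treats the query as a C string.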
 [2002-07-06 07:08 UTC] joesterg at hotmail dot com
Thanks Marko

I guess this means that if you need to use binary (i.e. Unicode) data, COM/ADO is your only option when SQL Server is your database of choice.

From yohgaki's answer, I guess also the multibyte encoding functionality lacks proper Unicode support -am I correct in assuming that we will have to move to PHP4.2.x and do our own encoding/decoding through the Win32 API then?
 [2002-12-18 18:18 UTC] fvu at wanadoo dot nl
If you're using PHP on a Windows platform you can use the PHP COM extension to communicate with SQL Server via ADO.  The PHP COM extension is capable of translating UTF-8 to UCS-2 and back if you specify so as the third parameter:

  $oDb = new COM('ADODB.Connection', NULL, CP_UTF8);

This way you can use Unicode UTF-8 within PHP and Unicode UCS-2 within SQL Server with all the translations done for you automatically.

HTH, Freddy Vulto
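
A fuller version of the ADO approach might look like the sketch below. It reuses the testtbl table and tekst column from the original report. The unicodeLiteral() helper, the sample text and the connection string (provider, host, database, credentials) are assumptions to adapt, not something confirmed in this thread. The COM part is guarded because the COM extension exists only on Windows builds of PHP.

```php
<?php
// Hypothetical helper: wraps UTF-8 text in a T-SQL Unicode literal (N'...').
function unicodeLiteral(string $utf8): string
{
    // Single quotes inside a T-SQL string literal are escaped by doubling them.
    return "N'" . str_replace("'", "''", $utf8) . "'";
}

$sql = 'insert into testtbl (tekst) values (' . unicodeLiteral('Ελληνικά') . ')';

// Only attempt the insert where the COM extension is loaded.
if (class_exists('COM')) {
    // CP_UTF8 asks the COM extension to translate strings between UTF-8 and UCS-2.
    $db = new COM('ADODB.Connection', null, CP_UTF8);
    // Placeholder connection string; every value in it is an assumption.
    $db->Open('Provider=SQLOLEDB;Data Source=dbhost;Initial Catalog=database;User ID=username;Password=password');
    $db->Execute($sql);
    $db->Close();
}
```

Building SQL by string concatenation is kept here only to mirror the original report; parameterized ADO commands would be the safer choice for real input.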
 [2004-04-15 06:07 UTC] samlinxp at msn dot com
I have the same problem.

My setup is: 
Windows XP Server, with Apache 2.0.47/PHP 4.3.6RC3 and Microsoft SQL Server 2000

I hope this problem can be solved soon. It is quite important, especially in the Asia-Pacific region (CHN, HK, TW, JP, KR, etc.).
 [2006-04-20 01:24 UTC] timdilbert at gmail dot com
Just out of curiosity, I was wondering if this was fixed in PHP5 and MSSQL 2005?

I haven't tried using COM just yet, but I will be when I get home (on MSSQL 2000). But I was having the same problem with PHP5 and inserting UTF-8 encoding into MSSQL Server 2000.

I will post if this fixed my problem. If not, I'm really sorry guys.. I love PHP, but I might be rebuilding my entire site in C# because Unicode support is absolutely vital to our company and success.
 [2006-09-09 01:18 UTC] aireater at gmail dot com
I still have the same issue with the latest Windows binary 5.1.6 and MS SQL Server 2005 Express, Standard, Enterprise on Windows XP, 2003 Server. It never works.
 [2006-09-21 06:59 UTC] gautam dot webprogram at yahoo dot com
I want to connect PHP with MS SQL Server 2000. I have used the following code in PHP:

<? 
    $connection = mysql_connect ("localhost","user name","password"); 
	
	if (!$connection) { 
        echo "Couldn't make a connection!"; 
        exit; 
    } 
	?>



The code doesn't execute.
 [2007-06-14 08:37 UTC] giannisptr at yahoo dot gr
Has anyone found a solution to the encoding problem?????
I am using a web service to get a string of data from an SQL Server 2000 database. When I invoke the web service from PHP, Greek characters are replaced by the character '?'.
 [2007-11-11 01:46 UTC] etraxis at gmail dot com
Looks like nobody is going to fix the issue. ;(
I don't have a solution, but I do have a workaround that I use in my project, and it works: sending and receiving the data as binary.

=========
 Example
=========

Let's assume we have the following table, which allows us to store Unicode values (using UCS-2 encoding):

    create table mytable
    (
        myfield nvarchar (100) null
    );

Below is the code to work with:

    $link = mssql_connect('dbhost', 'username', 'password');
    mssql_select_db('database', $link);

    // sending data to database
    $utf8 = 'some unicode UTF-8 data';  // some Greek text for example ;)
    $ucs2 = iconv('UTF-8', 'UCS-2LE', $utf8);

    // converting UCS-2 string into "binary" hexadecimal form
    $arr = unpack('H*hex', $ucs2);
    $hex = "0x{$arr['hex']}";

    // IMPORTANT!
    // please note that value must be passed without apostrophes
    // it should be "... values(0x0123456789ABCDEF) ...", not "... values('0x0123456789ABCDEF') ..."
    mssql_query("insert into mytable (myfield) values ({$hex})", $link);

    // retrieving data from database
    // IMPORTANT!
    // please note that "varbinary" expects number of bytes
    // in this example it must be 200 (bytes), while size of field is 100 (UCS-2 chars)
    // the "as myfield" alias lets the converted column be read by name below
    $result = mssql_query("select convert(varbinary(200), myfield) as myfield from mytable", $link);

    while (($row = mssql_fetch_array($result, MSSQL_BOTH)))
    {
        // we get data in UCS-2
        // I use UTF-8 in my project, so I encode it back
        echo(iconv('UCS-2LE', 'UTF-8', $row['myfield']));
    }

    mssql_free_result($result);
    mssql_close($link);
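
The two conversions in the workaround can be factored out and checked without a database connection. This is only a sketch, assuming the iconv extension is available; the helper names utf8ToHexLiteral() and ucs2BytesToUtf8() are made up for illustration and are not part of any library.

```php
<?php
// Hypothetical helper: UTF-8 text -> unquoted T-SQL binary literal (0x...).
function utf8ToHexLiteral(string $utf8): string
{
    $ucs2 = iconv('UTF-8', 'UCS-2LE', $utf8);
    if ($ucs2 === false) {
        throw new InvalidArgumentException('Input cannot be converted to UCS-2');
    }
    return '0x' . bin2hex($ucs2);
}

// Hypothetical helper: raw varbinary bytes from the server -> UTF-8 text.
function ucs2BytesToUtf8(string $bytes): string
{
    return iconv('UCS-2LE', 'UTF-8', $bytes);
}

$literal = utf8ToHexLiteral('αβ');
echo $literal, "\n"; // 0xb103b203

// Round trip: decode the same bytes the server would hand back.
echo ucs2BytesToUtf8(hex2bin(substr($literal, 2))), "\n"; // αβ
```

Because the literal contains only the characters 0-9, a-f and x, it also sidesteps quoting problems in the inserted value itself.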
 [2008-11-27 15:54 UTC] alex dot bazan at concatel dot com
Pinging this bug. Opened in 2002 and still without an answer. I'm having trouble too with non-latin characters with Sql Server.
 
Last updated: Fri Apr 19 05:01:29 2024 UTC