Doc Bug #52453 Connection charset seems arbitrary
Submitted: 2010-07-27 11:00 UTC Modified: 2010-10-24 22:45 UTC
Votes:2
Avg. Score:2.5 ± 0.5
Reproduced:1 of 1 (100.0%)
Same Version:0 (0.0%)
Same OS:1 (100.0%)
From: jeroen at asystance dot nl Assigned: kalle
Status: Closed Package: Documentation problem
PHP Version: 5.3.3 OS: Linux
Private report: No CVE-ID: None
 [2010-07-27 11:00 UTC] jeroen at asystance dot nl
Description:
------------
The connection_charset used by a mysql/mysqli connection seems arbitrary - at least I cannot figure out how it is determined. The documentation provides no clues as to which charset is used by default.

I've tried connecting to different MySQL servers from different shell servers and can't figure out how the default charset is determined. To find out which one is used, open a mysql/mysqli connection (procedural or object-oriented, it doesn't matter) and call mysql_client_encoding() / mysqli_get_charset(), or run "SHOW VARIABLES LIKE 'character_set%';".
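
A minimal sketch of that check, assuming the mysqli extension and a reachable server. The helper name connection_charset_report and the credentials in the usage comment are hypothetical placeholders, not part of the original report:

```php
<?php
// Hypothetical helper: collects the charset reported by the client library
// plus the character_set% variables the server reports for this session.
function connection_charset_report(mysqli $link)
{
    // mysqli_get_charset() returns an object; ->charset holds the charset name.
    $report = array('client_charset' => mysqli_get_charset($link)->charset);

    $result = mysqli_query($link, "SHOW VARIABLES LIKE 'character_set%'");
    if ($result === false) {
        return $report; // query failed; return what we have so far
    }
    while ($row = mysqli_fetch_assoc($result)) {
        $report[$row['Variable_name']] = $row['Value'];
    }
    mysqli_free_result($result);

    return $report;
}

// Usage (placeholder credentials):
// $link = mysqli_connect('db.example.com', 'user', 'secret');
// print_r(connection_charset_report($link));
```

Running this from each client machine against the same server makes it easy to compare the results side by side.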

This probably is just a documentation problem, but maybe the default could be chosen more sensibly: for example, the mysql server database's charset seems the most sensible default.

For example, connecting from a shell that has en_US.UTF-8 as locale, I get:
character_set_client: utf8
character_set_connection: utf8
character_set_database: utf8
character_set_filesystem: binary
character_set_results: utf8
character_set_server: utf8
character_set_system: utf8
character_sets_dir: /usr/share/mysql/charsets/

Switching to en_US.iso88591 doesn't change anything. So it would seem some server setting determines the charset, right? However, connecting to the same mysql server from another system (though from intranet instead of internet), I get:
character_set_client: latin1
character_set_connection: latin1
character_set_database: utf8
character_set_filesystem: binary
character_set_results: latin1
character_set_server: utf8
character_set_system: utf8
character_sets_dir: /usr/share/mysql/charsets/

Again, the client locale doesn't influence this.


 [2010-07-27 12:09 UTC] jeroen at asystance dot nl
BTW: when I make the same connection as in the first example, but use the mysql CLI client instead of PHP, I get:
mysql> SHOW VARIABLES LIKE 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

So PHP's mysql_connect() behaves differently from the mysql CLI client!
 [2010-07-27 12:23 UTC] gerben at asystance dot nl
I connected to the same remote database from a Windows/WAMP system and got exactly the same results.
 [2010-08-13 12:00 UTC] andrey@php.net
-Status: Open +Status: Feedback -Assigned To: +Assigned To: mysql
 [2010-08-13 12:00 UTC] andrey@php.net
Are you using mysqlnd or libmysql? libmysql uses its own default charset, or the one set in my.cnf, or the one set by an explicit call to mysqli_options() made after mysqli_init() but before mysqli_real_connect().
If you use mysqlnd, it always assumes the server default charset. The server sends this charset during the connection handshake/authentication, and mysqlnd adopts it.
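One way to answer the mysqlnd-or-libmysql question from within PHP is sketched below. It assumes that mysqlnd builds report a client info string containing "mysqlnd", while libmysql builds report a bare version number. Both helper names are hypothetical:

```php
<?php
// Hypothetical helper: true when a client info string looks like mysqlnd's.
// Assumption: mysqlnd reports something like "mysqlnd 5.0.7-dev - ...",
// while libmysql reports a bare version such as "5.1.49".
function looks_like_mysqlnd($clientInfo)
{
    return stripos($clientInfo, 'mysqlnd') !== false;
}

// Hypothetical helper: inspects the mysqli extension that is actually loaded.
// Returns null when mysqli is not available at all.
function mysqli_uses_mysqlnd()
{
    if (!function_exists('mysqli_get_client_info')) {
        return null;
    }
    return looks_like_mysqlnd(mysqli_get_client_info());
}

// Usage:
// var_dump(mysqli_uses_mysqlnd());
```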
 [2010-08-13 12:43 UTC] jeroen at asystance dot nl
-Status: Feedback +Status: Assigned -Type: Feature/Change Request +Type: Documentation Problem
 [2010-08-13 12:43 UTC] jeroen at asystance dot nl
Aha! That's exactly the documentation I was looking for online. I've changed the type to "documentation problem", and I think the proper way to resolve this bug is to add this explanation to the documentation.

I was using mysqlnd, by the way (the standard setting on Ubuntu, I guess).
 [2010-08-18 12:44 UTC] andrey@php.net
When using mysqlnd, you can override the charset by calling mysqli_options() on a connection that has been created with mysqli_init() but not yet connected with mysqli_real_connect(). In this case mysqlnd won't assume the server default charset.
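A rough sketch of that call order follows. The comment above does not name the option, so the use of MYSQLI_SET_CHARSET_NAME is an assumption, and the helper name connect_with_charset and the credentials are hypothetical:

```php
<?php
// Hypothetical helper: opens a mysqli connection with a charset chosen by
// the caller instead of the one assumed from the server handshake.
function connect_with_charset($host, $user, $password, $database, $charset)
{
    $link = mysqli_init(); // create the handle; not connected yet
    if (!$link) {
        return false;
    }

    // Assumed option constant; must be set after mysqli_init() and
    // before mysqli_real_connect().
    if (!mysqli_options($link, MYSQLI_SET_CHARSET_NAME, $charset)) {
        return false;
    }

    if (!mysqli_real_connect($link, $host, $user, $password, $database)) {
        return false;
    }

    return $link;
}

// Usage (placeholder credentials):
// $link = connect_with_charset('db.example.com', 'user', 'secret', 'app', 'utf8');
// echo mysqli_get_charset($link)->charset, "\n";
```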
 [2010-08-18 12:44 UTC] andrey@php.net
-Status: Assigned +Status: Verified -Assigned To: mysql +Assigned To:
 [2010-08-18 12:44 UTC] andrey@php.net
The doc team should take over this one.
 [2010-10-22 14:40 UTC] kalle@php.net
-Package: MySQLi related +Package: Documentation problem
 [2010-10-24 22:45 UTC] kalle@php.net
Automatic comment from SVN on behalf of kalle
Revision: http://svn.php.net/viewvc/?view=revision&revision=304709
Log: Fixed #52453 (Connection charset seems arbitrary)
 [2010-10-24 22:45 UTC] kalle@php.net
-Status: Verified +Status: Closed -Assigned To: +Assigned To: kalle
 [2010-10-24 22:45 UTC] kalle@php.net
This bug has been fixed in the documentation's XML sources. Since the
online and downloadable versions of the documentation need some time
to get updated, we would like to ask you to be a bit patient.

Thank you for the report, and for helping us make our documentation better.


 [2020-02-07 06:08 UTC] phpdocbot@php.net
Automatic comment on behalf of kalle
Revision: http://git.php.net/?p=doc/en.git;a=commit;h=ec0961f421f0b97ac50e64ab9931c9b2d991d176
Log: Fixed #52453 (Connection charset seems arbitrary)
 