Spec-Zone .ru
спецификации, руководства, описания, API
|
Several character set and collation system variables relate to a client's interaction with the server. Some of these have been mentioned in earlier sections:
The server character set and collation are the values of the character_set_server
and collation_server
system variables.
The character set and collation of the default database are the values of the character_set_database
and collation_database
system variables.
Additional character set and collation system variables are involved in handling traffic for the connection between a client and the server. Every client has connection-related character set and collation system variables.
A "connection" is what you make when you connect to the server. The client sends SQL statements, such as queries, over the connection to the server. The server sends responses, such as result sets or error messages, over the connection back to the client. This leads to several questions about character set and collation handling for client connections, each of which can be answered in terms of system variables:
What character set is the statement in when it leaves the client?
The server takes the character_set_client
system variable to be the character set in which
statements are sent by the client.
What character set should the server translate a statement to after receiving it?
For this, the server uses the character_set_connection
and collation_connection
system variables. It converts statements sent by
the client from character_set_client
to character_set_connection
(except for string literals that have an
introducer such as _latin1
or _utf8
). collation_connection
is important for comparisons of literal
strings. For comparisons of strings with column values, collation_connection
does not matter because columns have their
own collation, which has a higher collation precedence.
What character set should the server translate to before shipping result sets or error messages back to the client?
The character_set_results
system variable indicates the character set in
which the server returns query results to the client. This includes result data such as column
values, and result metadata such as column names and error messages.
Clients can fine-tune the settings for these variables, or depend on the defaults (in which case, you can skip the rest of this section). If you do not use the defaults, you must change the character settings for each connection to the server.
Two statements affect the connection-related character set variables as a group:
SET NAMES '
charset_name
' [COLLATE 'collation_name
']
SET NAMES
indicates what character set the client will use to send SQL
statements to the server. Thus, SET NAMES 'cp1251'
tells the server,
"future incoming messages from this client are in character set cp1251
." It also specifies the character set that the
server should use for sending results back to the client. (For example, it indicates what character
set to use for column values if you use a SELECT
statement.)
A SET NAMES '
statement is equivalent to these three statements: charset_name
'
SET character_set_client =charset_name
;SET character_set_results =charset_name
;SET character_set_connection =charset_name
;
Setting character_set_connection
to charset_name
also implicitly sets collation_connection
to the default collation for charset_name
. It is unnecessary to set that collation
explicitly. To specify a particular collation, use the optional COLLATE
clause:
SET NAMES 'charset_name
' COLLATE 'collation_name
'
SET CHARACTER SET
charset_name
SET CHARACTER SET
is similar to SET NAMES
but sets character_set_connection
and collation_connection
to character_set_database
and collation_database
. A SET CHARACTER SET
statement is equivalent to these
three statements: charset_name
SET character_set_client =charset_name
;SET character_set_results =charset_name
;SET collation_connection = @@collation_database;
Setting collation_connection
also implicitly sets character_set_connection
to the character set associated with the
collation (equivalent to executing SET character_set_connection =
@@character_set_database
). It is unnecessary to set character_set_connection
explicitly.
ucs2
, utf16
, utf16le
, and utf32
cannot be used as a client
character set, which means that they do not work for SET NAMES
or SET CHARACTER SET
.
The MySQL client programs mysql
, mysqladmin
, mysqlcheck
, mysqlimport
, and mysqlshow
determine the default character set to use as follows:
In the absence of other information, the programs use the compiled-in default
character set, usually latin1
.
The programs can autodetect which character set to use based on the operating
system setting, such as the value of the LANG
or LC_ALL
locale environment variable on Unix systems or the code page setting on Windows systems. For systems on
which the locale is available from the OS, the client uses it to set the default character set rather
than using the compiled-in default. For example, setting LANG
to ru_RU.KOI8-R
causes the koi8r
character set
to be used. Thus, users can configure the locale in their environment for use by MySQL clients.
The OS character set is mapped to the closest MySQL character set if there is no exact match. If the
client does not support the matching character set, it uses the compiled-in default. For example,
ucs2
is not supported as a connection character set.
C applications can use character set autodetection based on the OS setting by invoking mysql_options()
as follows before connecting to the server:
mysql_options(mysql, MYSQL_SET_CHARSET_NAME, MYSQL_AUTODETECT_CHARSET_NAME);
The programs support a --default-character-set
option, which enables users to specify the
character set explicitly to override whatever default the client otherwise determines.
When a client connects to the server, it sends the name of the character set that it wants to use. The server
uses the name to set the character_set_client
, character_set_results
, and character_set_connection
system variables. In effect, the server performs a
SET NAMES
operation using the character set name.
With the mysql client, to use a character set different from the default,
you could explicitly execute SET NAMES
every time you start up. To accomplish the
same result more easily, add the --default-character-set
option setting to your mysql command line or in your option file. For example, the
following option file setting changes the three connection-related character set variables set to koi8r
each time you invoke mysql:
[mysql]default-character-set=koi8r
If you are using the mysql client with auto-reconnect enabled (which is not
recommended), it is preferable to use the charset
command rather than SET NAMES
. For example:
mysql> charset
utf8
Charset changed
The charset
command issues a SET NAMES
statement, and
also changes the default character set that mysql uses when it reconnects after the connection has
dropped.
Example: Suppose that column1
is defined as CHAR(5) CHARACTER
SET latin2
. If you do not say SET NAMES
or SET
CHARACTER SET
, then for SELECT column1 FROM t
, the server sends back all the
values for column1
using the character set that the client specified when it
connected. On the other hand, if you say SET NAMES 'latin1'
or SET CHARACTER SET latin1
before issuing the SELECT
statement, the server converts the latin2
values to latin1
just before sending results back. Conversion may be lossy if there
are characters that are not in both character sets.
If you want the server to perform no conversion of result sets or error messages, set character_set_results
to NULL
or binary
:
SET character_set_results = NULL;
To see the values of the character set and collation system variables that apply to your connection, use these statements:
SHOW VARIABLES LIKE 'character_set%';SHOW VARIABLES LIKE 'collation%';
You must also consider the environment within which your MySQL applications execute. See Section 10.1.5, "Configuring the Character Set and Collation for Applications".
For more information about character sets and error messages, see Section 10.1.6, "Character Set for Error Messages".