Spec-Zone .ru
спецификации, руководства, описания, API
|
Metadata is "the data about the data." Anything that describes
the database—as opposed to being the contents of the database—is
metadata. Thus column names, database names, user names, version names, and most of the string results from SHOW
are metadata. This is also true of the contents of tables in INFORMATION_SCHEMA
because those tables by definition contain information about
database objects.
Representation of metadata must satisfy these requirements:
All metadata must be in the same character set. Otherwise, neither the SHOW
statements nor SELECT
statements for tables in INFORMATION_SCHEMA
would work properly because different rows in the same
column of the results of these operations would be in different character sets.
Metadata must include all characters in all languages. Otherwise, users would not be able to name columns and tables using their own languages.
To satisfy both requirements, MySQL stores metadata in a Unicode character set, namely UTF-8. This does not cause any disruption if you never use accented or non-Latin characters. But if you do, you should be aware that metadata is in UTF-8.
The metadata requirements mean that the return values of the USER()
, CURRENT_USER()
, SESSION_USER()
, SYSTEM_USER()
, DATABASE()
, and VERSION()
functions have the UTF-8 character set by default.
The server sets the character_set_system
system variable to the name of the metadata character set:
mysql> SHOW VARIABLES LIKE
'character_set_system';
+----------------------+-------+| Variable_name | Value |+----------------------+-------+| character_set_system | utf8 |+----------------------+-------+
Storage of metadata using Unicode does not mean that the server returns
headers of columns and the results of DESCRIBE
functions in the character_set_system
character set by default. When you use SELECT column1 FROM t
, the name column1
itself is
returned from the server to the client in the character set determined by the value of the character_set_results
system variable, which has a default value of latin1
. If you want the server to pass metadata results back in a different
character set, use the SET NAMES
statement to force the server to perform character
set conversion. SET NAMES
sets the character_set_results
and other related system variables. (See Section 10.1.4, "Connection Character Sets and
Collations".) Alternatively, a client program can perform the conversion after receiving the result from
the server. It is more efficient for the client to perform the conversion, but this option is not always
available for all clients.
If character_set_results
is set to NULL
, no conversion is performed and the server returns metadata using
its original character set (the set indicated by character_set_system
).
Error messages returned from the server to the client are converted to the client character set automatically, as with metadata.
If you are using (for example) the USER()
function for comparison or assignment within a single statement, don't worry. MySQL performs some automatic
conversion for you.
SELECT * FROM t1 WHERE USER() = latin1_column;
This works because the contents of latin1_column
are automatically converted to
UTF-8 before the comparison.
INSERT INTO t1 (latin1_column) SELECT USER();
This works because the contents of USER()
are automatically converted to latin1
before the assignment.
Although automatic conversion is not in the SQL standard, the SQL standard document does say that every character set is (in terms of supported characters) a "subset" of Unicode. Because it is a well-known principle that "what applies to a superset can apply to a subset," we believe that a collation for Unicode can apply for comparisons with non-Unicode strings. For more information about coercion of strings, see Section 10.1.7.5, "Collation of Expressions".