10.1.2. Character Sets and Collations in MySQL

The MySQL server can support multiple character sets. To list the available character sets, use the SHOW CHARACTER SET statement. A partial listing follows. For more complete information, see Section 10.1.14, "Character Sets and Collations That MySQL Supports".

mysql> SHOW CHARACTER SET;
+----------+-----------------------------+---------------------+--------+
| Charset  | Description                 | Default collation   | Maxlen |
+----------+-----------------------------+---------------------+--------+
| big5     | Big5 Traditional Chinese    | big5_chinese_ci     |      2 |
| dec8     | DEC West European           | dec8_swedish_ci     |      1 |
| cp850    | DOS West European           | cp850_general_ci    |      1 |
| hp8      | HP West European            | hp8_english_ci      |      1 |
| koi8r    | KOI8-R Relcom Russian       | koi8r_general_ci    |      1 |
| latin1   | cp1252 West European        | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European | latin2_general_ci   |      1 |
| swe7     | 7bit Swedish                | swe7_swedish_ci     |      1 |
| ascii    | US ASCII                    | ascii_general_ci    |      1 |
| ujis     | EUC-JP Japanese             | ujis_japanese_ci    |      3 |
| sjis     | Shift-JIS Japanese          | sjis_japanese_ci    |      2 |
| hebrew   | ISO 8859-8 Hebrew           | hebrew_general_ci   |      1 |
| tis620   | TIS620 Thai                 | tis620_thai_ci      |      1 |
| euckr    | EUC-KR Korean               | euckr_korean_ci     |      2 |
| koi8u    | KOI8-U Ukrainian            | koi8u_general_ci    |      1 |
| gb2312   | GB2312 Simplified Chinese   | gb2312_chinese_ci   |      2 |
| greek    | ISO 8859-7 Greek            | greek_general_ci    |      1 |
| cp1250   | Windows Central European    | cp1250_general_ci   |      1 |
| gbk      | GBK Simplified Chinese      | gbk_chinese_ci      |      2 |
| latin5   | ISO 8859-9 Turkish          | latin5_turkish_ci   |      1 |
...
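The full listing can be long, so it may help to narrow it down. The following is a minimal sketch, assuming a server that accepts the LIKE and WHERE clauses of SHOW CHARACTER SET; the pattern and the Maxlen threshold are illustrative choices, not values taken from this section.

```sql
-- Only character sets whose name starts with "latin"
SHOW CHARACTER SET LIKE 'latin%';

-- Only multibyte character sets (more than one byte per character)
SHOW CHARACTER SET WHERE Maxlen > 1;
```

LIKE matches against the Charset column, while WHERE can filter on any column of the result.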

Every character set has at least one collation, and many have several. To list the collations for a character set, use the SHOW COLLATION statement. For example, to see the collations for the latin1 (cp1252 West European) character set, use this statement to find the collation names that begin with latin1:

mysql> SHOW COLLATION LIKE 'latin1%';
+---------------------+---------+----+---------+----------+---------+
| Collation           | Charset | Id | Default | Compiled | Sortlen |
+---------------------+---------+----+---------+----------+---------+
| latin1_german1_ci   | latin1  |  5 |         |          |       0 |
| latin1_swedish_ci   | latin1  |  8 | Yes     | Yes      |       1 |
| latin1_danish_ci    | latin1  | 15 |         |          |       0 |
| latin1_german2_ci   | latin1  | 31 |         | Yes      |       2 |
| latin1_bin          | latin1  | 47 |         | Yes      |       1 |
| latin1_general_ci   | latin1  | 48 |         |          |       0 |
| latin1_general_cs   | latin1  | 49 |         |          |       0 |
| latin1_spanish_ci   | latin1  | 94 |         |          |       0 |
+---------------------+---------+----+---------+----------+---------+
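Matching on the name prefix works because collation names start with the character set name, but the same information can also be requested by character set directly. This is a sketch under the assumption that the server supports the WHERE clause of SHOW COLLATION and exposes the INFORMATION_SCHEMA.COLLATIONS table; the set of rows returned depends on the server version and build.

```sql
-- Filter on the Charset column instead of the collation name
SHOW COLLATION WHERE Charset = 'latin1';

-- Look up only the default collation for latin1
SELECT COLLATION_NAME
FROM INFORMATION_SCHEMA.COLLATIONS
WHERE CHARACTER_SET_NAME = 'latin1'
  AND IS_DEFAULT = 'Yes';
```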

The latin1 collations have the following meanings.

Collation           Meaning
latin1_german1_ci   German DIN-1
latin1_swedish_ci   Swedish/Finnish
latin1_danish_ci    Danish/Norwegian
latin1_german2_ci   German DIN-2
latin1_bin          Binary according to latin1 encoding
latin1_general_ci   Multilingual (Western European)
latin1_general_cs   Multilingual (ISO Western European), case sensitive
latin1_spanish_ci   Modern Spanish
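The suffix of a collation name (_ci, _cs, or _bin) hints at how it treats lettercase. A quick way to see the difference is to compare the same two strings under several collations. The sketch below assumes that the latin1 collations listed above are available on the server; the column aliases are made up for illustration.

```sql
-- Compare 'a' with 'A' under three latin1 collations.
-- The _latin1 introducer marks each literal as a latin1 string.
SELECT _latin1'a' COLLATE latin1_general_ci = _latin1'A' AS ci_equal,
       _latin1'a' COLLATE latin1_general_cs = _latin1'A' AS cs_equal,
       _latin1'a' COLLATE latin1_bin        = _latin1'A' AS bin_equal;
```

The case-insensitive comparison should return 1, while the case-sensitive and binary comparisons should return 0.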

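Collations have these general characteristics:

Two different character sets cannot have the same collation.

Each character set has one collation that is the default collation. For example, the default collation for latin1 is latin1_swedish_ci. The output of SHOW CHARACTER SET indicates which collation is the default for each displayed character set.

There is a convention for collation names: they start with the name of the character set with which they are associated, they usually include a language name, and they end with _ci (case insensitive), _cs (case sensitive), or _bin (binary).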

In cases where a character set has multiple collations, it might not be clear which collation is most suitable for a given application. To avoid choosing the wrong collation, it can be helpful to perform some comparisons with representative data values to make sure that a given collation sorts values the way you expect.
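One way to run such a comparison is to load a few representative values into a scratch table and sort them under each candidate collation. This is only a sketch: the table name, column name, and sample values are hypothetical, a default database must be selected, and the connection character set must be able to transmit the non-ASCII character correctly.

```sql
-- Scratch table holding sample values in the latin1 character set
CREATE TEMPORARY TABLE collation_demo (
    name VARCHAR(20) CHARACTER SET latin1
);

INSERT INTO collation_demo (name)
VALUES ('Muffler'), ('Müller'), ('MX Systems'), ('MySQL');

-- Sort the same rows under three different collations and compare the order
SELECT name FROM collation_demo ORDER BY name COLLATE latin1_swedish_ci;
SELECT name FROM collation_demo ORDER BY name COLLATE latin1_german1_ci;
SELECT name FROM collation_demo ORDER BY name COLLATE latin1_german2_ci;
```

If the position of the value containing ü differs between the result sets, the collations disagree on how that character sorts, and the one that matches the application's expectations is the better choice.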

Collation-Charts.Org is a useful site for information that shows how one collation compares to another.