Spec-Zone .ru
спецификации, руководства, описания, API

21.3.5.4. Using Character Sets and Unicode

All strings sent from the JDBC driver to the server are converted automatically from native Java Unicode form to the client character encoding, including all queries sent using Statement.execute(), Statement.executeUpdate(), Statement.executeQuery() as well as all PreparedStatement and CallableStatement parameters with the exclusion of parameters set using setBytes(), setBinaryStream(), setAsciiStream(), setUnicodeStream() and setBlob().

Number of Encodings Per Connection

In MySQL Server 4.1 and higher, Connector/J supports a single character encoding between client and server, and any number of character encodings for data returned by the server to the client in ResultSets.

Prior to MySQL Server 4.1, Connector/J supported a single character encoding per connection, which could either be automatically detected from the server configuration, or could be configured by the user through the useUnicode and characterEncoding properties.

Setting the Character Encoding

The character encoding between client and server is automatically detected upon connection. You specify the encoding on the server using the character_set_server for server versions 4.1.0 and newer, and character_set system variable for server versions older than 4.1.0. The driver automatically uses the encoding specified by the server. For more information, see Section 10.1.3.1, "Server Character Set and Collation".

For example, to use 4-byte UTF-8 character sets with Connector/J, configure the MySQL server with character_set_server=utf8mb4, and leave characterEncoding out of the Connector/J connection string. Connector/J will then autodetect the UTF-8 setting.

To override the automatically detected encoding on the client side, use the characterEncoding property in the URL used to connect to the server.

To allow multiple character sets to be sent from the client, use the UTF-8 encoding, either by configuring utf8 as the default server character set, or by configuring the JDBC driver to use UTF-8 through the characterEncoding property.

When specifying character encodings on the client side, use Java-style names. The following table lists MySQL character set names and the corresponding Java-style names:

Table 21.26. MySQL to Java Encoding Name Translations

MySQL Character Set Name Java-Style Character Encoding Name
ascii US-ASCII
big5 Big5
gbk GBK
sjis SJIS (or Cp932 or MS932 for MySQL Server < 4.1.11)
cp932 Cp932 or MS932 (MySQL Server > 4.1.11)
gb2312 EUC_CN
ujis EUC_JP
euckr EUC_KR
latin1 Cp1252
latin2 ISO8859_2
greek ISO8859_7
hebrew ISO8859_8
cp866 Cp866
tis620 TIS620
cp1250 Cp1250
cp1251 Cp1251
cp1257 Cp1257
macroman MacRoman
macce MacCentralEurope
utf8 UTF-8
ucs2 UnicodeBig

Warning

Do not issue the query set names with Connector/J, as the driver will not detect that the character set has changed, and will continue to use the character set detected during the initial connection setup.