This section describes issues pertaining to Unicode support that you may face when upgrading to MySQL 5.7 from an older MySQL release. It also provides guidelines for downgrading from MySQL 5.7 to an older release.
In most respects, upgrading to MySQL 5.7 should present few problems with regard to Unicode usage, although there are some potential areas of incompatibility. These are the primary areas of concern:
Consequently, if you want to upgrade tables from
to take advantage of supplementary-character support, it may be necessary to change some column or index
Tables can be converted from
utf8mb4 by using
TABLE. Suppose that a table was originally defined as follows:
CREATE TABLE t1 ( col1 CHAR(10) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL, col2 CHAR(10) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL) CHARACTER SET utf8;
The following statement converts
t1 to use
ALTER TABLE t1 DEFAULT CHARACTER SET utf8mb4, MODIFY col1 CHAR(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL, MODIFY col2 CHAR(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL;
In terms of table content, conversion from
utf8mb4 presents no problems:
For a BMP character,
have identical storage characteristics: same code values, same encoding, same length.
For a supplementary character,
utf8 cannot store the
character at all, while
utf8mb4 requires four bytes to store it. Since
utf8 cannot store the character at all, you do not have any supplementary
utf8 columns and you need not worry about converting
characters or losing data when upgrading
utf8 data from older versions of
In terms of table structure, the catch when converting from
utf8mb4 is that the maximum length of a column or index key is unchanged in terms of
bytes. Therefore, it is smaller in terms of characters
because the maximum length of a character is four bytes instead of three. For the
data types, watch for these things when converting your MySQL tables:
Check all definitions of
utf8 columns and make sure
they will not exceed the maximum length for the storage engine.
Check all indexes on
utf8 columns and make sure they
will not exceed the maximum length for the storage engine. Sometimes the maximum can change due to
storage engine enhancements.
If the preceding conditions apply, you must either reduce the defined length of columns or indexes, or continue
utf8 rather than
Here are some examples where structural changes may be needed:
column can hold up to 255 bytes, so it can hold up to 85 3-byte or 63 4-byte characters. Suppose that
you have a
TINYTEXT column that uses
utf8 but must
be able to contain more than 63 characters. You cannot convert it to
utf8mb4 unless you also change the data type to a longer type such as
InnoDB has a maximum index length of 767 bytes, so for
utf8mb4 columns, you can index a
maximum of 255 or 191 characters, respectively. If you currently have
columns with indexes longer than 191 characters, you will need to index a smaller number of characters.
InnoDB table, these column and index definitions are legal:
col1 VARCHAR(500) CHARACTER SET utf8, INDEX (col1(255))
utf8mb4 instead, the index must be smaller:
col1 VARCHAR(500) CHARACTER SET utf8mb4, INDEX (col1(191))
The preceding types of changes are most likely to be required only if you have very long columns or indexes.
Otherwise, you should be able to convert your tables from
utf8mb4 without problems. You can do this by using
ALTER TABLE as described earlier in this section after upgrading in place to
The following items summarize other potential areas of incompatibility:
Performance of 4-byte UTF-8 (
utf8mb4) is slower than
for 3-byte UTF-8 (
utf8). If you do not want to incur this penalty, continue
SET NAMES 'utf8mb4' causes use of the 4-byte character
set for connection character sets. As long as no 4-byte characters are sent from the server, there
should be no problems. Otherwise, applications that expect to receive a maximum of three bytes per
character may have problems. Conversely, applications that expect to send 4-byte characters must ensure
that the server understands them.
Applications cannot send
utf32 character data to an older server that does not understand them.
For replication, if the character sets that support supplementary characters are
going to be used on the master, all slaves must understand them as well. If you attempt to replicate
from a MySQL 5.7 master to an older slave,
utf8 data will be seen as
utf8 by the slave and should replicate correctly. But you cannot send
Also, keep in mind the general principle that if a table has different definitions on the master and
slave, this can lead to unexpected results. For example, the differences in limitations on index key
length makes it risky to use
utf8 on the master and
utf8mb4 on the slave.
If you have upgraded to MySQL 5.7, and then decide to downgrade back to an older release, these considerations apply:
utf8 data should
present no problems.
Any definitions that refer to the
utf32 character sets will not be recognized by the older server.
For object definitions that refer to the
character set, you can dump them with mysqldump in MySQL 5.7, edit the dump file to change
utf8, and reload the
file in the older server, as long as there are no 4-byte characters in the data. The older server will
utf8 in the dump file object definitions and create new objects that
use the (3-byte)
utf8 character set.