Spec-Zone .ru
спецификации, руководства, описания, API
|
The storage requirements for table data on disk depend on several factors. Different storage engines represent data types and store raw data differently. Table data might be compressed, either for a column or an entire row, complicating the calculation of storage requirements for a table or column.
Despite differences in storage layout on disk, the internal MySQL APIs that communicate and exchange information about table rows use a consistent data structure that applies across all storage engines.
This section includes guidelines and information for the storage requirements for each data type supported by MySQL, including the internal format and size for storage engines that use a fixed-size representation for data types. Information is listed by category or storage engine.
The internal representation of a table has a maximum row size of 65,535 bytes, even if the storage engine is
capable of supporting larger rows. This figure excludes BLOB
or TEXT
columns, which contribute only 9 to 12 bytes toward this size. For BLOB
and TEXT
data, the information is stored internally in a different area of memory
than the row buffer. Different storage engines handle the allocation and storage of this data in different ways,
according to the method they use for handling the corresponding types. For more information, see Chapter
14, Storage Engines, and Section
E.10.4, "Limits on Table Column Count and Row Size".
InnoDB
Tables See Section 14.2.3.12.7, "Physical Row Structure" for
information about storage requirements for InnoDB
tables.
NDBCLUSTER
TablesNDB
tables use 4-byte alignment; all NDB
data storage is done in multiples of 4 bytes. Thus, a column value that would typically take 15 bytes
requires 16 bytes in an NDB
table. For example, in NDB
tables, the TINYINT
, SMALLINT
, MEDIUMINT
, and INTEGER
(INT
) column types each require 4 bytes storage per record due to the
alignment factor.
Each BIT(
column takes M
)M
bits of storage space. Although an individual BIT
column is not 4-byte aligned, NDB
reserves 4 bytes (32 bits) per row for the first 1-32 bits needed for BIT
columns, then another 4 bytes for bits 33-64, and so on.
While a NULL
itself does not require any storage space, NDB
reserves 4 bytes per row if the table definition contains any columns defined as NULL
, up to 32 NULL
columns. (If a MySQL Cluster
table is defined with more than 32 NULL
columns up to 64 NULL
columns, then 8 bytes per row are reserved.)
Every table using the NDB
storage engine requires a primary key; if you do not define a primary key, a
"hidden" primary key is created by NDB
. This hidden primary key consumes 31-35 bytes per table record.
You can use the ndb_size.pl Perl script to estimate NDB
storage requirements. It connects to a current MySQL (not MySQL Cluster)
database and creates a report on how much space that database would require if it used the NDB
storage engine. See Section
17.4.23, "ndb_size.pl — NDBCLUSTER Size Requirement Estimator"
for more information.
Data Type | Storage Required |
---|---|
TINYINT |
1 byte |
SMALLINT |
2 bytes |
MEDIUMINT |
3 bytes |
INT , INTEGER |
4 bytes |
BIGINT |
8 bytes |
FLOAT( |
4 bytes if 0 <= p <= 24, 8 bytes if 25 <= p <= 53
|
FLOAT |
4 bytes |
DOUBLE [PRECISION] , REAL |
8 bytes |
DECIMAL( ,NUMERIC( |
Varies; see following discussion |
BIT( |
approximately (M +7)/8 bytes |
Values for DECIMAL
(and NUMERIC
) columns are represented using a binary format that packs nine decimal
(base 10) digits into four bytes. Storage for the integer and fractional parts of each value are determined
separately. Each multiple of nine digits requires four bytes, and the "leftover"
digits require some fraction of four bytes. The storage required for excess digits is given by the following
table.
Leftover Digits | Number of Bytes |
---|---|
0 | 0 |
1 | 1 |
2 | 1 |
3 | 2 |
4 | 2 |
5 | 3 |
6 | 3 |
7 | 4 |
8 | 4 |
For TIME
,
DATETIME
, and TIMESTAMP
columns, the storage required for tables created before MySQL 5.6.4
differs from tables created from 5.6.4 on. This is due to a change in 5.6.4 that permits these types to have a
fractional part, which requires from 0 to 3 bytes.
Data Type | Storage Required Before MySQL 5.6.4 | Storage Required as of MySQL 5.6.4 |
---|---|---|
YEAR |
1 byte | 1 byte |
DATE |
3 bytes | 3 bytes |
TIME |
3 bytes | 3 bytes + fractional seconds storage |
DATETIME
|
8 bytes | 5 bytes + fractional seconds storage |
TIMESTAMP
|
4 bytes | 4 bytes + fractional seconds storage |
As of MySQL 5.6.4, storage for YEAR
and DATE
remains unchanged. However, TIME
, DATETIME
, and TIMESTAMP
are represented differently. DATETIME
is packed more efficiently, requiring 5 rather than 8 bytes for the
nonfractional part, and all three parts have a fractional part that requires from 0 to 3 bytes, depending on the
fractional seconds precision of stored values.
Fractional Seconds Precision | Storage Required |
---|---|
0 | 0 bytes |
1, 2 | 1 byte |
3, 4 | 2 bytes |
5, 6 | 3 bytes |
For example, TIME(0)
,
TIME(2)
, TIME(4)
, and TIME(6)
use 3, 4, 5, and 6 bytes, respectively. TIME
and TIME(0)
are equivalent and require the same storage.
For details about internal representation of temporal values, see
In the following table, M
represents the declared column length in
characters for nonbinary string types and bytes for binary string types. L
represents the actual length in bytes of a given string value.
Data Type | Storage Required |
---|---|
CHAR( |
M × w bytes, 0
<= 255, where w is the number of bytes required for the
maximum-length character in the character set
|
BINARY( |
M bytes, 0 <= 255
|
VARCHAR( , VARBINARY( |
L + 1 bytes if column values require 0 – 255 bytes, L + 2 bytes if values may require more than 255 bytes
|
TINYBLOB , TINYTEXT |
L + 1 bytes, where L
< 28 |
BLOB , TEXT |
L + 2 bytes, where L
< 216 |
MEDIUMBLOB ,
MEDIUMTEXT |
L + 3 bytes, where L
< 224 |
LONGBLOB , LONGTEXT |
L + 4 bytes, where L
< 232 |
ENUM(' |
1 or 2 bytes, depending on the number of enumeration values (65,535 values maximum) |
SET(' |
1, 2, 3, 4, or 8 bytes, depending on the number of set members (64members maximum) |
Variable-length string types are stored using a length prefix plus data. The length prefix requires from one to
four bytes depending on the data type, and the value of the prefix is L
(the byte length of the string). For example, storage for a MEDIUMTEXT
value requires L
bytes to store the value plus three bytes to store
the length of the value.
To calculate the number of bytes used to store a particular CHAR
, VARCHAR
, or TEXT
column value, you must take into account the character set used for that
column and whether the value contains multi-byte characters. In particular, when using the utf8
(or utf8mb4
) Unicode character set, you must
keep in mind that not all characters use the same number of bytes and can require up to three (four) bytes per
character. For a breakdown of the storage used for different categories of utf8
or
utf8mb4
characters, see Section 10.1.10,
"Unicode Support".
VARCHAR
,
VARBINARY
, and the BLOB
and TEXT
types are variable-length types. For each, the storage requirements depend on these factors:
The actual length of the column value
The column's maximum possible length
The character set used for the column, because some character sets contain multi-byte characters
For example, a VARCHAR(255)
column can hold a string with a maximum length of 255
characters. Assuming that the column uses the latin1
character set (one byte per
character), the actual storage required is the length of the string (L
),
plus one byte to record the length of the string. For the string 'abcd'
, L
is 4 and the storage requirement is five bytes. If the same
column is instead declared to use the ucs2
double-byte character set, the storage
requirement is 10 bytes: The length of 'abcd'
is eight bytes and the column
requires two bytes to store lengths because the maximum length is greater than 255 (up to 510 bytes).
The effective maximum number of bytes that can be stored in a VARCHAR
or VARBINARY
column is subject to the maximum row size of 65,535 bytes, which is
shared among all columns. For a VARCHAR
column that stores multi-byte characters, the effective maximum
number of characters is less. For example, utf8
characters can require up to three bytes per character, so a VARCHAR
column that uses the utf8
character set
can be declared to be a maximum of 21,844 characters. See Section
E.10.4, "Limits on Table Column Count and Row Size".
The NDB
storage engine supports variable-width columns. This means that a VARCHAR
column in a MySQL Cluster table requires the same amount of storage
as would any other storage engine, with the exception that such values are 4-byte aligned. Thus, the string
'abcd'
stored in a VARCHAR(50)
column using the latin1
character set requires 8 bytes (rather than 6 bytes for the same column
value in a MyISAM
table).
TEXT
and BLOB
columns are implemented differently in the NDB
storage engine, wherein each row in a TEXT
column is made up of two separate parts. One of these is of fixed size (256
bytes), and is actually stored in the original table. The other consists of any data in excess of 256 bytes,
which is stored in a hidden table. The rows in this second table are always 2,000 bytes long. This means that
the size of a TEXT
column is 256 if size
<=
256 (where size
represents the size of the row); otherwise, the size
is 256 + size
+ (2000 – (size
– 256) % 2000).
The size of an ENUM
object is determined by the number of different enumeration values. One byte is used for enumerations with up to
255 possible values. Two bytes are used for enumerations having between 256 and 65,535 possible values. See Section 11.4.4, "The ENUM
Type".
The size of a SET
object is determined by the number of different set members. If the set size
is N
, the object occupies (
bytes, rounded up to 1, 2, 3, 4, or 8 bytes. A N
+7)/8SET
can have a maximum of 64 members. See Section
11.4.5, "The SET
Type".