Spec-Zone .ru
спецификации, руководства, описания, API
|
If you have existing tables, and applications that use them, that you want to convert to InnoDB
for better reliability and scalability, use the following guidelines and tips. This section assumes most such
tables were originally MyISAM
, which was formerly the default.
MyISAM
, Increase
Memory Usage for InnoDB
As you transition away from MyISAM
tables, lower the value of the key_buffer_size
configuration option to free memory no longer needed for caching
results. Increase the value of the innodb_buffer_pool_size
configuration option, which performs a similar role
of allocating cache memory for InnoDB
tables. The InnoDB
buffer
pool caches both table data and index data, so it does double duty in speeding up lookups for queries
and keeping query results in memory for reuse.
Allocate as much memory to this option as you can afford, often up to 80% of physical memory on the server.
If the operating system runs short of memory for other processes and begins to
swap, reduce the innodb_buffer_pool_size
value. Swapping is such an expensive
operation that it drastically reduces the benefit of the cache memory.
If the innodb_buffer_pool_size
value is several gigabytes or higher, consider
increasing the values of innodb_buffer_pool_instances
. Doing so helps on busy servers where
many connections are reading data into the cache at the same time.
On a busy server, run benchmarks with the Query Cache turned off. The InnoDB
buffer pool provides similar benefits, so the Query Cache might be
tying up memory unnecessarily.
Because MyISAM
tables do not support transactions, you
might not have paid much attention to the autocommit
configuration option and the COMMIT
and ROLLBACK
statements. These keywords are important to allow multiple sessions
to read and write InnoDB
tables concurrently, providing substantial scalability
benefits in write-heavy workloads.
While a transaction is open, the system keeps a snapshot of the data as seen at the beginning of the transaction, which can cause substantial overhead if the system inserts, updates, and deletes millions of rows while a stray transaction keeps running. Thus, take care to avoid transactions that run for too long:
If you are using a mysql session for interactive experiments, always COMMIT
(to finalize the changes) or ROLLBACK
(to undo the changes) when finished. Close down interactive
sessions rather than leaving them open for long periods, to avoid keeping transactions open for long
periods by accident.
Make sure that any error handlers in your application also ROLLBACK
incomplete changes or COMMIT
completed changes.
ROLLBACK
is a relatively expensive operation, because INSERT
, UPDATE
, and DELETE
operations are written to InnoDB
tables prior to the COMMIT
, with the expectation that most changes will be committed
successfully and rollbacks will be rare. When experimenting with large volumes of data, avoid making
changes to large numbers of rows and then rolling back those changes.
When loading large volumes of data with a sequence of INSERT
statements, periodically COMMIT
the results to avoid having transactions that last for hours. In
typical load operations for data warehousing, if something goes wrong, you TRUNCATE TABLE
and start over from the beginning rather than doing a
ROLLBACK
.
The preceding tips save memory and disk space that can be wasted during too-long transactions. When transactions
are shorter than they should be, the problem is excessive I/O. With each COMMIT
, MySQL makes sure each change is safely recorded to disk, which
involves some I/O.
For most operations on InnoDB
tables, you should use
the setting autocommit=0
.
From an efficiency perspective, this avoids unnecessary I/O when you issue large numbers of consecutive
INSERT
, UPDATE
,
or DELETE
statements. From a safety perspective, this allows you to issue a ROLLBACK
statement to recover lost or garbled data if you make a
mistake on the mysql command line, or in an exception handler in
your application.
The time when autocommit=1
is suitable for InnoDB
tables
is when running a sequence of queries for generating reports or analyzing statistics. In this situation,
there is no I/O penalty related to COMMIT
or ROLLBACK
, and InnoDB
can automatically
optimize the read-only workload.
If you make a series of related changes, finalize all those changes at once with a
single COMMIT
at the end. For example, if you insert related pieces of
information into several tables, do a single COMMIT
after making all the changes. Or if you run many consecutive
INSERT
statements, do a single COMMIT
after all the data is loaded; if you are doing millions of INSERT
statements, perhaps split up the huge transaction by issuing a COMMIT
every ten thousand or hundred thousand records, so the
transaction does not grow too large.
Remember that even a SELECT
statement opens a transaction, so after running some report or
debugging queries in an interactive mysql session, either issue a COMMIT
or close the mysql session.
You might see warning messages referring to "deadlocks" in
the MySQL error log, or the output of SHOW ENGINE
INNODB STATUS
. Despite the scary-sounding name, a deadlock is not a
serious issue for InnoDB
tables, and often does not require any corrective action.
When two transactions start modifying multiple tables, accessing the tables in a different order, they can reach
a state where each transaction is waiting for the other and neither can proceed. MySQL immediately detects this
condition and cancels (rolls back) the
"smaller" transaction, allowing the other to proceed.
Your applications do need error-handling logic to restart a transaction that is forcibly cancelled like this. When you re-issue the same SQL statements as before, the original timing issue no longer applies: either the other transaction has already finished and yours can proceed, or the other transaction is still in progress and your transaction waits until it finishes.
If deadlock warnings occur constantly, you might review the application code to reorder the SQL operations in a
consistent way, or to shorten the transactions. You can test with the innodb_print_all_deadlocks
option enabled to see all deadlock warnings in the
MySQL error log, rather than only the last warning in the SHOW ENGINE INNODB STATUS
output.
To get the best performance from InnoDB
tables, you can adjust a number of
parameters related to storage layout.
When you convert MyISAM
tables that are large, frequently accessed, and hold vital
data, investigate and consider the innodb_file_per_table
, innodb_file_format
, and innodb_page_size
configuration options, and the ROW_FORMAT
and KEY_BLOCK_SIZE
clauses of the CREATE TABLE
statement.
During your initial experiments, the most important setting is innodb_file_per_table
. Enabling this option before creating new InnoDB
tables ensures that the InnoDB
system
tablespace files do not allocate disk space permanently for all the InnoDB
data. With innodb_file_per_table
enabled, DROP TABLE
and TRUNCATE TABLE
free disk space as you would expect.
To convert a non-InnoDB
table to use InnoDB
use ALTER
TABLE
:
ALTER TABLE table_name
ENGINE=InnoDB;
Do not convert MySQL system tables in the mysql
database (such as
user
or host
) to the InnoDB
type. This is an unsupported operation. The system tables must always
be of the MyISAM
type.
You might make an InnoDB table that is a clone of a MyISAM table, rather than doing the ALTER TABLE
conversion, to test the old and new table side-by-side before
switching.
Create an empty InnoDB
table with identical column and index definitions. Use show create table
to see the
full table_name
\GCREATE TABLE
statement to use. Change the ENGINE
clause to ENGINE=INNODB
.
To transfer a large volume of data into an empty InnoDB
table created as shown in
the previous section, insert the rows with INSERT INTO
.
innodb_table
SELECT * FROM myisam_table
ORDER BY primary_key_columns
You can also create the indexes for the InnoDB
table after inserting the data.
Historically, creating new secondary indexes was a slow operation for InnoDB, but now you can create the indexes
after the data is loaded with relatively little overhead from the index creation step.
If you have UNIQUE
constraints on secondary keys, you can speed up a table import
by turning off the uniqueness checks temporarily during the import operation:
SET unique_checks=0;... import operation ...
SET unique_checks=1;
For big tables, this saves disk I/O because InnoDB
can use its insert
buffer to write secondary index records as a batch. Be certain that the data contains no duplicate keys. unique_checks
permits but does not require storage engines to ignore duplicate
keys.
To get better control over the insertion process, you might insert big tables in pieces:
INSERT INTO newtable SELECT * FROM oldtable WHERE yourkey >something
AND yourkey <=somethingelse
;
After all records have been inserted, you can rename the tables.
During the conversion of big tables, increase the size of the InnoDB
buffer pool to
reduce disk I/O, to a maximum of 80% of physical memory. You can also increase the sizes of the InnoDB
log files.
By this point, as already mentioned, you should already have the innodb_file_per_table
option enabled, so that if you temporarily make several
copies of your data in InnoDB
tables, you can recover all that disk space by
dropping unneeded tables afterward.
Whether you convert the MyISAM
table directly or create a cloned InnoDB
table, make sure that you have sufficient disk space to hold both the old and
new tables during the process. InnoDB
tables require more disk space than MyISAM
tables. If an ALTER
TABLE
operation runs out of space, it starts a rollback, and that can take hours if it is
disk-bound. For inserts, InnoDB
uses the insert buffer to merge secondary index
records to indexes in batches. That saves a lot of disk I/O. For rollback, no such mechanism is used, and the
rollback can take 30 times longer than the insertion.
In the case of a runaway rollback, if you do not have valuable data in your database, it may be advisable to
kill the database process rather than wait for millions of disk I/O operations to complete. For the complete
procedure, see Section 14.2.4.6, "Starting InnoDB
on a Corrupted Database".
PRIMARY KEY
for Each
Table The PRIMARY KEY
clause is a critical factor affecting the performance of MySQL
queries and the space usage for tables and indexes. Perhaps you have phoned a financial institution where you
are asked for an account number. If you do not have the number, you are asked for a dozen different pieces of
information to "uniquely identify" yourself. The primary
key is like that unique account number that lets you get straight down to business when querying or modifying
the information in a table. Every row in the table must have a primary key value, and no two rows can have the
same primary key value.
Here are some guidelines for the primary key, followed by more detailed explanations.
Declare a PRIMARY KEY
for each table. Typically, it is
the most important column that you refer to in WHERE
clauses when looking
up a single row.
Declare the PRIMARY KEY
clause in the original CREATE TABLE
statement, rather than adding it later through an ALTER TABLE
statement.
Choose the column and its data type carefully. Prefer numeric columns over character or string ones.
Consider using an auto-increment column if there is not another stable, unique, non-null, numeric column to use.
An auto-increment column is also a good choice if there is any doubt whether the value of the primary key column could ever change. Changing the value of a primary key column is an expensive operation, possibly involving rearranging data within the table and within each secondary index.
Consider adding a primary key to any table that does not already have one. Use the smallest practical numeric type based on the maximum projected size of the table. This can make each row slightly more compact, which can yield substantial space savings for large tables. The space savings are multiplied if the table has any secondary indexes, because the primary key value is repeated in each secondary index entry. In addition to reducing data size on disk, a small primary key also lets more data fit into the buffer pool, speeding up all kinds of operations and improving concurrency.
If the table already has a primary key on some longer column, such as a VARCHAR
,
consider adding a new unsigned AUTO_INCREMENT
column and switching the primary key
to that, even if that column is not referenced in queries. This design change can produce substantial space
savings in the secondary indexes. You can designate the former primary key columns as UNIQUE
NOT NULL
to enforce the same constraints as the PRIMARY KEY
clause, that
is, to prevent duplicate or null values across all those columns.
If you spread related information across multiple tables, typically each table uses the same column for its primary key. For example, a personnel database might have several tables, each with a primary key of employee number. A sales database might have some tables with a primary key of customer number, and other tables with a primary key of order number. Because lookups using the primary key are very fast, you can construct efficient join queries for such tables.
If you leave the PRIMARY KEY
clause out entirely, MySQL creates an invisible one
for you. It is a 6-byte value that might be longer than you need, thus wasting space. Because it is hidden, you
cannot refer to it in queries.
The extra reliability and scalability features of InnoDB
do require more disk
storage than equivalent MyISAM
tables. You might change the column and index
definitions slightly, for better space utilization, reduced I/O and memory consumption when processing result
sets, and better query optimization plans making efficient use of index lookups.
If you do set up a numeric ID column for the primary key, use that value to cross-reference with related values
in any other tables, particularly for join
queries. For example, rather than accepting a country name as input and doing queries searching for the same
name, do one lookup to determine the country ID, then do other queries (or a single join query) to look up
relevant information across several tables. Rather than storing a customer or catalog item number as a string of
digits, potentially using up several bytes, convert it to a numeric ID for storing and querying. A 4-byte
unsigned INT
column can index over 4 billion items (with the US meaning of billion:
1000 million). For the ranges of the different integer types, see Section
11.2.1, "Integer Types (Exact Value) - INTEGER
, INT
, SMALLINT
, TINYINT
, MEDIUMINT
, BIGINT
".
InnoDB
TablesInnoDB
files require more care and planning than MyISAM
files do:
You must not delete the ibdata files that represent the
InnoDB
system tablespace.
Copying InnoDB tables from one server to another requires issuing the FLUSH TABLES ... FOR EXPORT
statement first, and copying the
file along with the
table_name
.cfg
file.table_name
.ibd