Spec-Zone .ru
спецификации, руководства, описания, API
|
As of MySQL 5.6.5, the Performance Schema maintains statement digest information. Digesting converts a SQL statement to normalized form and computes a hash value for the result. Normalization permits statements that are similar to be grouped and summarized to expose information about the types of statements the server is executing and how often they occur. This section describes how statement normalizing occurs and how it can be useful.
Statement digesting involves these Performance Schema components:
A statement_digest
consumer in the setup_consumers
table controls whether the Performance Schema maintains
digest information.
The statement event tables (events_statements_current
, events_statements_history
, and events_statements_history_long
) have DIGEST
and DIGEST_TEXT
columns that contain digest MD5 values and the
corresponding normalized statement text strings.
A events_statements_summary_by_digest
table provides aggregated statement digest information.
Normalizing a statement transforms the statement text to a more standardized string representation that preserves the general statement structure while removing information not essential to the structure. Object identifiers such as database and table names are preserved. Values and comments are removed, and whitespace is adjusted. The Performance Schema does not retain information such as names, passwords, dates, and so forth.
Consider these statements:
SELECT * FROM orders WHERE customer_id=10 AND quantity>20SELECT * FROM orders WHERE customer_id = 20 AND quantity > 100
To normalize these statements, the Performance Schema replaces data values by ?
and
adjusts whitespace. Both statements yield the same normalized form and thus are considered "the same":
SELECT * FROM orders WHERE customer_id = ? AND quantity > ?
The normalized statement contains less information but is still representative of the original statement. Other similar statements that have different comparison values have the same normalized form.
Now consider these statements:
SELECT * FROM customers WHERE customer_id = 1000SELECT * FROM orders WHERE customer_id = 1000
In this case, the statements are not "the same." The object identifiers differ, so the statements yield different normalized forms:
SELECT * FROM customers WHERE customer_id = ?SELECT * FROM orders WHERE customer_id = ?
Normalized statements have a fixed length. The maximum length of a DIGEST_TEXT
value is 1024 bytes. There is no option to change this maximum. If normalization produces a statement that
exceeds this length, the text ends with "...". Long
statements that differ only in the part that occurs following the "..." are considered to be the same. Consider these statements:
SELECT * FROM mytable WHERE cola = 10 AND colb = 20SELECT * FROM mytable WHERE cola = 10 AND colc = 20
If the cutoff happened to be right after the AND
, both statements would have this
normalized form:
SELECT * FROM mytable WHERE cola = ? AND ...
In this case, the difference in the second column name is lost and both statements are considered the same.
For each normalized statement, the Performance Schema computes a hash digest value and stores that value and the
statement in the DIGEST
and DIGEST_TEXT
columns of the
statement event tables (events_statements_current
,
events_statements_history
,
and events_statements_history_long
).
In addition, information for statements with the same SCHEMA_NAME
and DIGEST
values are aggregated in the events_statements_summary_by_digest
summary table. The Performance Schema
uses MD5 hash values because they are fast to compute and have a favorable statistical distribution that
minimizes collisions.
The events_statements_summary_by_digest
summary table has a fixed size, so when it
becomes full, statements that have SCHEMA_NAME
and DIGEST
values not matching existing values in the table are grouped in a special
row with SCHEMA_NAME
and DIGEST
set to NULL
. This permits all statements to be counted. However, if the special row
accounts for a significant percentage of the statements executed, it might be desirable to increase the size of
the summary table. To do this, set the performance_schema_digests_size
system variable to a larger value at server
startup. If no performance_schema_digests_size
value is given, the server estimates the
value to use at startup. (Before MySQL 5.6.9, there is no SCHEMA_NAME
column and
the special row has DIGEST
set to NULL
.)
The statement digest summary table provides a profile of the statements executed by the server. It shows what kinds of statements an application is executing and how often. An application developer can use this information together with other information in the table to assess the application's performance characteristics. For example, table columns that show wait times, lock times, or index use may highlight types of queries that are inefficient. This gives the developer insight into which parts of the application need attention.