Spec-Zone .ru
спецификации, руководства, описания, API
|
MySQL supports the following JOIN
syntaxes for the table_references
part of SELECT
statements and multiple-table DELETE
and UPDATE
statements:
table_references:
escaped_table_reference
[,escaped_table_reference
] ...escaped_table_reference
:table_reference
| { OJtable_reference
}table_reference
:table_factor
|join_table
table_factor
:tbl_name
[PARTITION (partition_names
)] [[AS]alias
] [index_hint_list
] |table_subquery
[AS]alias
| (table_references
)join_table
:table_reference
[INNER | CROSS] JOINtable_factor
[join_condition
] |table_reference
STRAIGHT_JOINtable_factor
|table_reference
STRAIGHT_JOINtable_factor
ONconditional_expr
|table_reference
{LEFT|RIGHT} [OUTER] JOINtable_reference
join_condition
|table_reference
NATURAL [{LEFT|RIGHT} [OUTER]] JOINtable_factor
join_condition
: ONconditional_expr
| USING (column_list
)index_hint_list
:index_hint
[,index_hint
] ...index_hint
: USE {INDEX|KEY} [FOR {JOIN|ORDER BY|GROUP BY}] ([index_list
]) | IGNORE {INDEX|KEY} [FOR {JOIN|ORDER BY|GROUP BY}] (index_list
) | FORCE {INDEX|KEY} [FOR {JOIN|ORDER BY|GROUP BY}] (index_list
)index_list
:index_name
[,index_name
] ...
A table reference is also known as a join expression.
In MySQL 5.6.2 and later, a table reference (when it refers to a partitioned table) may contain a PARTITION
option, including a comma-separated list of partitions, subpartitions, or
both. This option follows the name of the table and precedes any alias declaration. The effect of this option is
that rows are selected only from the listed partitions or subpartitions—in other words, any partitions or
subpartitions not named in the list are ignored For more information, see Section
18.5, "Partition Selection".
The syntax of table_factor
is extended in comparison with the SQL
Standard. The latter accepts only table_reference
, not a list of them
inside a pair of parentheses.
This is a conservative extension if we consider each comma in a list of table_reference
items as equivalent to an inner join. For example:
SELECT * FROM t1 LEFT JOIN (t2, t3, t4) ON (t2.a=t1.a AND t3.b=t1.b AND t4.c=t1.c)
is equivalent to:
SELECT * FROM t1 LEFT JOIN (t2 CROSS JOIN t3 CROSS JOIN t4) ON (t2.a=t1.a AND t3.b=t1.b AND t4.c=t1.c)
In MySQL, JOIN
, CROSS JOIN
, and INNER
JOIN
are syntactic equivalents (they can replace each other). In standard SQL, they are not equivalent.
INNER JOIN
is used with an ON
clause, CROSS JOIN
is used otherwise.
In general, parentheses can be ignored in join expressions containing only inner join operations. MySQL also supports nested joins (see Section 8.13.9, "Nested Join Optimization").
Index hints can be specified to affect how the MySQL optimizer makes use of indexes. For more information, see Section 13.2.9.3, "Index Hint Syntax".
The following list describes general factors to take into account when writing joins.
A table reference can be aliased using
or tbl_name
AS alias_name
tbl_name
alias_name
:
SELECT t1.name, t2.salary FROM employee AS t1 INNER JOIN info AS t2 ON t1.name = t2.name;SELECT t1.name, t2.salary FROM employee t1 INNER JOIN info t2 ON t1.name = t2.name;
A table_subquery
is also known as a
subquery in the FROM
clause. Such subqueries must
include an alias to give the subquery result a table name. A trivial example follows; see also Section 13.2.10.8, "Subqueries in the FROM
Clause".
SELECT * FROM (SELECT 1, 2, 3) AS t1;
INNER JOIN
and ,
(comma)
are semantically equivalent in the absence of a join condition: both produce a Cartesian product between
the specified tables (that is, each and every row in the first table is joined to each and every row in
the second table).
However, the precedence of the comma operator is less than of INNER
JOIN
, CROSS JOIN
, LEFT JOIN
, and so
on. If you mix comma joins with the other join types when there is a join condition, an error of the
form Unknown column '
may occur. Information about dealing with this problem is given later in this
section. col_name
' in
'on clause'
The conditional_expr
used with ON
is any conditional expression of the form that can be used in a WHERE
clause. Generally, you should use the ON
clause for conditions that specify how to join tables, and the WHERE
clause
to restrict which rows you want in the result set.
If there is no matching row for the right table in the ON
or USING
part in a LEFT
JOIN
, a row with all columns set to NULL
is used for the right
table. You can use this fact to find rows in a table that have no counterpart in another table:
SELECT left_tbl.* FROM left_tbl LEFT JOIN right_tbl ON left_tbl.id = right_tbl.id WHERE right_tbl.id IS NULL;
This example finds all rows in left_tbl
with an id
value that is not present in right_tbl
(that is, all rows in left_tbl
with no corresponding row in right_tbl
). This assumes that right_tbl.id
is declared NOT NULL
. See
Section 8.13.7, "LEFT
JOIN
and RIGHT JOIN
Optimization".
The USING(
clause names a list of columns that must
exist in both tables. If tables column_list
)a
and b
both
contain columns c1
, c2
, and c3
, the following join compares corresponding columns from the two
tables:
a LEFT JOIN b USING (c1,c2,c3)
The NATURAL [LEFT] JOIN
of two tables is defined to be
semantically equivalent to an INNER JOIN
or a LEFT
JOIN
with a USING
clause that names all columns that exist in
both tables.
RIGHT JOIN
works analogously to LEFT
JOIN
. To keep code portable across databases, it is recommended that you use LEFT JOIN
instead of RIGHT JOIN
.
The { OJ ... }
syntax shown in the join syntax description exists only for
compatibility with ODBC. The curly braces in the syntax should be written literally; they are not
metasyntax as used elsewhere in syntax descriptions.
SELECT left_tbl.* FROM { OJ left_tbl LEFT OUTER JOIN right_tbl ON left_tbl.id = right_tbl.id } WHERE right_tbl.id IS NULL;
You can use other types of joins within { OJ ... }
, such as INNER JOIN
or RIGHT OUTER JOIN
. This helps
with compatibility with some third-party applications, but is not official ODBC syntax.
STRAIGHT_JOIN
is similar to JOIN
, except that the left table is always read before the right table. This
can be used for those (few) cases for which the join optimizer puts the tables in the wrong order.
Some join examples:
SELECT * FROM table1, table2;SELECT * FROM table1 INNER JOIN table2 ON table1.id=table2.id;SELECT * FROM table1 LEFT JOIN table2 ON table1.id=table2.id;SELECT * FROM table1 LEFT JOIN table2 USING (id);SELECT * FROM table1 LEFT JOIN table2 ON table1.id=table2.id LEFT JOIN table3 ON table2.id=table3.id;
Join Processing Changes in MySQL 5.0.12
Natural joins and joins with USING
, including outer join variants, are
processed according to the SQL:2003 standard. The goal was to align the syntax and semantics of MySQL with
respect to NATURAL JOIN
and JOIN ... USING
according to SQL:2003. However, these changes in join processing can result in different output columns for
some joins. Also, some queries that appeared to work correctly in older versions (prior to 5.0.12) must be
rewritten to comply with the standard.
These changes have five main aspects:
The way that MySQL determines the result columns of NATURAL
or USING
join operations (and thus the
result of the entire FROM
clause).
Expansion of SELECT *
and SELECT
into a list of selected columns. tbl_name
.*
Resolution of column names in NATURAL
or USING
joins.
Transformation of NATURAL
or USING
joins into JOIN ... ON
.
Resolution of column names in the ON
condition of a
JOIN ... ON
.
The following list provides more detail about several effects of current join processing versus join processing in older versions. The term "previously" means "prior to MySQL 5.0.12."
The columns of a NATURAL
join or a USING
join may be different from previously. Specifically, redundant output
columns no longer appear, and the order of columns for SELECT *
expansion
may be different from before.
Consider this set of statements:
CREATE TABLE t1 (i INT, j INT);CREATE TABLE t2 (k INT, j INT);INSERT INTO t1 VALUES(1,1);INSERT INTO t2 VALUES(1,1);SELECT * FROM t1 NATURAL JOIN t2;SELECT * FROM t1 JOIN t2 USING (j);
Previously, the statements produced this output:
+------+------+------+------+| i | j | k | j |+------+------+------+------+| 1 | 1 | 1 | 1 |+------+------+------+------++------+------+------+------+| i | j | k | j |+------+------+------+------+| 1 | 1 | 1 | 1 |+------+------+------+------+
In the first SELECT
statement, column j
appears in
both tables and thus becomes a join column, so, according to standard SQL, it should appear only
once in the output, not twice. Similarly, in the second SELECT statement, column j
is named in the USING
clause and
should appear only once in the output, not twice. But in both cases, the redundant column is not
eliminated. Also, the order of the columns is not correct according to standard SQL.
Now the statements produce this output:
+------+------+------+| j | i | k |+------+------+------+| 1 | 1 | 1 |+------+------+------++------+------+------+| j | i | k |+------+------+------+| 1 | 1 | 1 |+------+------+------+
The redundant column is eliminated and the column order is correct according to standard SQL:
First, coalesced common columns of the two joined tables, in the order in which they occur in the first table
Second, columns unique to the first table, in order in which they occur in that table
Third, columns unique to the second table, in order in which they occur in that table
The single result column that replaces two common columns is defined using the coalesce operation.
That is, for two t1.a
and t2.a
the
resulting single join column a
is defined as a =
COALESCE(t1.a, t2.a)
, where:
COALESCE(x, y) = (CASE WHEN V1 IS NOT NULL THEN V1 ELSE V2 END)
If the join operation is any other join, the result columns of the join consists of the concatenation of all columns of the joined tables. This is the same as previously.
A consequence of the definition of coalesced columns is that, for outer joins, the coalesced column
contains the value of the non-NULL
column if one of the two columns is
always NULL
. If neither or both columns are NULL
, both common columns have the same value, so it doesn't matter
which one is chosen as the value of the coalesced column. A simple way to interpret this is to
consider that a coalesced column of an outer join is represented by the common column of the inner
table of a JOIN
. Suppose that the tables t1(a,b)
and t2(a,c)
have the following
contents:
t1 t2---- ----1 x 2 z2 y 3 w
Then:
mysql> SELECT * FROM t1 NATURAL LEFT JOIN
t2;
+------+------+------+| a | b | c |+------+------+------+| 1 | x | NULL || 2 | y | z |+------+------+------+
Here column a
contains the values of t1.a
.
mysql> SELECT * FROM t1 NATURAL RIGHT
JOIN t2;
+------+------+------+| a | c | b |+------+------+------+| 2 | z | y || 3 | w | NULL |+------+------+------+
Here column a
contains the values of t2.a
.
Compare these results to the otherwise equivalent queries with JOIN ...
ON
:
mysql> SELECT * FROM t1 LEFT JOIN t2 ON
(t1.a = t2.a);
+------+------+------+------+| a | b | a | c |+------+------+------+------+| 1 | x | NULL | NULL || 2 | y | 2 | z |+------+------+------+------+
mysql> SELECT * FROM t1 RIGHT JOIN t2 ON
(t1.a = t2.a);
+------+------+------+------+| a | b | a | c |+------+------+------+------+| 2 | y | 2 | z || NULL | NULL | 3 | w |+------+------+------+------+
Previously, a USING
clause could be rewritten as an
ON
clause that compares corresponding columns. For example, the following
two clauses were semantically identical:
a LEFT JOIN b USING (c1,c2,c3)a LEFT JOIN b ON a.c1=b.c1 AND a.c2=b.c2 AND a.c3=b.c3
Now the two clauses no longer are quite the same:
With respect to determining which rows satisfy the join condition, both joins remain semantically identical.
With respect to determining which columns to display for SELECT *
expansion, the two joins are not semantically identical.
The USING
join selects the coalesced value of corresponding
columns, whereas the ON
join selects all columns from all
tables. For the preceding USING
join, SELECT
*
selects these values:
COALESCE(a.c1,b.c1), COALESCE(a.c2,b.c2), COALESCE(a.c3,b.c3)
For the ON
join, SELECT *
selects these values:
a.c1, a.c2, a.c3, b.c1, b.c2, b.c3
With an inner join, COALESCE(a.c1,b.c1)
is the same as either a.c1
or b.c1
because both
columns will have the same value. With an outer join (such as LEFT
JOIN
), one of the two columns can be NULL
. That
column will be omitted from the result.
The evaluation of multi-way natural joins differs in a very important way that
affects the result of NATURAL
or USING
joins
and that can require query rewriting. Suppose that you have three tables t1(a,b)
, t2(c,b)
, and t3(a,c)
that each have one row: t1(1,2)
, t2(10,2)
, and
t3(7,10)
. Suppose also that you have this NATURAL
JOIN
on the three tables:
SELECT ... FROM t1 NATURAL JOIN t2 NATURAL JOIN t3;
Previously, the left operand of the second join was considered to be t2
,
whereas it should be the nested join (t1 NATURAL JOIN t2)
. As a result,
the columns of t3
are checked for common columns only in t2
, and, if t3
has common columns with
t1
, these columns are not used as equi-join columns. Thus, previously,
the preceding query was transformed to the following equi-join:
SELECT ... FROM t1, t2, t3 WHERE t1.b = t2.b AND t2.c = t3.c;
That join is missing one more equi-join predicate (t1.a = t3.a)
. As a
result, it produces one row, not the empty result that it should. The correct equivalent query is
this:
SELECT ... FROM t1, t2, t3 WHERE t1.b = t2.b AND t2.c = t3.c AND t1.a = t3.a;
If you require the same query result in current versions of MySQL as in older versions, rewrite the natural join as the first equi-join.
Previously, the comma operator (,
) and JOIN
both had the same precedence, so the join expression t1, t2 JOIN t3
was interpreted as ((t1, t2) JOIN
t3)
. Now JOIN
has higher precedence, so the expression is
interpreted as (t1, (t2 JOIN t3))
. This change affects statements that use
an ON
clause, because that clause can refer only to columns in the operands
of the join, and the change in precedence changes interpretation of what those operands are.
Example:
CREATE TABLE t1 (i1 INT, j1 INT);CREATE TABLE t2 (i2 INT, j2 INT);CREATE TABLE t3 (i3 INT, j3 INT);INSERT INTO t1 VALUES(1,1);INSERT INTO t2 VALUES(1,1);INSERT INTO t3 VALUES(1,1);SELECT * FROM t1, t2 JOIN t3 ON (t1.i1 = t3.i3);
Previously, the SELECT
was legal due to the implicit grouping of t1,t2
as (t1,t2)
. Now the JOIN
takes precedence, so the operands for the ON
clause are t2
and t3
.
Because t1.i1
is not a column in either of the operands, the result is
an Unknown column 't1.i1' in 'on clause'
error. To allow the join to be
processed, group the first two tables explicitly with parentheses so that the operands for the ON
clause are (t1,t2)
and t3
:
SELECT * FROM (t1, t2) JOIN t3 ON (t1.i1 = t3.i3);
Alternatively, avoid the use of the comma operator and use JOIN
instead:
SELECT * FROM t1 JOIN t2 JOIN t3 ON (t1.i1 = t3.i3);
This change also applies to statements that mix the comma operator with INNER
JOIN
, CROSS JOIN
, LEFT JOIN
, and
RIGHT JOIN
, all of which now have higher precedence than the comma
operator.
Previously, the ON
clause could refer to columns in
tables named to its right. Now an ON
clause can refer only to its operands.
Example:
CREATE TABLE t1 (i1 INT);CREATE TABLE t2 (i2 INT);CREATE TABLE t3 (i3 INT);SELECT * FROM t1 JOIN t2 ON (i1 = i3) JOIN t3;
Previously, the SELECT
statement was legal. Now the statement fails with an Unknown column 'i3' in 'on clause'
error because i3
is a column in t3
, which is not an
operand of the ON
clause. The statement should be rewritten as follows:
SELECT * FROM t1 JOIN t2 JOIN t3 ON (i1 = i3);
Resolution of column names in NATURAL
or USING
joins is different than previously. For column names that are outside
the FROM
clause, MySQL now handles a superset of the queries compared to
previously. That is, in cases when MySQL formerly issued an error that some column is ambiguous, the
query now is handled correctly. This is due to the fact that MySQL now treats the common columns of
NATURAL
or USING
joins as a single column, so
when a query refers to such columns, the query compiler does not consider them as ambiguous.
Example:
SELECT * FROM t1 NATURAL JOIN t2 WHERE b > 1;
Previously, this query would produce an error ERROR 1052 (23000): Column 'b'
in where clause is ambiguous
. Now the query produces the correct result:
+------+------+------+| b | c | y |+------+------+------+| 4 | 2 | 3 |+------+------+------+
One extension of MySQL compared to the SQL:2003 standard is that MySQL enables you to qualify the
common (coalesced) columns of NATURAL
or USING
joins (just as previously), while the standard disallows that.