MySQL can perform boolean full-text searches using the IN BOOLEAN MODE modifier. With this modifier, certain characters have special meaning at the beginning or end of words in the search string. In the following query, the + and - operators indicate that a word must be present or absent, respectively, for a match to occur. Thus, the query retrieves all the rows that contain the word "MySQL" but that do not contain the word "YourSQL":
mysql> SELECT * FROM articles WHERE MATCH (title,body)
    -> AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE);
+----+-----------------------+-------------------------------------+
| id | title                 | body                                |
+----+-----------------------+-------------------------------------+
|  1 | MySQL Tutorial        | DBMS stands for DataBase ...        |
|  2 | How To Use MySQL Well | After you went through a ...        |
|  3 | Optimizing MySQL      | In this tutorial we will show ...   |
|  4 | 1001 MySQL Tricks     | 1. Never run mysqld as root. 2. ... |
|  6 | MySQL Security        | When configured properly, MySQL ... |
+----+-----------------------+-------------------------------------+
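To reproduce the result above, a table along the following lines could be used. This is a minimal sketch under stated assumptions: the column types, the storage engine, and the contents of row 5 (which is missing from the output and is therefore assumed to contain "YourSQL") are not given in this section.

```sql
-- Assumed schema: a FULLTEXT index covering both columns named in MATCH().
CREATE TABLE articles (
  id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
  title VARCHAR(200),
  body TEXT,
  FULLTEXT (title,body)
) ENGINE=InnoDB;

-- Rows 1-4 and 6 mirror the output shown above; row 5 is an assumption.
INSERT INTO articles (title,body) VALUES
  ('MySQL Tutorial','DBMS stands for DataBase ...'),
  ('How To Use MySQL Well','After you went through a ...'),
  ('Optimizing MySQL','In this tutorial we will show ...'),
  ('1001 MySQL Tricks','1. Never run mysqld as root. 2. ...'),
  ('MySQL vs. YourSQL','In the following database comparison ...'),
  ('MySQL Security','When configured properly, MySQL ...');
```

With InnoDB, the FULLTEXT index must cover exactly the columns listed in MATCH(), which is why the index is declared on (title,body) together.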
In implementing this feature, MySQL uses what is sometimes referred to as implied Boolean logic, in which
+ stands for AND
- stands for NOT
[no operator] implies OR
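The three rules can be tried side by side against the articles table from the first example. The search words below are chosen only for illustration:

```sql
-- No operator: a row qualifies if it contains either word (implied OR).
SELECT id, title FROM articles
WHERE MATCH (title,body) AGAINST ('tutorial security' IN BOOLEAN MODE);

-- Leading + on both words: each word must be present (AND).
SELECT id, title FROM articles
WHERE MATCH (title,body) AGAINST ('+MySQL +tutorial' IN BOOLEAN MODE);

-- Leading - on the second word: rows containing it are dropped (NOT).
SELECT id, title FROM articles
WHERE MATCH (title,body) AGAINST ('+MySQL -tutorial' IN BOOLEAN MODE);
```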
Boolean full-text searches have these characteristics:
They do not use the 50% threshold that applies to MyISAM search indexes.
They do not automatically sort rows in order of decreasing relevance.
Boolean queries against a MyISAM search index can work even without a FULLTEXT index, although a search executed in this fashion would be quite slow. InnoDB tables require a FULLTEXT index on all columns of the MATCH() expression to perform boolean queries.
The minimum and maximum word length full-text parameters apply: innodb_ft_min_token_size and innodb_ft_max_token_size for InnoDB search indexes, and ft_min_word_len and ft_max_word_len for MyISAM ones.
The stopword list applies, controlled by innodb_ft_enable_stopword, innodb_ft_server_stopword_table, and innodb_ft_user_stopword_table for InnoDB search indexes, and ft_stopword_file for MyISAM ones.
InnoDB full-text search does not support the use of multiple operators on a single search word, as in this example: '++apple'. MyISAM full-text search processes the same search successfully, ignoring all operators except the one immediately adjacent to the search word.
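Two of the characteristics above lend themselves to a quick check. The sketch below assumes the articles table from the first example; the search string and the score alias are illustrative only.

```sql
-- Boolean mode does not sort by relevance on its own, so request the
-- relevance value explicitly and order by it.
SELECT id, title,
       MATCH (title,body) AGAINST ('+MySQL tutorial' IN BOOLEAN MODE) AS score
FROM articles
WHERE MATCH (title,body) AGAINST ('+MySQL tutorial' IN BOOLEAN MODE)
ORDER BY score DESC;

-- Inspect the word-length and stopword settings named in the list above.
SHOW VARIABLES LIKE 'innodb_ft%';
SHOW VARIABLES LIKE 'ft_%';
```

Repeating the same MATCH() ... AGAINST() expression in the select list and the WHERE clause is the usual way to expose the relevance value alongside the filter.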
The boolean full-text search capability supports the following operators:
+
A leading plus sign indicates that this word must be present in each row that is returned.
-
A leading minus sign indicates that this word must not be present in any of the rows that are returned.
Note: The - operator acts only to exclude rows that are otherwise matched by other search terms. Thus, a boolean-mode search that contains only terms preceded by - returns an empty result. It does not return "all rows except those containing any of the excluded terms."
(no operator)
By default (when neither + nor - is specified), the word is optional, but the rows that contain it are rated higher. This mimics the behavior of MATCH() ... AGAINST() without the IN BOOLEAN MODE modifier.
@distance
This operator works on InnoDB tables only. It tests whether two or more words all start within a specified distance from each other, measured in words. Specify the search words within a double-quoted string immediately before the @distance operator, for example:
MATCH(col1) AGAINST('"word1 word2 word3" @8' IN BOOLEAN MODE)
> <
These two operators are used to change a word's contribution to the relevance value that is assigned to a row. The > operator increases the contribution and the < operator decreases it. See the example following this list.
( )
Parentheses group words into subexpressions. Parenthesized groups can be nested.
~
A leading tilde acts as a negation operator, causing the word's contribution to the row's relevance to be negative. This is useful for marking "noise" words. A row containing such a word is rated lower than others, but is not excluded altogether, as it would be with the - operator.
*
The asterisk serves as the truncation (or wildcard) operator. Unlike the other operators, it is appended to the word to be affected. Words match if they begin with the word preceding the * operator.
If a word is specified with the truncation operator, it is not stripped from a boolean query, even if it is too short or a stopword. Whether a word is too short is determined from the innodb_ft_min_token_size setting for InnoDB tables, or ft_min_word_len for MyISAM tables.
The wildcarded word is considered as a prefix that must be present at the start of one or more words. If the minimum word length is 4, a search for '+word +the*' could return fewer rows than a search for '+word +the', because the second query ignores the too-short search term the.
"
A phrase that is enclosed within double quote (") characters matches only rows that contain the phrase literally, as it was typed. The full-text engine splits the phrase into words and performs a search in the FULLTEXT index for the words. Nonword characters need not be matched exactly: Phrase searching requires only that matches contain exactly the same words as the phrase and in the same order. For example, "test phrase" matches "test, phrase".
If the phrase contains no words that are in the index, the result is empty. The words might not be in the index because of a combination of factors: if they do not exist in the text, are stopwords, or are shorter than the minimum length of indexed words.
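As a rough illustration of the proximity, truncation, and phrase operators described above, the following queries run against the articles table from the first example. The search words and the distance of 8 are arbitrary choices, and the first query assumes an InnoDB table.

```sql
-- Proximity (InnoDB only): both words must start within 8 words of each other.
SELECT id, title FROM articles
WHERE MATCH (title,body) AGAINST ('"MySQL tutorial" @8' IN BOOLEAN MODE);

-- Truncation: matches any indexed word that begins with "optimi".
SELECT id, title FROM articles
WHERE MATCH (title,body) AGAINST ('optimi*' IN BOOLEAN MODE);

-- Phrase: the words must appear together and in this order.
SELECT id, title FROM articles
WHERE MATCH (title,body) AGAINST ('"MySQL Tutorial"' IN BOOLEAN MODE);
```

In each case the double quotes belong to the search expression, while the single quotes delimit the SQL string literal.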
The following examples demonstrate some search strings that use boolean full-text operators:
'apple banana'
Find rows that contain at least one of the two words.
'+apple +juice'
Find rows that contain both words.
'+apple macintosh'
Find rows that contain the word "apple", but rank rows higher if they also contain "macintosh".
'+apple -macintosh'
Find rows that contain the word "apple" but not "macintosh".
'+apple ~macintosh'
Find rows that contain the word "apple", but if the row also contains the word "macintosh", rate it lower than if the row does not. This is "softer" than a search for '+apple -macintosh', for which the presence of "macintosh" causes the row not to be returned at all.
'+apple +(>turnover <strudel)'
Find rows that contain the words "apple" and "turnover", or "apple" and "strudel" (in any order), but rank "apple turnover" higher than "apple strudel".
'apple*'
Find rows that contain words such as "apple", "apples", "applesauce", or "applet".
'"some words"'
Find rows that contain the exact phrase "some words" (for example, rows that contain "some words of wisdom" but not "some noise words"). Note that the " characters that enclose the phrase are operator characters that delimit the phrase. They are not the quotation marks that enclose the search string itself.
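The relevance-shaping operators are easiest to judge when the score is selected next to each row. The sketch below adapts the ~ and > < examples to the articles table from the first example. The substituted words and the score alias are assumptions made for illustration, and the actual scores depend on the storage engine and the data.

```sql
-- "Soft" exclusion: rows containing "YourSQL" are still returned,
-- but with a lower relevance value than rows without it.
SELECT id, title,
       MATCH (title,body) AGAINST ('+MySQL ~YourSQL' IN BOOLEAN MODE) AS score
FROM articles
WHERE MATCH (title,body) AGAINST ('+MySQL ~YourSQL' IN BOOLEAN MODE)
ORDER BY score DESC;

-- Weighted alternatives: "MySQL" is required along with either word in
-- the group, with "tutorial" contributing more than "security".
SELECT id, title,
       MATCH (title,body)
         AGAINST ('+MySQL +(>tutorial <security)' IN BOOLEAN MODE) AS score
FROM articles
WHERE MATCH (title,body)
        AGAINST ('+MySQL +(>tutorial <security)' IN BOOLEAN MODE)
ORDER BY score DESC;
```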