This topic describes how to query data by using full-text indexes.

Prerequisites

A table that has full-text indexes is created. In this example, the fulltext_test table has two full-text indexes: t_idx and b_idx.
CREATE TABLE fulltext_test (
    id int,
    title varchar,
    body varchar,
    FULLTEXT INDEX t_idx(title),
    FULLTEXT INDEX b_idx(body),
    PRIMARY KEY (id)
)
DISTRIBUTE BY HASH(id);

Basic query

SELECT *
FROM fulltext_test
WHERE MATCH (title) AGAINST ('Hangzhou');

Parameters:

  • MATCH (COLUMN_NAME[, …]): the name of a column on which a full-text index is created. You can specify one or more column names. For example, MATCH (body) indicates that only the body column is queried. MATCH (title, body) indicates that the title and body columns are queried.
  • AGAINST ('WORDS'): the keywords to be queried. For example, AGAINST ('Hangzhou City in Zhejiang Province') indicates that Hangzhou in Zhejiang province is to be queried.

Multi-column query

SELECT *
FROM fulltext_test
WHERE MATCH (title, body) AGAINST ('Hangzhou');
You can query data of multiple columns when each column to be queried has a full-text index. You do not need to create multi-column indexes. The following statement is equivalent to the preceding one:
SELECT *
FROM fulltext_test
WHERE MATCH (title) AGAINST ('Hangzhou')
    OR MATCH (body) AGAINST ('Hangzhou');

Boolean query (by using logical operators)

The following logical operators are supported when you query data. Logical operators are case-insensitive.
  • AND: indicates that keywords on both sides of the operator must exist.
  • OR: indicates that one of the keywords on both sides of the operator must exist.
  • NOT: indicates that the keyword on the right side of the operator cannot exist.
The following statement can be used to query data:
SELECT *
FROM fulltext_test
WHERE MATCH (title) AGAINST ('Hangzhou AND Zhejiang');

SELECT *
FROM fulltext_test
WHERE MATCH (title) AGAINST ('Hangzhou OR Zhejiang');
The following statement shows how to create a complex boolean query by using parentheses:
SELECT *
FROM fulltext_test
WHERE MATCH (title) AGAINST ('(Hangzhou OR Zhejiang) and (Education or Healthcare)');

Result set filtering

Full-text indexes return all results that approximate queried keywords. In some scenarios where a large amount of data is queried, the result set that hits keywords may also be large in size. However, only results that are highly approximate need to be retrieved. This can be implemented by using the result set filtering feature provided by AnalyticDB for MySQL.

In the following example, WHERE MATCH () AGAINST () > 0.9 indicates that only the top 10% of results that are highly approximate are retrieved. The remaining 90% of results are filtered out.
SELECT *
FROM fulltext_test
WHERE MATCH (title) AGAINST ('Hangzhou City in Zhejiang Province') > 0.9;

Approximate query or sorting by approximate

AnalyticDB for MySQL allows you to obtain approximates of results in a result set and sort the results by approximate.

The following statement can be used to obtain approximates:

SELECT *, MATCH (title) AGAINST ('Hangzhou City in Zhejiang Province') AS score
FROM fulltext_test
WHERE MATCH (title) AGAINST ('Hangzhou City in Zhejiang Province') > 0.9;

By default, results in a returned result set are not sorted by approximate. If you want to sort results by approximate in descending order, you must add the ORDER BY DESC clause.

The following statement can be used to sort results in a result set by approximate in descending order:

SELECT *, MATCH (title) AGAINST ('Hangzhou City in Zhejiang Province') AS score
FROM fulltext_test
WHERE MATCH (title) AGAINST ('Hangzhou City in Zhejiang Province') > 0.9
ORDER BY score DESC;
MATCH (title) AGAINST ('Hangzhou City in Zhejiang Province') in the preceding statement does not need to be the same as those in statements of subsequent queries. For example, you can execute the following statement to query approximates of results obtained from the MATCH (body) AGAINST ('China') statement:
SELECT *, MATCH (body) AGAINST ('China') AS score
FROM fulltext_test
WHERE MATCH (title) AGAINST ('Hangzhou City in Zhejiang Province') > 0.9
ORDER BY score DESC;

Phrase query and exact match

By default, the full-text index feature of AnalyticDB for MySQL V3.0 divides phrases into words and separately queries the words. For example, the phrase People's Republic of China is divided into the following words: People's, Republic, and China. Then, the words are separately queried. Exact match may be required in some special scenarios. For example, only results that completely match the phrase People's Republic of China need to be returned, and results that match the words are not needed. In this case, you can use the syntax for phrase query.

Notice
  • Basic query: Data is queried after phrases are divided into words. Keywords are not enclosed by double quotation marks (").
  • Phrase query: Exact match is supported. Phrases are not divided into words. Keywords are enclosed by double quotation marks (").
Basic query statement:
SELECT *, MATCH (title) AGAINST ('People's Republic of China') AS score
FROM fulltext_test
WHERE MATCH (title) AGAINST ('People's Republic of China') > 0.9
ORDER BY score DESC;

Phrase query statement:

SELECT *, MATCH (title) AGAINST ('"People's Republic of China"') AS score
FROM fulltext_test
WHERE MATCH (title) AGAINST ('"People's Republic of China"') > 0.9
ORDER BY score DESC;

Highlight support

Use a function to highlight data

The full-text index feature of AnalyticDB for MySQL V3.0 allows you to use the fulltext_highlight(column) function to highlight query results. Query statement:

SELECT MATCH (title) AGAINST ('Hangzhou City in Zhejiang Province') AS score, fulltext_highlight(title)
FROM fulltext_test
WHERE MATCH (title) AGAINST ('Hangzhou City in Zhejiang Province') > 0.9
ORDER BY score DESC
LIMIT 5;

Result:

score              | fulltext_highlight(title)                                                              |
+--------------------+------------------------------------------------------------------------------------+
| 15.57516860961914  | Office of the Central Cyberspace Affairs Commission in <em>Hangzhou City</em> of <em>Zhejiang Province</em> Summoned Person in Charge of the WeChat Official Account Er Geng Diner for a Face-to-face Meeting                               |
| 11.763599395751953 | Huayu Fangyuan Project Officially Signed in Lanxi <em>City</em> of <em>Zhejiang Province</em>                                          |
| 11.269683837890625 | Do You Know the Top Five Universities in <em>Zhejiang Province</em>?                                                 |
| 11.153276443481445 | Only Zhejiang University in <em>Zhejiang Province</em> was selected for the national "Double First-Class Initiative". Can we say that university education in <em>Zhejiang Province</em> is poor?                       |
| 10.928305625915527 | Most Suitable Places for Couples to Go in <em>Hangzhou City</em>                                                                                                            |

Use hints to customize highlighting behavior

Query statement:

/*fulltext_highlight_pre_tag=<span>,fulltext_highlight_post_tag=</span>*/
SELECT MATCH (title) AGAINST ('Hangzhou City in Zhejiang Province') AS score, fulltext_highlight(title)
FROM fulltext_test
WHERE MATCH (title) AGAINST ('Hangzhou City in Zhejiang Province') > 0.9
ORDER BY score DESC
LIMIT 5;

If the tag that you create contains equal signs (=) or commas (,), use brackets [] to enclose the tag value, as shown in the following example:

/*fulltext_highlight_pre_tag=[<h3 style="color:blue">],fulltext_highlight_post_tag=</h3>*/
SELECT MATCH (title) AGAINST ('Hangzhou City in Zhejiang Province') AS score, fulltext_highlight(title)
FROM fulltext_test
WHERE MATCH (title) AGAINST ('Hangzhou City in Zhejiang Province') > 0.9
ORDER BY score DESC
LIMIT 5;

Result:

+--------------------+--------------------------------------------------------------------------------------------------------------------------+
| score              | fulltext_highlight(title)                                                                                                |
+--------------------+--------------------------------------------------------------------------------------------------------------------------+
| 15.867133140563965 | <h3 style="color:blue">Office of the Central Cyberspace Affairs Commission in <h3 style="color:blue">Hangzhou City</h3> of <h3 style="color:blue">Zhejiang Province</h3> Summoned Person in Charge of the WeChat Official Account Er Geng Diner for a Face-to-face Meeting               |
| 12.205625534057617 | Huayu Fangyuan Project Officially Signed in Lanxi <h3 style="color:blue">City</h3> of <h3 style="color:blue">Zhejiang Province</h3>                                 |
| 11.646674156188965 | Only Zhejiang University in <h3 style="color:blue">Zhejiang Province</h3> was selected for the national "Double First-Class Initiative". Can we say that university education in <h3 style="color:blue">Zhejiang Province</h3> is poor? |
| 11.338353157043457 | Do You Know the Top Five Universities in <h3 style="color:blue">Zhejiang Province</h3>?                                                       |
| 11.2699556350708   | Why the Economic Center of <h3 style="color:blue">Zhejiang Province</h3> is Located in Ningbo When GDP of <h3 style="color:blue">Hangzhou</h3> Reached CNY 1 Trillion and GDP of Ningbo is CNY 850 Billion? |
+--------------------+--------------------------------------------------------------------------------------------------------------------------+

Highlight data when a column has multiple full-text indexes

If a column has multiple full-text indexes, you can use the following statement to query data:
SELECT MATCH (title) AGAINST ('Hangzhou City in Zhejiang Province') AS score, fulltext_highlight(title)
FROM fulltext_test
WHERE MATCH (title) AGAINST ('Hangzhou City in Zhejiang Province') > 0.9
    AND MATCH (title) AGAINST ('China') > 0.9
ORDER BY score DESC
LIMIT 5;
Both Hangzhou City in Zhejiang Province and China are highlighted. The following results are returned:
+--------------------+----------------------------------------------+
| score              | fulltext_highlight(title)                    |
+--------------------+----------------------------------------------+
| 4.041414260864258  | What Are the Best of <em>China</em> in <em>Hangzhou</em>?        |
| 3.8747754096984863 | Tourism Districts in <em>Hangzhou</em> of <em>China</em>             |
| 3.7213351726531982 | Travel Around <em>China</em>, <em>Hangzhou</em>          |
| 3.7213351726531982 | Food Map of <em>China</em>, <em>Hangzhou</em> |
| 3.641066789627075  | Which <em>Province</em> of <em>China</em> Is Xiangxi in?        |
+--------------------+----------------------------------------------+

Unsupported syntax

The MATCH () AGAINST () filter condition for full-text indexes cannot be placed outside subqueries. Example:
SELECT *
FROM (
    SELECT id, substr(text, 4) AS text
    FROM tbl
) A
WHERE MATCH (A.text) AGAINST ('Hangzhou City in Zhejiang Province');