AnalyticDB for MySQL supports full-text search through three SQL functions: MATCH() AGAINST(), MATCH() FUZZY(), and MATCH() PHRASE(). Use these functions to search text columns, rank results by relevance, and highlight matched keywords.
With full-text search, you can:
Search across one or more text columns without creating multi-column indexes
Rank results by relevance score and filter out low-relevance matches
Handle spelling mistakes with fuzzy matching based on edit distance
Match multi-word phrases with positional tolerance using slop
Highlight matched keywords in query results using custom HTML tags
Prerequisites
Before you begin, ensure that you have:
Full-text indexes created on the columns you want to search. For instructions, see Create a full-text index
Sample data
The examples in this topic use a table named tbl_fulltext_demo. Each column has a full-text index using a different analyzer.
Run the following statements to create the table and insert sample data:
/* RC_DDL_ENGINE_REWRITE_XUANWUV2=false */
CREATE TABLE `tbl_fulltext_demo` (
`id` INT,
`content` VARCHAR,
`content_alinlp` VARCHAR,
`content_ik` VARCHAR,
`content_standard` VARCHAR,
`content_ngram` VARCHAR,
FULLTEXT INDEX fidx_c(`content`), -- Default analyzer
FULLTEXT INDEX fidx_alinlp(`content_alinlp`) WITH ANALYZER alinlp, -- AliNLP analyzer
FULLTEXT INDEX fidx_ik(`content_ik`) WITH ANALYZER ik, -- IK analyzer
FULLTEXT INDEX fidx_standard(`content_standard`) WITH ANALYZER standard, -- Standard analyzer
FULLTEXT INDEX fidx_ngram(`content_ngram`) WITH ANALYZER ngram, -- Ngram analyzer
PRIMARY KEY (`id`)
) DISTRIBUTED BY HASH(id);
INSERT INTO tbl_fulltext_demo(id, content, content_alinlp, content_ik, content_standard, content_ngram)
VALUES
(1, 'Customers Need Better Products and Services', 'Customers Need Better Products and Services', 'Customers Need Better Products and Services', 'Customers Need Better Products and Services', 'Customers Need Better Products and Services'),
(2, 'Wuhan Changjiang Bridge', 'Wuhan Changjiang Bridge', 'Wuhan Changjiang Bridge', 'Wuhan Changjiang Bridge', 'Wuhan Changjiang Bridge'),
(3, 'Hangzhou, Zhejiang Province', 'Hangzhou, Zhejiang Province', 'Hangzhou, Zhejiang Province', 'Hangzhou, Zhejiang Province', 'Hangzhou, Zhejiang Province'),
(4, 'User Values and Commercial Values of Products', 'User Values and Commercial Values of Products', 'User Values and Commercial Values of Products', 'User Values and Commercial Values of Products', 'User Values and Commercial Values of Products');Usage notes
Full-text index functions support the following special characters, but you must escape them with
\\:+-&|!(){}[]^"~*?:\/For example, to search forSpring/Scenery:Incorrect:
MATCH(content) AGAINST('Spring / Scenery')Correct:
MATCH(content) AGAINST('Spring \\/ Scenery')
Full-text index functions do not support operators such as
=,!=,BETWEEN,IS NULL,IS NOT NULL, andLIKE.
Choose a function
| Function | Use when |
|---|---|
MATCH() AGAINST() | Searching for exact or word-matched keywords, with optional Boolean logic |
MATCH() FUZZY() | Keywords may be misspelled or approximate |
MATCH() PHRASE() | Multiple keywords must appear together, with positional tolerance |
MATCH() AGAINST()
MATCH() AGAINST() searches one or more columns for rows that match a keyword. It supports word match (default) and exact match (with quoted keywords), as well as Boolean queries using logical operators.
Syntax
SELECT * FROM `table_name` WHERE MATCH (column_name[, ...]) AGAINST('term')Parameters
| Parameter | Description |
|---|---|
table_name | The table to search |
column_name | The column to search. Separate multiple columns with commas |
term | The search keyword. Supports AND, OR, and NOT logical operators (case-insensitive) |
Logical operators in `term`:
| Operator | Behavior |
|---|---|
AND | Returns rows that match all keywords |
OR | Returns rows that match any keyword |
NOT | Returns rows that match the keyword to the left but not the keyword to the right |
Relevance scores
MATCH() AGAINST() assigns each result a relevance score. Higher scores indicate stronger matches.
Use MATCH() AGAINST() > threshold in the WHERE clause to filter results by score. For example, > 0.9 returns only the top 10% of results by relevance.
Results are not sorted by relevance score by default. Use ORDER BY score DESC to sort explicitly.Example 1: Single-column query
SELECT id, content
FROM `tbl_fulltext_demo`
WHERE MATCH(`content`) AGAINST('Products Services');Result:
+------+-----------------------------------------------+
| id | content |
+------+-----------------------------------------------+
| 4 | User Values and Commercial Values of Products |
| 1 | Customers Need Better Products and Services |
+------+-----------------------------------------------+Example 2: Multi-column query
To search multiple columns, each column must have its own full-text index. No multi-column index is required.
SELECT id, content, content_alinlp
FROM `tbl_fulltext_demo`
WHERE MATCH(content, content_alinlp) AGAINST('Services');The following query is equivalent:
SELECT id, content, content_alinlp
FROM `tbl_fulltext_demo`
WHERE MATCH(content) AGAINST('Services')
OR MATCH(content_alinlp) AGAINST('Services');Result:
+------+---------------------------------------------+---------------------------------------------+
| id | content | content_alinlp |
+------+---------------------------------------------+---------------------------------------------+
| 1 | Customers Need Better Products and Services | Customers Need Better Products and Services |
+------+---------------------------------------------+---------------------------------------------+Example 3: Boolean query
Use logical operators to construct Boolean queries.
AND — matches all keywords:
SELECT * FROM `tbl_fulltext_demo`
WHERE MATCH(content) AGAINST('Products AND Services');OR — matches any keyword:
SELECT * FROM `tbl_fulltext_demo`
WHERE MATCH(content) AGAINST('Products OR Services');NOT — matches the left keyword but excludes the right:
SELECT * FROM `tbl_fulltext_demo`
WHERE MATCH(content) AGAINST('Products NOT Services');NOT with wildcard — excludes a keyword from all rows:
SELECT * FROM `tbl_fulltext_demo`
WHERE MATCH(content) AGAINST('*:* NOT Services');*:* matches all rows. It can only be used to the left of NOT and must appear at the beginning of the term value.
Parentheses — construct complex Boolean queries:
SELECT id, content_alinlp
FROM `tbl_fulltext_demo`
WHERE MATCH(content_alinlp) AGAINST('(Wuhan OR Hangzhou) AND (Bridge OR Xihu)');Result:
+------+---------------------+
| id | content_alinlp |
+------+---------------------+
| 2 | Wuhan Changjiang Bridge |
+------+---------------------+Example 4: Filter by relevance score
To return only the most relevant results, add a score threshold to the WHERE clause. The following query filters out the bottom 90% of results by relevance and returns the top 10%:
SELECT id, content
FROM `tbl_fulltext_demo`
WHERE MATCH(content) AGAINST('Products Services') > 0.9;Result:
+------+-----------------------------------------------+
| id | content |
+------+-----------------------------------------------+
| 4 | User Values and Commercial Values of Products |
| 1 | Customers Need Better Products and Services |
+------+-----------------------------------------------+Example 5: Sort results by relevance score
To retrieve and sort by relevance scores, include MATCH() AGAINST() in the SELECT projection and use ORDER BY.
SELECT id, content, MATCH(content) AGAINST('Products Services') AS score
FROM `tbl_fulltext_demo`
WHERE MATCH(content) AGAINST('Products Services') > 0.9
ORDER BY score DESC;Result:
+------+-----------------------------------------------+--------------------+
| id | content | score |
+------+-----------------------------------------------+--------------------+
| 1 | Customers Need Better Products and Services | 0.2615291476249695 |
| 4 | User Values and Commercial Values of Products | 0.13076457381248474 |
+------+-----------------------------------------------+--------------------+TheMATCH() AGAINST()expression inSELECTand the one inWHEREcan use different keywords. In the following example, theWHEREclause filters by'Products Services', while theSELECTprojection scores by'Wuhan':
SELECT *, MATCH(content) AGAINST('Wuhan') AS score
FROM `tbl_fulltext_demo`
WHERE MATCH(content) AGAINST('Products Services') > 0.9
ORDER BY score DESC;Example 6: Word match
By default, the analyzer segments the keyword into individual words before searching. Rows matching any of the words are returned.
The AliNLP analyzer segments 'Products Services' into Products and Services. The following query returns rows that contain either word:
SELECT id, content_alinlp, MATCH(content_alinlp) AGAINST('Products Services') AS score
FROM `tbl_fulltext_demo`
WHERE MATCH(content_alinlp) AGAINST('Products Services') > 0.9
ORDER BY score DESC;Result:
+------+-----------------------------------------------+--------------------+
| id | content_alinlp | score |
+------+-----------------------------------------------+--------------------+
| 1 | Customers Need Better Products and Services | 0.2615291476249695 |
| 4 | User Values and Commercial Values of Products | 0.13076457381248474 |
+------+-----------------------------------------------+--------------------+Example 7: Exact match
Enclose the keyword in double quotation marks (") to disable word segmentation. Only rows that exactly match the full keyword phrase are returned.
Search for the exact phrase "product service" (no match expected):
SELECT id, content_alinlp, MATCH(content_alinlp) AGAINST('"product service"') AS score
FROM `tbl_fulltext_demo`
WHERE MATCH(content_alinlp) AGAINST('"product service"') > 0.9
ORDER BY score DESC;Result: Empty set
Search for the exact phrase "Products and Services":
SELECT id, content_alinlp, MATCH(content_alinlp) AGAINST('"Products and Services"') AS score
FROM `tbl_fulltext_demo`
WHERE MATCH(content_alinlp) AGAINST('"Products and Services"') > 0.9
ORDER BY score DESC;Result:
+------+-----------------------------------------------+--------------------+
| id | content_alinlp | score |
+------+-----------------------------------------------+--------------------+
| 1 | Customers Need Better Products and Services | 0.3922937512397766 |
+------+-----------------------------------------------+--------------------+MATCH() FUZZY()
MATCH() FUZZY() searches for rows where the indexed text is within a specified Levenshtein edit distance of the keyword. This is useful when the search keyword may be misspelled.
Syntax
SELECT * FROM `table_name` WHERE MATCH (`column_name`) FUZZY('term') [max_edits(n)]Parameters
| Parameter | Description |
|---|---|
table_name | The table to search |
column_name | The column to search |
term | The search keyword |
max_edits(n) | (Optional) The maximum edit distance. An edit is a single insertion, deletion, or substitution. Default: 2. Valid values: 0, 1, 2. For example, changing windos to windows requires one insertion, so the edit distance is 1. |
Example 1: Fuzzy search with default edit distance
Search for hangzou (misspelled), which matches Hangzhou within 2 edits:
SELECT id, content
FROM tbl_fulltext_demo
WHERE MATCH(content_standard) FUZZY('hangzou');Result:
+------+-----------------------------+
| id | content |
+------+-----------------------------+
| 3 | Hangzhou, Zhejiang Province |
+------+-----------------------------+Example 2: Fuzzy search with max_edits(1)
Limit matches to within 1 edit:
SELECT id, content
FROM tbl_fulltext_demo
WHERE MATCH(content_standard) FUZZY('hangzou') max_edits(1);Result:
+------+-----------------------------+
| id | content |
+------+-----------------------------+
| 3 | Hangzhou, Zhejiang Province |
+------+-----------------------------+Example 3: Fuzzy search with max_edits(2)
SELECT id, content
FROM tbl_fulltext_demo
WHERE MATCH(content_standard) FUZZY('Wuhan Chang') max_edits(2);Result:
+------+-----------------------+
| id | content |
+------+-----------------------+
| 2 | Wuhan Changjiang Bridge |
+------+-----------------------+MATCH() PHRASE()
MATCH() PHRASE() searches for rows where a specific column contains a sequence of keywords in a given order. Use the slop parameter to allow some positional flexibility between the keywords.
Syntax
SELECT * FROM `table_name` WHERE MATCH (`column_name`) PHRASE('term1 term2') [slop(n)]Parameters
| Parameter | Description |
|---|---|
table_name | The table to search |
column_name | The column to search |
term1 term2 | The keywords to match, separated by spaces. Keyword order affects results |
slop(n) | (Optional) The maximum number of positional moves allowed to match the keyword sequence. Default: 0. Valid values: integers from 0 to 6. |
Understanding slop:
After tokenization, each word is assigned a position starting from 0. The slop value specifies how many positions a word can be moved to match the specified sequence.
For example, the Standard analyzer tokenizes "Merchandise and Services" as:
| Word | Position |
|---|---|
| Merchandise | 0 |
| and | 1 |
| Services | 2 |
PHRASE('Merchandise and'): slop distance is 1 (move one position to align)PHRASE('Merchandise Services'): slop distance is 2 (skip overand)
Example 1: Phrase search
The Standard analyzer tokenizes "Hangzhou, Zhejiang Province" into Hangzhou, Zhejiang, and Province.
Search for zhejiang hangzhou (reversed order, slop 0 — no match expected):
SELECT id, content
FROM tbl_fulltext_demo
WHERE MATCH(content_standard) PHRASE('zhejiang hangzhou');Result: Empty set
Search for hangzhou Province with slop 1 (one word between them):
SELECT id, content
FROM tbl_fulltext_demo
WHERE MATCH(content_standard) PHRASE('hangzhou Province') slop(1);Result:
+------+-----------------------------+
| id | content |
+------+-----------------------------+
| 3 | Hangzhou, Zhejiang Province |
+------+-----------------------------+Search for zhejiang hangzhou with slop 2 (reversed order, within 2 moves):
SELECT id, content
FROM tbl_fulltext_demo
WHERE MATCH(content_standard) PHRASE('zhejiang hangzhou') slop(2);Result:
+------+-----------------------------+
| id | content |
+------+-----------------------------+
| 3 | Hangzhou, Zhejiang Province |
+------+-----------------------------+Example 2: Phrase search with Chinese text
In Chinese phrase matching scenarios, use the Standard analyzer or the IK analyzer. The AliNLP analyzer and the Ngram analyzer may produce unexpected results.
Search for Products Services with slop 1 using the IK analyzer:
SELECT id, content
FROM tbl_fulltext_demo
WHERE MATCH(content_ik) PHRASE('Products Services') slop(1);Result:
+------+---------------------------------------------+
| id | content |
+------+---------------------------------------------+
| 1 | Customers Need Better Products and Services |
+------+---------------------------------------------+Search for Products Services with slop 2 using the Standard analyzer:
SELECT id, content
FROM tbl_fulltext_demo
WHERE MATCH(content_standard) PHRASE('Products Services') slop(2);Result:
+------+---------------------------------------------+
| id | content |
+------+---------------------------------------------+
| 1 | Customers Need Better Products and Services |
+------+---------------------------------------------+Highlight keywords
Use the fulltext_highlight(column_name) function to highlight matched keywords in the result. By default, matched terms are wrapped in <em> and </em> tags.
Use fulltext_highlight() with MATCH() AGAINST()
Single-column query with keyword highlighting:
SELECT MATCH(content_alinlp) AGAINST('Wuhan Changjiang') AS score,
fulltext_highlight(content_alinlp)
FROM tbl_fulltext_demo
WHERE MATCH(content_alinlp) AGAINST('Wuhan Changjiang') > 0.9
ORDER BY score DESC
LIMIT 3;WHERE MATCH(content_alinlp) AGAINST('Wuhan Changjiang') > 0.9 filters out the bottom 90% of results by relevance and returns the top 10%.
Result:
+--------------------+-----------------------------------------+
| score | fulltext_highlight(content_alinlp) |
+--------------------+-----------------------------------------+
| 0.4413304328918457 | <em>Wuhan</em> <em>Changjiang</em> Bridge |
+--------------------+-----------------------------------------+Multi-column query with keyword highlighting:
SELECT MATCH(content_alinlp) AGAINST('Wuhan Changjiang') AS score,
fulltext_highlight(content_alinlp)
FROM tbl_fulltext_demo
WHERE MATCH(content_alinlp) AGAINST('Wuhan Changjiang') > 0.9
AND MATCH(content_alinlp) AGAINST('Bridge') > 0.9
ORDER BY score DESC
LIMIT 3;Use fulltext_highlight() with MATCH() FUZZY()
SELECT id,
MATCH(content_standard) FUZZY('Wuhan Chang') AS score,
fulltext_highlight(content_standard)
FROM tbl_fulltext_demo
WHERE MATCH(content_standard) FUZZY('Wuhan Chang');Result:
+------+-------+--------------------------------------------------+
| id | score | fulltext_highlight(content_standard) |
+------+-------+--------------------------------------------------+
| 2 | 0.0 | <em>Wuhan</em> Chang<em>jiang</em> Bridge |
+------+-------+--------------------------------------------------+Use fulltext_highlight() with MATCH() PHRASE()
SELECT id,
MATCH(content_ik) PHRASE('Products Services') slop(1) AS score,
fulltext_highlight(content_ik)
FROM tbl_fulltext_demo
WHERE MATCH(content_ik) PHRASE('Products Services') slop(1);Result:
+------+--------------------+------------------------------------------------------------------+
| id | score | fulltext_highlight(content_ik) |
+------+--------------------+------------------------------------------------------------------+
| 1 | 0.1692247390747070 | Customers Need Better <em>Products</em> and <em>Services</em> |
+------+--------------------+------------------------------------------------------------------+Add custom highlight tags
To use custom HTML tags instead of <em> and </em>, set fulltext_highlight_pre_tag and fulltext_highlight_post_tag in a query hint.
With MATCH() AGAINST():
/*+ fulltext_highlight_pre_tag=<strong> fulltext_highlight_post_tag=</strong> */
SELECT MATCH(content_alinlp) AGAINST('Wuhan Changjiang') AS score,
fulltext_highlight(content_alinlp)
FROM tbl_fulltext_demo
WHERE MATCH(content_alinlp) AGAINST('Wuhan Changjiang') > 0.9
ORDER BY score DESC
LIMIT 3;WHERE MATCH(content_alinlp) AGAINST('Wuhan Changjiang') > 0.9 filters out the bottom 90% of results by relevance and returns the top 10%.
Result:
+--------------------+-------------------------------------------------+
| score | fulltext_highlight(content_alinlp) |
+--------------------+-------------------------------------------------+
| 0.2615291476249695 | <strong>Wuhan</strong> <strong>Changjiang</strong> Bridge |
+--------------------+-------------------------------------------------+With MATCH() FUZZY():
/*+ fulltext_highlight_pre_tag=<strong> fulltext_highlight_post_tag=</strong> */
SELECT MATCH(content_alinlp) FUZZY('Wuhan Changjiang') AS score,
fulltext_highlight(content_alinlp)
FROM tbl_fulltext_demo
WHERE MATCH(content_alinlp) FUZZY('Wuhan Changjiang') > 0.9
ORDER BY score DESC
LIMIT 3;WHERE MATCH(content_alinlp) FUZZY('Wuhan Changjiang') > 0.9 filters out the bottom 90% of results by relevance and returns the top 10%.
Result:
+--------------------+-------------------------------------------------+
| score | fulltext_highlight(content_alinlp) |
+--------------------+-------------------------------------------------+
| 0.0 | <strong>Wuhan</strong> Chang<strong>jiang</strong> Bridge |
+--------------------+-------------------------------------------------+With MATCH() PHRASE():
/*+ fulltext_highlight_pre_tag=<strong> fulltext_highlight_post_tag=</strong> */
SELECT MATCH(content_alinlp) PHRASE('Wuhan') AS score,
fulltext_highlight(content_alinlp)
FROM tbl_fulltext_demo
WHERE MATCH(content_alinlp) PHRASE('Wuhan') > 0.9
ORDER BY score DESC
LIMIT 3;WHERE MATCH(content_alinlp) PHRASE('Wuhan') > 0.9 filters out the bottom 90% of results by relevance and returns the top 10%.
Result:
+---------------------+---------------------------------------------------------------+
| score | fulltext_highlight(content_alinlp) |
+---------------------+---------------------------------------------------------------+
| 0.13076457381248474 | <strong>Wuhan</strong> Changjiang Bridge |
+---------------------+---------------------------------------------------------------+