Custom dictionaries let you control how AnalyticDB for MySQL segments text in full-text indexes. Use them to keep domain-specific terms intact (entities) or to exclude high-frequency noise words (stop words). The workflow has three steps: create a dictionary table, add entries, then attach the dictionary to a full-text index.
Prerequisites
Before you begin, ensure that you have:
DDL and DML permissions on the custom dictionary (required to create or update it)
SELECT permission on the custom dictionary (required to use it in queries)
Overview
A custom dictionary is a table with two fixed columns: value (the word) and type (entity or stop word). Because it behaves like a standard table, you manage it with the same SQL operations — INSERT to add words and DELETE to remove them. Changes take effect immediately: any data subsequently written to indexed tables is segmented using the updated dictionary.
Two word types control tokenization behavior:
`main` — marks a term as an entity that the tokenizer must not split. Use this for domain-specific terms that standard tokenization would break apart, such as place names and organization names in the general field, or brands, products, and models in e-commerce scenarios.
`stop` — marks a term to be excluded from the index. Use this for high-frequency words with no search value, such as
fromin SQL audit log scenarios.
Stop words require AnalyticDB for MySQL V3.1.4.24 or later. To check your cluster version, log in to the AnalyticDB for MySQL console and go to the Configuration Information section of the Cluster Information page. To upgrade, see Update the minor version of a cluster.
Limits
DDL changes on custom dictionaries are not supported.
UPDATE and TRUNCATE operations are not supported on custom dictionaries.
A custom dictionary must be used together with a full-text index.
Delete the associated custom dictionary before deleting a full-text index.
Each cluster supports one custom dictionary by default. Contact technical support to create additional dictionaries.
Each dictionary supports up to 10,000 entries by default. Contact technical support to raise this limit.
Create a custom dictionary
Syntax
CREATE TABLE [IF NOT EXISTS] table_name (
`value` VARCHAR NOT NULL COMMENT 'column_comment',
`type` VARCHAR NOT NULL [DEFAULT 'main|stop' COMMENT 'column_comment'],
PRIMARY KEY (`value`, `type`)
) COMMENT 'table_comment'
FULLTEXT_DICT = 'Y',
INDEX_ALL = 'Y|N'Constraints
A custom dictionary can contain only the
valueandtypecolumns.Both columns are VARCHAR and cannot be empty.
The primary key must include both
valueandtype.
Parameters
| Parameter | Description |
|---|---|
table_name | Name of the dictionary table. Must be 1–127 characters and can contain letters, digits, and underscores (_). Must start with a letter or underscore. |
value | Stores the word. The column name is fixed to value. |
type | Stores the word type. The column name is fixed to type. Valid values: main (default) for entities; stop for stop words. |
COMMENT | Comment for the column or table. |
FULLTEXT_DICT | Marks the table as a dictionary table. Set to Y. |
INDEX_ALL | Specifies which columns are indexed. Y indexes all columns; N indexes only the primary key. See the note below for engine-specific defaults. |
`INDEX_ALL` and table engines:
The correct INDEX_ALL setting depends on your cluster's table engine:
Clusters earlier than V3.2.2.0 use the XUANWU engine.
INDEX_ALLdefaults toY. Use the default value.Clusters V3.2.2.0 or later may use XUANWU_V2. Run the following statement to check:
SHOW ADB_CONFIG KEY=RC_DDL_ENGINE_REWRITE_XUANWUV2;If the result is
true, the default engine is XUANWU_V2 andINDEX_ALLdefaults toN. Explicitly setINDEX_ALL = 'Y'when creating dictionary tables on these clusters, or the operation will fail. If the result isfalseand you do not explicitly set the table engine to XUANWU_V2, the default table engine is XUANWU.
Example
Create a dictionary table named tbl_dict_name:
CREATE TABLE tbl_dict_name (
`value` VARCHAR NOT NULL COMMENT 'value of an entity or a stop word',
`type` VARCHAR NOT NULL [DEFAULT 'main' COMMENT 'main: entity (not segmented); stop: excluded from index (V3.1.4.24+)'],
PRIMARY KEY (`value`, `type`)
) COMMENT = 'your dictionary table'
FULLTEXT_DICT = 'Y',
INDEX_ALL = 'Y';Update a custom dictionary
INSERT and DELETE operations take effect immediately. Data written after the update is segmented using the latest dictionary entries.
Examples
Add a stop word:
INSERT INTO `tbl_dict_name` (`value`, `type`) VALUES ('and', 'stop');Remove a word:
DELETE FROM `tbl_dict_name` WHERE `value` = 'and' AND `type` = 'stop';Use a custom dictionary
Specify the dictionary when creating a full-text index in a CREATE TABLE or ALTER TABLE statement.
Syntax
FULLTEXT INDEX idx_name(`column_name`) [ WITH ANALYZER analyzer_name ] [ WITH DICT tbl_dict_name];Parameters
| Parameter | Description |
|---|---|
idx_name | Name of the full-text index. |
column_name | Column on which to create the full-text index. |
WITH ANALYZER analyzer_name | (Optional) Analyzer to use. For supported analyzers, see Analyzers for full-text indexes. |
WITH DICT tbl_dict_name | The custom dictionary to apply. |
Example
Add a full-text index on the content column of tbl_fulltext_demo, using the AliNLP analyzer and the tbl_ext_dict dictionary:
ALTER TABLE `tbl_fulltext_demo`
ADD FULLTEXT INDEX fidx_c(`content`)
WITH ANALYZER alinlp
WITH DICT `tbl_ext_dict`;全文索引可见策略
AnalyticDB for MySQL的全文实现了写入成功即可查询的可见策略。