This topic describes how to use custom analyzers and dictionaries of the full-text index feature.

In most scenarios, the default analysis-aliws plug-in can efficiently perform word analysis without the need to manually perform the operation. You can also use the analysis-ik plug-in. In some special scenarios, you can use custom dictionaries to obtain word analysis results that better meet your business requirements. AnalyticDB for MySQL V3.0 provides flexible custom dictionary capabilities. You can manage custom dictionaries in the same manner as you manage standard tables. Newly written data takes effect in real time. Custom dictionaries provided by AnalyticDB for MySQL V3.0 can be used for indexes. This makes custom dictionaries more flexible to use.

Note Full-text index analyzers support languages such as English. Specific analysis effects are evaluated based on the analyzers that are used.

Custom analyzers

The full-text index feature of AnalyticDB for MySQL V3.0 supports the analysis-aliws and analysis-ik plug-ins. You can specify an analyzer when you create full-text indexes. If you do not specify an analyzer, the analysis-aliws plug-in is used.

Example:

CREATE TABLE fulltext_test (
    id int,
    title varchar,
    body varchar,
    FULLTEXT INDEX t_idx(title) WITH ANALYZER alinlp,
    FULLTEXT INDEX b_idx(title) WITH ANALYZER ik,
    PRIMARY KEY (id)
)
DISTRIBUTE BY HASH(id);

Custom dictionaries

When you create full-text indexes and query data, documents or statements to be queried must be analyzed. Analysis results can affect the index creation and query results. In most scenarios, you can obtain ideal results by using the analysis-aliws plug-in. In some special scenarios, analysis results can cause different query results, which does not meet your actual business requirements. Custom dictionaries can be flexibly enriched to address this issue.

The full-text index feature of AnalyticDB for MySQL V3.0 supports custom dictionaries. You can add or delete data in a dictionary table, or create multiple dictionary tables. This way, you can use different dictionaries in different scenarios.

Create a custom dictionary table

The following syntax can be used to create a custom dictionary table in AnalyticDB for MySQL V3.0:
CREATE TABLE `ext_dict` (
  `value` varchar(255) NOT NULL COMMENT 'Value of an extension word or stop word',
  `type` varchar(4) NOT NULL DEFAULT 'main' COMMENT 'main specifies an extension word, and stop specifies a stop word. Stop words are not supported.',
  PRIMARY KEY (`value`,`type`)
) COMMENT='Your dictionary table'
FULLTEXT_DICT = 'Y';
Limits:
  1. A dictionary can contain only two fields: value and type.
  2. The value and type fields are of the VARCHAR type and cannot be empty.
  3. The primary key of a dictionary must contain the value and type fields.
Field description:
  1. value indicates the specific content of a term.
  2. type indicates the type of a term, such as extension word or stop word. Only extension words are valid. Terms of other types can be written but do not take effect.

AnalyticDB for MySQL limits the number of full-text dictionaries that can be created. You can create one full-text dictionary table in a logical database. A maximum of ten full-text dictionary tables can be created in a physical database.

Update a custom dictionary

A dictionary is a table on which read and write operations can be performed as standard tables. Dictionary tables have the following limits:
  1. Online data definition language (DDL) statements cannot be executed on dictionary tables.
  2. Dictionary tables cannot be updated or truncated.
  3. Indexes cannot be disabled.
  4. Full-text indexes, custom dictionaries, and custom analyzers are not supported.
  5. Dictionary tables cannot be deleted when they are being used by other indexes.
Insert a term:
INSERT INTO ext_dict (`value`) VALUES ('China');
Delete a term:
DELETE FROM ext_dict WHERE `value` = 'China' AND `type` = 'main';

A maximum of 10,000 records can be inserted into a dictionary. If you want to modify the maximum number of records to be inserted, submit a ticket.

After you insert or delete terms in a dictionary, the dictionary immediately takes effect. The latest terms in a dictionary are used to analyze new data written to a data table.

Use a custom dictionary

When you create a full-text index, you can specify a custom dictionary that can be used by the index. The following sample code shows how to create a full-text index:

ALTER TABLE `test` ADD FULLTEXT INDEX t_idx(`title`) WITH DICT [logical_schema.]table;

You can use the WITH DICT statement to specify the custom dictionary that you want to use. The specified custom dictionary must meet the following requirements:

  1. It is a custom dictionary table.
  2. You are granted the permission to select the dictionary.
If a single term in a custom dictionary has both Chinese characters and English letters such as China, you must specify the analysis-aliws plug-in. Example:
ALTER TABLE `test` ADD FULLTEXT INDEX t_idx(`title`) WITH DICT ext_dict WITH ANALYZER alinlp;

Full-text index policy

If full-text indexes are created in real time, high overheads are generated. AnalyticDB for MySQL V3.0 adopts an index creation policy that makes indexes visible within seconds for data that is written in real time. If you want to update the policy to meet your business requirements, contact AnalyticDB for MySQL technical support.