analysis-ik is the IK analysis plug-in for Alibaba Cloud Elasticsearch. This plug-in cannot be removed. In addition to the features of the open source IK plug-in, analysis-ik can dynamically load dictionaries that are stored in Object Storage Service (OSS), and it supports two methods to update dictionaries: standard update and rolling update. This topic describes how to use the plug-in.
Background information
Update method | Application mode | Loading mode | Description |
---|---|---|---|
Standard update | This method updates the dictionaries on all nodes in an Elasticsearch cluster. It requires a restart of the cluster for the update to take effect. | The system sends an uploaded dictionary file to all nodes in an Elasticsearch cluster, modifies the IKAnalyzer.cfg.xml file, and then restarts the nodes to load the file. | You can use the standard update method to update the built-in IK main dictionary and stopword list of the analysis-ik plug-in. In the Standard Update pane, you can view the built-in main dictionary SYSTEM_MAIN.dic and the built-in stopword list SYSTEM_STOPWORD.dic. |
Rolling update | The first time you upload a dictionary file, the dictionaries on all nodes in an Elasticsearch cluster are updated. The cluster needs to be restarted for the update to take effect. If the dictionary file that you upload has the same name as the existing dictionary file, the cluster does not need to be restarted. The dictionaries are directly loaded while the cluster is running. | If the content of a dictionary file changes, you can use this method to update the dictionaries on all nodes in an Elasticsearch cluster. After you upload the latest dictionary file, the nodes automatically load the file. If the dictionary file list changes when you perform a rolling update, all nodes in the cluster need to reload dictionary configurations. For example, when you upload a new dictionary file or delete an existing dictionary file, the changes are synchronized to the IKAnalyzer.cfg.xml file. | When you upload a dictionary file for the first time, the system modifies the IKAnalyzer.cfg.xml file. After the dictionaries are updated, the cluster must be restarted for the update to take effect. |
New dictionaries apply only to data that is inserted after a standard or rolling update. If you want to apply the new dictionaries to both the existing data and new data, you must reindex the existing data.
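Reindexing is usually done with the Elasticsearch `_reindex` API, which copies documents from one index to another so that they are analyzed again at write time. The following Python code is a minimal sketch of such a request. The endpoint URL and the index names `articles_v1` and `articles_v2` are hypothetical, authentication is omitted, and the sketch assumes that the destination index already exists with the analyzer settings you want.

```python
import json
import urllib.request


def build_reindex_request(base_url, source_index, dest_index):
    """Build a POST /_reindex request that copies source_index into dest_index."""
    body = {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example usage (hypothetical endpoint and index names):
# req = build_reindex_request("http://localhost:9200", "articles_v1", "articles_v2")
# print(send(req))
```

Writing to a new index and then switching queries to it keeps the original index available until the copy is complete.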
- If you want to update the built-in main dictionary, upload a dictionary file named SYSTEM_MAIN.dic. The new dictionary file automatically overwrites the existing file. For more information, see IK Analysis for Elasticsearch.
- If you want to update the built-in stopword list, upload a file named SYSTEM_STOPWORD.dic. The new file automatically overwrites the existing file. For more information, see IK Analysis for Elasticsearch and Configure a stopword list.
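Before you upload a file, you may want to generate it from a list of terms. The following Python code is a minimal sketch that assumes the format used by the open source IK plug-in: a UTF-8 text file with one term per line. The function name `write_ik_dictionary` and the sample terms are illustrative only. Confirm the file requirements shown in the console before you upload.

```python
from pathlib import Path


def write_ik_dictionary(path, terms):
    """Write terms to a UTF-8 file, one per line.

    Blank entries and duplicates are dropped; the first occurrence
    of each term keeps its position. Returns the terms written.
    """
    seen = set()
    cleaned = []
    for term in terms:
        term = term.strip()
        if term and term not in seen:
            seen.add(term)
            cleaned.append(term)
    Path(path).write_text("".join(t + "\n" for t in cleaned), encoding="utf-8")
    return cleaned


# Example usage: a file with this name would overwrite the built-in main dictionary.
# write_ik_dictionary("SYSTEM_MAIN.dic", ["云计算", "对象存储"])
```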
Prerequisites
Your Elasticsearch cluster is in a normal state. You can check the cluster status on the Basic Information page.
Perform a standard update for IK dictionaries
Perform a rolling update for IK dictionaries
Configure a stopword list
Alibaba Cloud Elasticsearch provides a built-in stopword list. The list contains the following predefined tokens: a, an, and, are, as, at, be, but, by, for, if, in, into, is, it, no, not, of, on, or, such, that, the, their, then, there, these, they, this, to, was, will, with.
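To see what the stopword list does to analyzed text, the following Python code simulates the filtering step. It is an illustration only, not the plug-in's implementation: the names `DEFAULT_STOPWORDS` and `remove_stopwords` are hypothetical, and the tokens are assumed to be lowercase already.

```python
# The predefined tokens of the built-in stopword list, as listed in this topic.
DEFAULT_STOPWORDS = frozenset(
    "a an and are as at be but by for if in into is it no not of on or "
    "such that the their then there these they this to was will with".split()
)


def remove_stopwords(tokens, stopwords=DEFAULT_STOPWORDS):
    """Return the tokens that are not in the stopword list, in their original order."""
    return [token for token in tokens if token not in stopwords]


tokens = ["the", "quick", "fox", "is", "not", "slow"]

# With the default list, "the", "is", and "not" are dropped.
print(remove_stopwords(tokens))

# After "not" is removed from the list, it is kept as a searchable token.
print(remove_stopwords(tokens, DEFAULT_STOPWORDS - {"not"}))
```

The second call shows why you might remove a token from the list: a word such as "not" can change the meaning of a query, so dropping it during analysis may return unwanted matches.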
You can perform the following steps to remove tokens from the stopword list: