analysis-aliws is a built-in plug-in of Alibaba Cloud Elasticsearch. The plug-in integrates an analyzer and a tokenizer into Elasticsearch, which you can use to search and analyze documents. The plug-in also allows you to upload a custom dictionary file. After the upload, the system automatically performs a rolling update for your cluster and replaces the default dictionary file. This topic describes how to use the analysis-aliws plug-in.
Prerequisites
The analysis-aliws plug-in is installed. If it is not installed, install it. For more information, see Install and remove a built-in plug-in. Make sure that your Elasticsearch cluster offers at least 4 GiB of memory. If your cluster is in a production environment, it must offer at least 8 GiB of memory.
- Alibaba Cloud Elasticsearch V5.0 clusters do not support the analysis-aliws plug-in.
- If the memory size of your cluster does not meet the preceding requirements, upgrade the configuration of your cluster. For more information, see Upgrade the configuration of a cluster.
Background information
After the analysis-aliws plug-in is installed, the following analyzer and tokenizer are integrated into your Elasticsearch cluster:
- Analyzer: aliws, which does not return function words, function phrases, or symbols
- Tokenizer: aliws_tokenizer
You can use the analyzer and tokenizer to search for documents. You can also upload a custom dictionary file to the plug-in. For more information, see Search for a document and Configure a dictionary.
Search for a document
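After the plug-in is installed, you can specify the aliws analyzer in your index mappings and use it in queries. The following is a minimal sketch: the index name aliws_test_index, the field name content, and the sample document are illustrative assumptions, and the mapping uses the typeless syntax of Elasticsearch 7.x, which may differ for your cluster version.

Create an index whose content field uses the aliws analyzer:

PUT /aliws_test_index
{
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "analyzer": "aliws"
      }
    }
  }
}

Add a document to the index:

POST /aliws_test_index/_doc/1
{
  "content": "I like go to school."
}

Search for the document. The match query analyzes the query string with the same analyzer that is configured for the field:

GET /aliws_test_index/_search
{
  "query": {
    "match": {
      "content": "school"
    }
  }
}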
Configure a dictionary
The analysis-aliws plug-in allows you to upload a custom dictionary file to replace the default dictionary file. After you upload the file, all nodes in your Elasticsearch cluster automatically load it. The system does not restart the cluster during this process.
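The exact file name and upload procedure depend on the Elasticsearch console. As a format sketch only, a dictionary file is typically a UTF-8 encoded plain-text file that contains one term per line, for example:

cloud computing
elasticsearch
aliws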
Test the analyzer
Run the following command to test the aliws analyzer:
GET _analyze
{
  "text": "I like go to school.",
  "analyzer": "aliws"
}
If the command is successful, the following result is returned:

{
  "tokens" : [
    {
      "token" : "i",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : "like",
      "start_offset" : 2,
      "end_offset" : 6,
      "type" : "word",
      "position" : 2
    },
    {
      "token" : "go",
      "start_offset" : 7,
      "end_offset" : 9,
      "type" : "word",
      "position" : 4
    },
    {
      "token" : "school",
      "start_offset" : 13,
      "end_offset" : 19,
      "type" : "word",
      "position" : 8
    }
  ]
}
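In this result, the function word "to", the spaces, and the period do not appear, because the aliws analyzer does not return function words, function phrases, or symbols. The aliws_tokenizer output in the next section preserves these tokens.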
Test the tokenizer
Run the following command to test the aliws_tokenizer tokenizer:
GET _analyze
{
  "text": "I like go to school.",
  "tokenizer": "aliws_tokenizer"
}
If the command is successful, the following result is returned:

{
  "tokens" : [
    {
      "token" : "I",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "word",
      "position" : 0
    },
    {
      "token" : " ",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "word",
      "position" : 1
    },
    {
      "token" : "like",
      "start_offset" : 2,
      "end_offset" : 6,
      "type" : "word",
      "position" : 2
    },
    {
      "token" : " ",
      "start_offset" : 6,
      "end_offset" : 7,
      "type" : "word",
      "position" : 3
    },
    {
      "token" : "go",
      "start_offset" : 7,
      "end_offset" : 9,
      "type" : "word",
      "position" : 4
    },
    {
      "token" : " ",
      "start_offset" : 9,
      "end_offset" : 10,
      "type" : "word",
      "position" : 5
    },
    {
      "token" : "to",
      "start_offset" : 10,
      "end_offset" : 12,
      "type" : "word",
      "position" : 6
    },
    {
      "token" : " ",
      "start_offset" : 12,
      "end_offset" : 13,
      "type" : "word",
      "position" : 7
    },
    {
      "token" : "school",
      "start_offset" : 13,
      "end_offset" : 19,
      "type" : "word",
      "position" : 8
    },
    {
      "token" : ".",
      "start_offset" : 19,
      "end_offset" : 20,
      "type" : "word",
      "position" : 9
    }
  ]
}
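If you need the segmentation behavior of aliws_tokenizer with different post-processing, you can combine the tokenizer with token filters in a custom analyzer. The following is a minimal sketch; the index name my_index, the analyzer name my_aliws_analyzer, and the use of the built-in lowercase filter are illustrative assumptions:

PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_aliws_analyzer": {
          "type": "custom",
          "tokenizer": "aliws_tokenizer",
          "filter": ["lowercase"]
        }
      }
    }
  }
}

You can then test the custom analyzer by running GET /my_index/_analyze with "analyzer": "my_aliws_analyzer" in the request body.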