The aliyun-sql plug-in is developed based on Apache Calcite and deployed on a server. It is used to parse SQL queries. After you install this plug-in on your Alibaba Cloud Elasticsearch cluster, you can execute SQL statements to query data in the cluster the same way as in common databases. This greatly reduces the training and usage costs of Elasticsearch.
Background information
The aliyun-sql plug-in offers more features than open source SQL plug-ins. The following table compares the aliyun-sql plug-in with open source SQL plug-ins.
SQL plug-in | SQL parser | Paged query | Join | Nested | Common function | Case function | Extended UDF | Optimization of execution plans |
x-pack-sql (6.x) | Antlr | Supported. | Not supported. | Supported. The syntax is a.b. | Supports abundant functions. | Not supported. | Not supported. | Provides a large number of rules for the optimization of execution plans. |
opendistro-for-elasticsearch | Druid | Not supported. The maximum number of data entries that can be queried is determined by the | Supported. | Supported. The syntax is | Supports a few functions. | Not supported. | Not supported. | Provides a few rules for the optimization of execution plans. |
aliyun-sql | Javacc | Supported. | Supported. The plug-in provides the truncation feature. This feature allows you to dynamically configure the number of data entries that you can query from a single table. For more information, see Syntax overview. | Supported. The syntax is a.b. | Supports abundant functions. For more information, see Other functions and expressions. | Supported. | Supported. For more information, see UDFs. | Provides a large number of rules for the optimization of execution plans and uses Calcite to optimize execution plans. |
The CASE statement in the preceding table refers to the CASE WHEN THEN ELSE syntax.
Prerequisites
You have completed the following operations:
Create an Alibaba Cloud Elasticsearch cluster whose version is V6.7.0 or later but is earlier than V7.10.0.
For more information, see Create an Alibaba Cloud Elasticsearch cluster.
Install the aliyun-sql plug-in.
The plug-in is installed by default. You can check whether the plug-in is installed on the plug-in configuration page of the cluster. If the plug-in is not installed, follow the instructions provided in Install and remove a built-in plug-in to install the plug-in.
Limits
The aliyun-sql plug-in is available only for Alibaba Cloud Elasticsearch clusters whose versions are V6.7.0 or later but are earlier than V7.10.0.
Precautions
Before you use the plug-in, make sure that the aliyun.sql.enabled parameter is set to true for your Elasticsearch cluster. You can configure the parameter in the Kibana console. For more information, see Log on to the Kibana console.
You can manually remove the plug-in. Before you remove the plug-in, run the following command in the Kibana console to disable it. To disable the plug-in, set aliyun.sql.enabled to null. The following code shows the configuration:
PUT _cluster/settings { "persistent": { "aliyun.sql.enabled": null } }
Plug-in removal restarts your cluster. If you do not disable the plug-in before you remove it, your cluster remains stuck in the restart. In this case, you must run the following command to clear the archiving configurations and resume the restart:
PUT _cluster/settings { "persistent": { "archived.aliyun.sql.enabled": null } }
Syntax overview
The aliyun-sql plug-in uses the syntax of MySQL 5.0 and supports a wide range of functions and expressions. For more information, see Other functions and expressions.
Basic query
SELECT [DISTINCT] (* | expression) [[AS] alias] [, ...] FROM table_name [WHERE condition] [GROUP BY expression [, ...] [HAVING condition]] [ORDER BY expression [ ASC | DESC ] [, ...]] [LIMIT [offset, ] size]
Join query
SELECT expression FROM table_name JOIN table_name ON expression [WHERE condition]
ImportantWhen you perform a join query, Alibaba Cloud Elasticsearch limits the maximum number of data entries that you can query from a single table. The default number is 10,000. You can specify the maximum number of data entries that you can query from a table by configuring the max.join.size parameter. For example, you can run the following command in the Kibana console to set the maximum number to 20,000:
PUT /_cluster/settings { "transient": { "max.join.size": 20000 } }
The join query you performed is an inner join query. Actually, the aliyun-sql plug-in performs a merge join for the query. When you perform a join query, make sure that the field values in the tables you want to join change with the document IDs of Elasticsearch. Join queries can be performed only for fields of a numeric data type.
Procedure
- Log on to the Kibana console of your Elasticsearch cluster and go to the homepage of the Kibana console as prompted. For more information about how to log on to the Kibana console, see Log on to the Kibana console.Note In this example, an Elasticsearch V6.7.0 cluster is used. Operations on clusters of other versions may differ. The actual operations in the console prevail.
- In the left-side navigation pane of the page that appears, click Dev Tools.
On the Console tab, run the following command to enable the plug-in:
PUT _cluster/settings { "transient": { "aliyun.sql.enabled": true } }
Write data.
NoteThe aliyun-sql plug-in supports only query requests does not support write requests. Therefore, the following code uses a bulk request to write data.
Data of student information
PUT stuinfo/_doc/_bulk?refresh {"index":{"_id":"1"}} {"id":572553,"name":"xiaoming","age":"22","addr":"addr1"} {"index":{"_id":"2"}} {"id":572554,"name":"xiaowang","age":"23","addr":"addr2"} {"index":{"_id":"3"}} {"id":572555,"name":"xiaoliu","age":"21","addr":"addr3"}
Data of student rankings
PUT sturank/_doc/_bulk?refresh {"index":{"_id":"1"}} {"id":572553,"score":"90","sorder":"5"} {"index":{"_id":"2"}} {"id":572554,"score":"92","sorder":"3"} {"index":{"_id":"3"}} {"id":572555,"score":"86","sorder":"10"}
Execute an SQL statement.
Perform a join query to query the name and ranking of a student.
POST /_alisql { "query":"select stuinfo.name,sturank.sorder from stuinfo join sturank on stuinfo.id=sturank.id" }
If the statement is successfully executed, the aliyun-sql plug-in returns table information. The columns field contains column names and data types. The rows field contains row data.
{ "columns" : [ { "name" : "name", "type" : "text" }, { "name" : "sorder", "type" : "text" } ], "rows" : [ [ "xiaoming", "5" ], [ "xiaowang", "3" ], [ "xiaoliu", "10" ] ] }