All Products
Search
Document Center

OpenSearch:What are the differences in syntax between SQL statements and search clauses in OpenSearch?

Last Updated:Feb 09, 2023

OpenSearch focuses on searches and provides search clauses to implement SQL features. Search clauses in OpenSearch and SQL statements use different syntax but can achieve the same search performance. The following sections describe how to implement the major SQL features in OpenSearch in detail.

Comparison of common clauses

Most developers are familiar with SQL statements. This section describes how OpenSearch implements the following SQL features based on the implementation logic of SQL statements:

(1) FROM (2) ON (3) JOIN (4) WHERE (5) GROUP BY (6) WITH {CUBE | ROLLUP} (7) HAVING (8) SELECT (9) DISTINCT (10) ORDER BY (11) LIMIT.

1. FROM:

    • In SQL statements, FROM is followed by a table name or dataset name. FROM clauses are used to query data portions, irrespective of complex nested structures.

    • OpenSearch uses an application as a table unit, and an application corresponds to a type of business logic. Therefore, when you perform searches, you do not need to specify the table name. You need to only specify the host, application name, AccessKey ID, and AccessKey secret in the SDK, such as OpenSearch SDK for Java. For more information, see Demo code for implementing search features.

2. ON and JOIN:

    • In SQL statements, JOIN clauses are used to join two tables or datasets. ON clauses are used to limit the join of tables. In other words, ON clauses specify the conditions that must be met before two tables can be joined.

    • In OpenSearch, JOIN clauses are configured when you define application schemas in the process of creating applications. After JOIN clauses are configured, the data that is imported to OpenSearch is joined to create a wide table. For more information about the implementation logic, see How is data synchronized in OpenSearch? In OpenSearch, ON clauses can be used to filter out table entries that do not meet the requirements when you configure data sources.

3. WHERE:

    • In SQL statements, WHERE clauses are similar to filter conditions, which are used to retain the data that meets the requirements.

    • In OpenSearch, you can use the query or filter clause to filter data. The query clause uses inverted indexes to perform searches, and can achieve high search efficiency. However, the query clause is not as flexible as the filter clause. You cannot use expressions in the query clause to perform calculations. The filter clause uses forward indexes to perform searches. The search efficiency of using the filter clause is lower than that of using the query clause. However, you can use expressions and functions in the filter clause to perform calculations. Therefore, the filter clause can adapt to more diversified scenarios.

4. GROUP BY:

    • In SQL statements, GROUP BY clauses are widely used and complex. In short, GROUP BY clauses are used to group and collect statistics based on functions such as COUNT, SUM, MAX, MIN, and AVG.

    • In OpenSearch, you can use the aggregate clause to achieve the same effect as the GROUP BY clause.

5. WITH and HAVING: OpenSearch does not support WITH and HAVING clauses. You can implement these clauses based on your requirements after return results are obtained.

6. SELECT:

    • In SQL statements, SELECT clauses are used to specify the fields to be returned in each data entry.

    • In OpenSearch, you can use the OpenSearch console or set the default display fields and the fetch_fields parameter to specify the fields to be returned in each data entry.

7. DISTINCT:

    • In SQL statements, DISTINCT clauses are used to delete duplicate values by comparing field values.

    • In OpenSearch, you can use the distinct clause to delete duplicate values. In addition, you can use the distinct clause to scatter fields, which is similar to the concept of grouping data to obtain top N results.

8. ORDER BY:

    • In SQL statements, ORDER BY clauses are used to sort results in ascending or descending order by field or expression.

    • In OpenSearch, you can use the sort clause to sort results by field or document score. In addition, you can use an algorithm model to intervene with documents and then sort the documents based on the scores and field values of the documents. For more information about sort policies, see Sort expressions.

9. LIMIT:

    • In SQL statements, LIMIT clauses are used to implement paging. Two parameters are involved: start and hit. The start parameter specifies the specific entry from which the paging starts. The hit parameter specifies the specific number of entries to be returned.

    • In OpenSearch, you can use the config clause to set the start and hit parameters to implement paging.

Comparison of other clauses

1. LIKE and NOT LIKE:

    • In SQL statements, LIKE and NOT LIKE clauses are used to implement fuzzy match. For example, you can use like '%Beijing%' to retrieve the fields whose value contains 'Beijing'.

    • In OpenSearch, you can use the query clause to retrieve documents. The implementation logic is to analyze the field values and then return the values that match the items that are obtained after analysis. OpenSearch also provides various powerful analyzers, which can be used to achieve the same effect as LIKE clauses. In addition, these analyzers can work more efficiently than LIKE clauses. You can use the logical operator ANDNOT in the query clause to achieve the same effect as NOT LIKE clauses.

2. IN:

    • In SQL statements, you can use the IN operator in a WHERE clause to specify multiple values. Example: WHERE column_name IN (value1, value2, ...);.

    • In OpenSearch, you can use in/not functions in the filter clause to achieve the same effect as the IN operator in SQL statements.