How to quickly build an instance that supports the online multi-table join feature

This topic describes how to quickly build an instance that supports the online multi-table join feature.

Prerequisites

Create an Alibaba Cloud account and complete identity verification.
When you log on to the Alibaba Cloud console for the first time, you are prompted to create an AccessKey.

Procedure

Purchase and create an application.
Configure the application by defining the application schema and index schema, and configuring the data source.
Test the search feature.

Create an application

On the Instance Management page of OpenSearch High-performance Search Edition, click Create Instance.
Select the instance specifications, such as the edition, billing method, region and zone, application name, cluster preference, storage capacity, and computing resources. Then, click Buy Now.
- Edition: High-performance Search Edition.
- Product Type: Subscription or Pay-as-you-go.
- Region and Zone: Select the desired region and zone.
- Application Name: Enter a custom name.
- Cluster Preference: Storage-optimized Dedicated Cluster.
- Storage Capacity and Computing Resources: Keep the default values or specify the required capacity and resources.
Confirm the order, select the terms of service, and then click Activate Now.
After the instance is activated, it appears in the instance list in the console with a status of Pending.

Configure the application

On the instance list page in the console, find the new instance and click Configure in the Actions column.
Define the application schema. You can create an application schema in one of the following four ways.
- Create an application schema from a template. You can save a defined application schema as a template and use it to quickly create a new application.
- Create an application schema by uploading a document. You can upload an existing data file in JSON format. The system automatically parses the file and creates an initial application schema. You must then redefine the field types and other settings.
- Create an application schema from a data source. This method is suitable for data synchronization from sources such as RDS and MaxCompute. This feature quickly generates an initial application schema from the source table schema. This reduces manual configuration and lowers the chance of errors. For more information, see Data Source Configuration.
- Create an application schema manually. Use this method for quick testing. This topic uses this method to create two tables as an example.
For more information about field types, see Application schema in OpenSearch High-performance Search Edition.
Note
The total number of tables cannot exceed the system limit of eight.
Configure the index schema. You must configure the index schema for each table separately.
- For more information about how to configure the index schema, see Index schema.
- For more information about how to select an analysis method, see Text analyzers.
- Attribute field selection: Set a field as an attribute field to use it in a SELECT, WHERE, or ORDER BY clause.
Note
- Fields of the FLOAT, FLOAT_ARRAY, DOUBLE, or DOUBLE_ARRAY type cannot be set as index fields.
- Fields of the TEXT or SHORT_TEXT type cannot be set as attribute fields.
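The restrictions above can be checked locally before you enter a schema in the console. The following Python sketch is illustrative only: the dictionary layout, the `check_schema` function, and the table and field names are assumptions made for this example and are not an OpenSearch format or API. Only the rules themselves (the limit of eight tables and the field types that cannot be index or attribute fields) come from this topic.

```python
# Hypothetical schema checker. The data layout is an assumption for
# illustration, not the format that OpenSearch uses.
MAX_TABLES = 8
NO_INDEX_TYPES = {"FLOAT", "FLOAT_ARRAY", "DOUBLE", "DOUBLE_ARRAY"}
NO_ATTRIBUTE_TYPES = {"TEXT", "SHORT_TEXT"}


def check_schema(tables):
    """Return a list of rule violations found in `tables`.

    `tables` maps a table name to a list of field dicts with the keys
    "name", "type", "index" (bool), and "attribute" (bool).
    """
    problems = []
    if len(tables) > MAX_TABLES:
        problems.append(f"{len(tables)} tables exceed the limit of {MAX_TABLES}")
    for table, fields in tables.items():
        for field in fields:
            label = f"{table}.{field['name']}"
            if field.get("index") and field["type"] in NO_INDEX_TYPES:
                problems.append(f"{label}: {field['type']} cannot be an index field")
            if field.get("attribute") and field["type"] in NO_ATTRIBUTE_TYPES:
                problems.append(f"{label}: {field['type']} cannot be an attribute field")
    return problems


# Two made-up tables, each with one deliberate mistake.
example = {
    "item": [
        {"name": "item_id", "type": "INT", "index": True, "attribute": True},
        {"name": "title", "type": "TEXT", "index": True, "attribute": True},
    ],
    "item_stats": [
        {"name": "item_id", "type": "INT", "index": True, "attribute": True},
        {"name": "score", "type": "DOUBLE", "index": True, "attribute": True},
    ],
}

for problem in check_schema(example):
    print(problem)
```

In this example, the `title` field is flagged because a TEXT field cannot be an attribute field, and the `score` field is flagged because a DOUBLE field cannot be an index field.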
Configure the routing field.
An OpenSearch instance uses a distributed backend deployment. In multi-table scenarios, the data from different tables that needs to be joined must reside on the same machine, as shown in the following figure.
When building an index, the engine hashes records based on the configured routing field and stores records with the same hash value in the same column. During a query, the QRS worker sends requests to each column. Each column then performs an internal join based on the SQL query. Joins are not performed across columns. Finally, each column returns the join results to the QRS worker, which aggregates the results and returns them to the user.
Note
- By default, the system uses the primary key as the routing field.
- You can select only one field as the routing field.
- The values of the routing field must be globally unique.
- The routing field supports the INT and LITERAL data types.
- To join tables on a non-primary key, you must configure the join field as the routing field.
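The routing behavior described above can be illustrated with a small simulation. The following Python sketch is not the engine's implementation: the CRC32 hash, the column count of 4, and the `item` and `item_stats` sample rows are all assumptions chosen for illustration. It shows only the principle that rows with the same routing value land in the same column, that each column joins its own rows, and that the partial results are then merged.

```python
import zlib
from collections import defaultdict

NUM_COLUMNS = 4  # assumed column count for this sketch, not an OpenSearch default


def column_for(routing_value, num_columns=NUM_COLUMNS):
    """Map a routing-field value to a column with a stable hash (CRC32 here)."""
    return zlib.crc32(str(routing_value).encode("utf-8")) % num_columns


def build_columns(rows, routing_field, num_columns=NUM_COLUMNS):
    """Distribute rows over columns according to their routing-field value."""
    columns = defaultdict(list)
    for row in rows:
        columns[column_for(row[routing_field], num_columns)].append(row)
    return columns


def distributed_join(left, right, key, num_columns=NUM_COLUMNS):
    """Join inside each column only, then merge the partial results."""
    left_cols = build_columns(left, key, num_columns)
    right_cols = build_columns(right, key, num_columns)
    results = []
    for col in range(num_columns):
        # Local join: rows stored in other columns are never consulted.
        by_key = defaultdict(list)
        for right_row in right_cols.get(col, []):
            by_key[right_row[key]].append(right_row)
        for left_row in left_cols.get(col, []):
            for right_row in by_key.get(left_row[key], []):
                results.append({**left_row, **right_row})
    return results


# Hypothetical sample data: both tables use item_id as the routing field.
items = [
    {"item_id": 1, "title": "pen"},
    {"item_id": 2, "title": "notebook"},
    {"item_id": 3, "title": "ruler"},
]
item_stats = [
    {"item_id": 1, "sales": 120},
    {"item_id": 2, "sales": 80},
    {"item_id": 3, "sales": 200},
]

joined = distributed_join(items, item_stats, "item_id")
print(sorted(joined, key=lambda row: row["item_id"]))
```

Because both tables are routed by the join field, every pair of matching rows is stored in the same column and the merged result is complete. If the two tables were routed by different fields, matching rows could be stored in different columns and the per-column join would miss them.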
Configure the data source by selecting a data source type that is supported by OpenSearch High-performance Search Edition.
Click Add Data Source and configure the data source.
After the configuration is complete, click Completed.
On the instance details page, wait until the offline application status changes to Normal. You can then perform queries.

Test the search feature

After the offline application status changes to Normal and the application is published, you can test the search feature on the Search Test page.
Online multi-table joins currently support only SQL queries.
For more information about the SQL syntax, see SQL support.
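To give a sense of the shape of a two-table join query, the following Python sketch runs a standard SQL join against an in-memory SQLite database. This is an assumption-laden illustration and does not call OpenSearch: the `item` and `item_stats` tables, their columns, and the data are made up, the SQLite column types are unrelated to OpenSearch field types, and the OpenSearch SQL dialect may differ from what SQLite accepts. Check the SQL support topic before you adapt the statement.

```python
import sqlite3

# Local stand-in for the two tables; nothing here connects to OpenSearch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (item_id INTEGER PRIMARY KEY, title TEXT, price INTEGER);
    CREATE TABLE item_stats (item_id INTEGER PRIMARY KEY, sales INTEGER);
    INSERT INTO item VALUES (1, 'pen', 3), (2, 'notebook', 5), (3, 'ruler', 2);
    INSERT INTO item_stats VALUES (1, 120), (2, 80), (3, 200);
""")

# Join on the field that both tables share, filter with WHERE,
# and sort with ORDER BY.
JOIN_SQL = """
    SELECT i.item_id, i.title, s.sales
    FROM item AS i
    JOIN item_stats AS s ON i.item_id = s.item_id
    WHERE i.price >= 3
    ORDER BY s.sales DESC
"""

rows = conn.execute(JOIN_SQL).fetchall()
for row in rows:
    print(row)
```

In an OpenSearch application, the fields that appear in the SELECT, WHERE, and ORDER BY clauses would need to be configured as attribute fields, and the join field would need to be the routing field of the tables involved.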

Usage notes

Only instances whose Cluster Preference is Storage-optimized Dedicated Cluster support online multi-table joins.
In multi-table join scenarios, only SQL queries are supported.
In multi-table join scenarios, custom analyzers are not supported.
In multi-table join scenarios, sort configurations are not supported. However, you can use ORDER BY in SQL queries to sort results.
In multi-table join scenarios, search result display settings are not supported.
Instances in Storage-optimized Dedicated Clusters cannot be upgraded or downgraded to other specifications.