
Application schema and index schema

Last Updated: Sep 09, 2021

Application schema

After you upload data to OpenSearch, the data is stored in offline data tables. To facilitate data uploads, OpenSearch allows you to create multiple data tables based on your business requirements and provides data processing plug-ins. If you create multiple data tables, you must associate them by using related fields. After the data in the data tables is processed, the tables are joined to form an index table. The index table defines search attributes and can be used by the search engine for index building and data searches.

The following sections describe the fields in a data table and the fields in an index table.

Fields in a data table

Data tables receive the data that you upload. Different data processing plug-ins can process different field types. For more information about the limits on field values, see the Limits on fields section in Limits. If a field value is out of the valid range, the value overflows; if a field value exceeds the maximum length, the value is truncated. Therefore, select the correct field types.

INT: 64-bit integer.
INT_ARRAY: Array of 64-bit integers.
FLOAT: Single-precision floating-point number.
FLOAT_ARRAY: Array of single-precision floating-point numbers.
DOUBLE: Double-precision floating-point number.
DOUBLE_ARRAY: Array of double-precision floating-point numbers.
LITERAL: String constant. Supports only exact match.
LITERAL_ARRAY: Array of string constants. Each element supports only exact match.
SHORT_TEXT: Short text. A field of this type cannot exceed 100 bytes in length and supports multiple analysis methods.
TEXT: Long text. A field of this type supports multiple analysis methods.
TIMESTAMP: 64-bit unsigned integer that indicates a timestamp in milliseconds.
GEO_POINT: String constant that indicates a pair of latitude and longitude in the "Latitude value Longitude value" format.

Notes on reserved fields:

  • The following fields are reserved fields and you cannot use them: service_id, ops_app_name, inter_timestamp, index_name, pk, ops_version, and ha_reserved_timestamp.

Notes on fields of array types

  • If you create a field of an array type in an application, you can associate the field with fields of the VARCHAR or STRING type when you configure field mappings for a data source. In addition, you can use a data processing plug-in to process the source fields. For more information, see Use data processing plug-ins.

  • If you use an API operation or OpenSearch SDKs to upload a field of an array type, upload the field as an array instead of a string. Example: String[] literal_array = {"Alibaba Cloud","OpenSearch"};

Notes on fields of timestamp types

  • Fields of the INT or TIMESTAMP type can be mapped to fields of the DATETIME or TIMESTAMP type in a data source, and the field values are automatically converted to timestamps in milliseconds. You can use the range function to retrieve search results by time range, as shown in the following sample query. For more information, see Range queries.
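
The following sample query is a minimal sketch of a search by time range. The index name create_time_index and the millisecond values are assumptions for illustration:

# Retrieves documents whose create_time_index values fall within the specified range, in milliseconds.
query=default:'OpenSearch' AND create_time_index:[1609430400000,1612108800000]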

Index schema

An index schema consists of index fields and attribute fields. Index fields are used for data searches; fields of text types are analyzed before they are used to build indexes. Attribute fields are used for statistics, sorting, filtering, and aggregation.

Fields of the following types can be set as index fields:

INT, INT_ARRAY, TEXT, SHORT_TEXT, LITERAL, LITERAL_ARRAY, TIMESTAMP, GEO_POINT

Fields of the following types cannot be set as index fields:

FLOAT, FLOAT_ARRAY, DOUBLE, DOUBLE_ARRAY

Fields of the following types can be set as attribute fields:

INT, INT_ARRAY, LITERAL, LITERAL_ARRAY, FLOAT, FLOAT_ARRAY, DOUBLE, DOUBLE_ARRAY, TIMESTAMP, GEO_POINT

Fields of the following types cannot be set as attribute fields:

TEXT and SHORT_TEXT
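
The following sample query is a minimal sketch of how index fields and attribute fields are typically used together. The field names title_index and price are assumptions for illustration: the index field is searched in the query clause, and the attribute field is referenced in the filter clause. Attribute fields can be referenced in the sort and aggregate clauses in the same way.

# Searches the title_index index field and filters the results by the price attribute field.
query=title_index:'OpenSearch'&&filter=price>100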

Composite index

A composite index is created on multiple fields of text types. The terms of all these fields are built into one index, so a search on a composite index can match search terms across the fields. This differs from a search that uses multiple single-field indexes and the OR logical operator.

In the following example, two indexes and a composite index are created for an application:

title_index: the index on the title field.

body_index: the index on the body field.

union_index: the composite index on the title and body fields.

Content of a document in the application:

id:123456,title:Open,body:Search

The following code provides two search examples:

# When two indexes and the OR logical operator are used, the document cannot be retrieved.
query=title_index:'OpenSearch' OR body_index:'OpenSearch'

# When the composite index is used, the document can be retrieved.
query=union_index:'OpenSearch' 

Note: The fields in a composite index must be of the same type and cannot be mixed. For example, you cannot create a composite index that includes both SHORT_TEXT and TEXT fields.

Fields in an index table

For more information about fields of the INT or FLOAT type, see Limits. The following part describes different types of analyzers.

Built-in analyzers

Search effects depend on how text is segmented. OpenSearch supports multiple analysis methods. You can select analyzers based on your business scenarios.

The following part describes search effects and scenarios of different analyzers for different types of fields.

Analyzer for keywords

This analyzer does not segment field values and applies to scenarios that require exact match. For example, you can use this analyzer to search for tags, keywords, non-analyzed strings, and numerical values. This analyzer applies to fields of the LITERAL or INT type.

If a field value in a document is "chrysanthemum tea", the document can be retrieved only when you search for "chrysanthemum tea".
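
The following sample query is a minimal sketch, assuming an index named tag_index that is configured with the analyzer for keywords:

# The document is retrieved only when the search term exactly matches the field value.
query=tag_index:'chrysanthemum tea'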

Common analyzer for Chinese text

This analyzer segments Chinese text based on search units and Chinese semantic features, and applies to all industries. The fields to be segmented must be of the TEXT or SHORT_TEXT type.

If a field value in a document is "菊花茶", the document can be retrieved when you search for "菊花茶", "菊花", "茶", or "花茶".

E-commerce industry analyzer for Chinese text

This analyzer applies to the E-commerce industry. The fields to be segmented must be of the TEXT or SHORT_TEXT type.

If a field value in a document is "大宝SOD蜜", the document can be retrieved when you search for "大宝", "sod", "sod蜜", "SOD蜜", or "蜜".

Analyzer for single Chinese characters

This analyzer segments Chinese text into single characters or phrases and applies to the searches of non-semantic Chinese text, such as novelist names and store names. The fields to be segmented must be of the TEXT or SHORT_TEXT type.

If a field value in a document is "菊花茶", the document can be retrieved when you search for "菊花茶", "菊花", "茶", "花茶", "菊", "花", or "菊茶".

Analyzer for fuzzy searches

This analyzer applies only to fields of the SHORT_TEXT type. You can use this analyzer for pinyin searches, prefix and suffix searches of numbers, and searches of single characters or words.

Letters, numbers, and pinyin support prefix and suffix matching. Chinese characters do not.

The fields to be segmented cannot exceed 100 bytes in length. For more information about the description and usage notes of fuzzy searches, see Fuzzy searches.

If a field value in a document is "菊花茶", the document can be retrieved when you search for "菊花茶", "菊花", "茶", "花茶", "菊", "花", "菊茶", "ju", "juhua", "juhuacha", "j", "jh", or "jhc". 
If a field value in a document is a phone number "13812345678", the document can be retrieved when you search for "^138" or "5678$". 
If a field value in a document is "OpenSearch", the document can be retrieved when you search for a single letter or a combination of consecutive letters from the field value.
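
The following sample queries are a minimal sketch of prefix and suffix matching, assuming an index named phone_index that is configured with the analyzer for fuzzy searches:

# Retrieves documents whose field values start with 138.
query=phone_index:'^138'

# Retrieves documents whose field values end with 5678.
query=phone_index:'5678$'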

Analyzer for stemming English words

This analyzer applies to the searches of semantic English text. By default, the analyzer stems each English word to its root form. For example, the analyzer converts a plural noun to a singular noun or converts a verb to its original form. The fields to be segmented must be of the TEXT or SHORT_TEXT type.

If a field value in a document is "english analyzer", the document can be retrieved when you search for "english", "analyz", "analyzer", "analyzers", "analyze", "analyzed", or "analyzing". 
Note: In an analyzer for English text, consecutive Chinese characters are regarded as a single word.

Analyzer that retains roots of English words

This analyzer applies to the searches of English book titles and English names, and segments text by space and punctuation. The fields to be segmented must be of the TEXT or SHORT_TEXT type.

If a field value in a document is "english analyzer", the document can be retrieved when you search for "english" or "analyzer". 
Note: In an analyzer for English text, consecutive Chinese characters are regarded as a single word.

Analyzer for full pinyin spelling

This analyzer applies only to fields of the SHORT_TEXT type. You can use this analyzer to search for Chinese characters in short text by first letters and full pinyin spelling. This analyzer applies to the searches by simple pinyin spelling and full pinyin spelling, such as the searches of personal names and film names. If you search for a Chinese character by full pinyin spelling, you cannot use partial pinyin spelling.

If a field value in a document is "大内密探007", the document can be retrieved when you search for "d", "dn", "dnm", "dnmt", "dnmt007", "da", "danei", "daneimi", or "daneimitan". If you search for "an" or "anei", the document cannot be retrieved.

Analyzer for simple pinyin spelling

This analyzer applies only to fields of the SHORT_TEXT type. You can use this analyzer to search for Chinese characters in short text by first letters. This analyzer applies to the searches by simple pinyin spelling, such as the searches of personal names and film names.

If a field value in a document is "大内密探007", the document can be retrieved when you search for "d", "dn", "dnm", "dnmt", "dnmt0", "dnmt007", "m", "mt", "mt007", or "007".

Analyzer for specific scenarios

This analyzer applies to specific search scenarios that cannot be implemented by the built-in analyzers of OpenSearch. You can fully control search effects by using this analyzer. When you push documents and when you search for documents, use tab characters (\t) to divide field values or search words. Make sure that you segment the text in the same way in both scenarios. Otherwise, documents cannot be retrieved. The fields to be segmented must be of the TEXT or SHORT_TEXT type.

If a field value in a document is "chrysanthemum\tscented tea\thao", the document can be retrieved when you search for "chrysanthemum", "scented tea", "chrysanthemum\tscented tea", "scented tea\thao", "chrysanthemum\thao", or "chrysanthemum\tscented tea\thao".
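
The following sample query is a minimal sketch, assuming an index named custom_index that is configured with this analyzer. The search words must be segmented with tab characters in the same way as the pushed field value:

# Retrieves the document because the tab-separated search words match segments of the field value.
query=custom_index:'chrysanthemum\tscented tea'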

Analyzer for searches by range

This analyzer applies to searches by time range and numerical range. The fields to be segmented must be of the INT or TIMESTAMP type.

Sample QUERY clause: query=default:'OpenSearch' AND index:[number1,number2]
// index indicates the name of the index for which the analyzer for searches by range is configured.

For more information about the syntax of searches by range, see Range queries.

Analyzer for searches by the range of geographical locations

This analyzer applies to the searches by the range of geographical locations. The fields to be segmented must be of the GEO_POINT type.

Sample QUERY clause: query=spatial_index:'circle(116.5806 39.99624, 1000)'
// Queries documents whose locations fall within a circle that is centered at the specified point and has the specified radius.

For more information about the syntax of searches by the range of geographical locations, see Range queries.

Analyzer for IT-related text

This analyzer is industry-specific and applies to technical text from the IT industry. Compared with the common analyzer, this analyzer generates different results when it segments IT-related text. The fields to be segmented must be of the TEXT or SHORT_TEXT type.

Original text: usage notes on c++ array
Common analyzer: usage notes on c ++ array
Analyzer for IT-related text: usage notes on c++ array

Custom analyzer

You can create a custom analyzer based on an industry-specific analyzer, such as a common analyzer, an e-commerce analyzer, or a personal name analyzer, together with custom intervention entries. For more information, see Custom analyzer. The fields to be segmented must be of the TEXT or SHORT_TEXT type.

Scenarios

  • If you want to search for semantic Chinese text, we recommend that you use an analyzer for Chinese text.

  • If you want to search for short text or non-semantic Chinese text, we recommend that you use the analyzer for single Chinese characters to increase the number of retrieved documents. Non-semantic Chinese text means that the order of the characters in the text does not matter.

  • If you want to search for data by pinyin, we recommend that you use the analyzer for fuzzy searches.

  • If you want to search for English text, we recommend that you use the analyzer for stemming English words.

  • In specific scenarios, better search results can be achieved by using the common analyzer for Chinese text together with the analyzer for single Chinese characters. For example, if you use the query=title_index:'菊花茶' OR sws_title_index:'菊花茶' clause together with the fine sort expression text_relevance(title)*5+field_proximity(sws_title), documents that contain "xx菊xx花xx茶xx" can be retrieved, and documents that contain "菊花茶" are ranked first.

Usage notes

  • If you configure search result summaries for fields of the TEXT type, phrases that are matched by extended search units, such as "花茶" in the preceding example, are not highlighted in red.

  • The analyzer for single Chinese characters considers a number or an English word as a character. If a field value in a document is "hello world", the document can be retrieved when you search for "hello" but cannot be retrieved when you search for "he". If you want to retrieve documents by letters in English words, use the analyzer for fuzzy searches.

  • By default, the primary key of the primary table that is defined in the application schema is set as an index field, and the field name is "id". You cannot change the default setting.
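
The following sample query is a minimal sketch of a search by primary key that uses the default id index. The primary key value 123456 is an assumption for illustration:

# Retrieves the document whose primary key is 123456.
query=id:'123456'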