All Products
Search
Document Center

OpenSearch:Schema of tables for Industry Algorithm Edition

Last Updated:Feb 02, 2023

Application schema

The data that you upload to an instance of Industry Algorithm Edition is first stored in offline data tables. To facilitate data upload, Industry Algorithm Edition allows you to create multiple data tables based on your business requirements and provides data processing plug-ins. If you create multiple data tables, you must specify the fields that are associated with the tables. After the data in multiple data tables is processed, the tables are joined to form an index table. The index table defines search attributes and can be used by the search engine to build indexes and search for data.

Fields in a data table

You can use data tables to import data to OpenSearch. Each data processing plug-in can process only specific types of fields. For more information about the limits on field values, see the "Limits on fields" section of the Limits topic. If a field value is not in the specified value range, an overflow error occurs or the value is truncated. You must select correct field types.

Field type

Description

INT

64-bit integer.

INT_ARRAY

64-bit integer array.

FLOAT

Floating-point number.

FLOAT_ARRAY

Floating-point number array.

DOUBLE

Floating-point number.

DOUBLE_ARRAY

Floating-point number array.

LITERAL

String constant. A string constant supports only exact match.

LITERAL_ARRAY

String constant array. A single element in a string constant array supports only exact match.

SHORT_TEXT

Short text. A value of this type cannot exceed 100 bytes in length. This type of field supports multiple analysis methods.

TEXT

Long text. This type of field supports multiple analysis methods.

TIMESTAMP

64-bit unsigned integer. This type of field indicates a timestamp.

GEO_POINT

String constant. This type of field indicates a pair of latitude and longitude in the "Latitude value Longitude value" format.

Usage notes on reserved fields:

  • The following field names are reserved and cannot be used when you specify the filed names: 'service_id', 'ops_app_name', 'inter_timestamp', 'index_name', 'pk', 'ops_version', 'ha_reserved_timestamp', and 'summary'.

Usage notes on fields of array types:

  • After you create a field of an array type in an application, you can map the field to a field of the STRING type such as VARCHAR in a data source. In addition, you can use a data processing plug-in to process the field. For more information, see Use data processing plug-ins.

  • If you use the OpenSearch Industry Algorithm Edition API or SDKs to upload a field of an array type, upload the field as an array instead of a string. Example: String[] literal_array = {"Alibaba Cloud","OpenSearch"};

Usage notes on fields of timestamp types:

  • Fields of the INT and TIMESTAMP types can be mapped to a field of the DATETIME or TIMESTAMP type in a data source. The field values are automatically converted to the number of milliseconds. You can use the range function to retrieve search results by time range. For more information, see Searches by range.

Supported field types in a data source

Data source

Supported field type

RDS

TINYINT,SMALLINT,INTEGER,BIGINT,FLOAT,REAL,DOUBLE,NUMERIC,DECIMAL,TIME,DATE,TIMESTAMP,VARCHAR

PolarDB

TINYINT,SMALLINT,INTEGER,BIGINT,FLOAT,REAL,DOUBLE,NUMERIC,DECIMAL,TIME,DATE,TIMESTAMP,VARCHAR

MaxCompute

BIGINT,DOUBLE,BOOLEAN,DATETIME,STRING,DECIMAL,MAP,ARRAY,TINYINT,SMALLINT,INT,FLOAT,CHAR,VARCHAR,DATE,TIMESTAMP,BINARY,INTERVAL_DAY_TIME,INTERVAL_YEAR_MONTH,STRUCT

Mappings between field types of tables for Industry Algorithm Edition and those of tables in a database

Table for Industry Algorithm Edition

RDS table

PolarDB table

MaxCompute table

INT

BIGINT,TINYINT,SMALLINT,INTEGER

BIGINT,TINYINT,SMALLINT,INTEGER

BIGINT,TINYINT,SMALLINT,INT

INT_ARRAY

STRING types such as VARCHAR. This field type must be converted by using the data processing plug-in MultiValueSpliter.

STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter.

STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter.

FLOAT

FLOAT,NUMERIC,DECIMAL

FLOAT,NUMERIC,DECIMAL

FLOAT,DECIMAL

FLOAT_ARRAY

STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter.

STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter.

STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter.

DOUBLE

DOUBLE,NUMERIC,DECIMAL

DOUBLE,NUMERIC,DECIMAL

DOUBLE,DECIMAL

DOUBLE_ARRAY

STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter.

STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter.

STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter.

LITERAL

STRING types such as VARCHAR

STRING types such as VARCHAR

STRING types such as VARCHAR

LITERAL_ARRAY

STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter.

STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter.

STRING types such as VARCHAR. This field type must be converted by the data processing plug-in MultiValueSpliter.

SHORT_TEXT

STRING types such as VARCHAR

STRING types such as VARCHAR

STRING types such as VARCHAR

TEXT

STRING types such as VARCHAR

STRING types such as VARCHAR

STRING types such as VARCHAR

TIMESTAMP

DATETIME or TIMESTAMP

DATETIME or TIMESTAMP

DATETIME or TIMESTAMP

GEO_POINT

STRING types such as VARCHAR

STRING types such as VARCHAR

STRING types such as VARCHAR in the format of "lon lat" lon indicates the longitude and lat indicates the latitude. Both longitude and latitude values must be of the DOUBLE type and be separated by a space. Range of the lon parameter: [-180, 180] and range of the lat parameter: [-90,90].

Note:

  • If fields in a data source are of the FLOAT or DOUBLE type, we recommend that you change the field types to DECIMAL. Otherwise, this may compromise the precision.

Create an application schema

Industry Algorithm Edition provides the following methods to create an application schema, namely, table schema for Industry Algorithm Edition:

  1. Create an application schema by importing a table schema from an ApsaraDB RDS for MySQL, MaxCompute, or PolarDB data source.

  2. Manually create an application schema. For more information, see Join multiple tables.

  3. Create an application schema by using a template.

  4. Create an application schema by uploading a data file.

Join multiple tables

This section describes how to join multiple tables to manually create an application schema. In this example, the primary table main and the secondary table test_tb_1 are associated.

1. Log on to the OpenSearch console, find the desired application instance, and click Configure in the Actions column. image

2. On the page that appears, specify the main table as the primary table and specify a field as the primary key.

image

3. Specify a field as the primary key for the secondary table.

image

4. In the External Tables column of the primary table, select the secondary table to join the primary table and the secondary table.

image

Note:

  • For more information about the association between primary and secondary tables supported by Industry Algorithm Edition, see .

  • Foreign key fields must be of the INT or LITERAL type.

  • When you join the primary table to one or more secondary tables, the fields that are used for the join operation must be of the same type in the primary and secondary tables. For example, if the field used for the join operation in the primary table is of the INT type, the fields used for the join operation in the secondary tables must also be of the INT type. If the field used for the join operation in the primary table is of the LITERAL type, the fields used for the join operation in the secondary tables must also be of the LITERAL type.

  • When you join a secondary table to the primary table, you must use the primary key of the secondary table to join the tables.