Application schema
Data that is pushed to an Industry Algorithm Edition instance is first stored in offline data tables. To simplify the data push process, you can define multiple tables and use data processing plugins. You must specify the foreign key fields for the tables. After data processing is complete, the tables are joined to form an index table. This index table defines the search attributes that the engine uses to build indexes and run queries.
Data table fields
Data tables are used for data import. The requirements for field types vary based on the data processing plugin. For more information about the value ranges of fields, see the "Limits on fields" section in Limits. If a value is outside the specified range, it overflows or is truncated. Therefore, ensure that you select the correct field type.
Type | Description |
INT | 64-bit integer. |
INT_ARRAY | 64-bit integer array. |
FLOAT | Single-precision floating-point number. |
FLOAT_ARRAY | Single-precision floating-point number array. |
DOUBLE | Double-precision floating-point number. |
DOUBLE_ARRAY | Double-precision floating-point number array. |
LITERAL | String literal. Supports only exact match. |
LITERAL_ARRAY | String literal array. A single element supports only exact match. |
SHORT_TEXT | Short text. The length cannot exceed 100 bytes. Supports several tokenization methods. |
TEXT | Long text. Supports several tokenization methods. |
TIMESTAMP | 64-bit unsigned integer. Timestamp data. |
GEO_POINT | String literal. A longitude and latitude field in the format: "longitude latitude". |
 | Represents a JSON object array. It uses flattened storage and loses object boundaries. |
 | Represents a JSON object array. It uses independent storage for primary and secondary documents and preserves object integrity. |
Notes on reserved words:
The following reserved words cannot be used as field names: ['service_id', 'ops_app_name', 'inter_timestamp', 'index_name', 'pk', 'ops_version', 'ha_reserved_timestamp', 'summary'].
Notes on the ARRAY type:
If you create an application field with the ARRAY type, you can map the field to a string-type field, such as varchar or string, from the data source during field mapping. You can then use a data processing plugin to parse the data source field. For more information, see Data processing plugins.
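As a rough illustration of what such a split step produces, the sketch below parses a delimited string from a source field into an INT_ARRAY value. The comma delimiter and method name are assumptions for this example; the actual MultiValueSpliter plugin and its delimiter are configured in the console.

```java
// Illustrative sketch only: parse a delimited string-type source field
// into an INT_ARRAY value. The delimiter here is an assumption; the
// real MultiValueSpliter plugin is configured in the console.
public class MultiValueSplit {
    public static long[] toIntArray(String raw, String delimiter) {
        String[] parts = raw.split(delimiter);
        long[] values = new long[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // Each element becomes one 64-bit integer in the array field.
            values[i] = Long.parseLong(parts[i].trim());
        }
        return values;
    }

    public static void main(String[] args) {
        long[] ids = toIntArray("1001, 1002, 1003", ",");
        System.out.println(java.util.Arrays.toString(ids)); // [1001, 1002, 1003]
    }
}
```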
If you push a field of an ARRAY type using an API or SDK, push it as an array, not as a string. For example: String[] literal_array = {"Alibaba Cloud","OpenSearch"};
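The sketch below shows this distinction when building a document to push. The field names and document structure are illustrative, not the actual SDK request types; the point is that the ARRAY field is set to a real array, not to a joined string.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of building a document for a push request.
// Field names are assumptions; the key point is the value type.
public class ArrayFieldPush {
    public static Map<String, Object> buildDoc() {
        Map<String, Object> doc = new HashMap<>();
        doc.put("id", 1L);
        // Correct: push the LITERAL_ARRAY field as an actual array.
        doc.put("literal_array", new String[] {"Alibaba Cloud", "OpenSearch"});
        // Wrong (do not do this):
        // doc.put("literal_array", "Alibaba Cloud,OpenSearch");
        return doc;
    }

    public static void main(String[] args) {
        System.out.println(buildDoc().get("literal_array") instanceof String[]); // true
    }
}
```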
Notes on the timestamp field:
You can map fields of the INT and TIMESTAMP types to a datetime or timestamp field in a data source. The values are automatically converted to milliseconds. During a search, you can use a range query to filter and retrieve results based on a time interval.
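The millisecond conversion described above can be sketched as follows. The service performs this conversion automatically during import; this example, which assumes UTC, only illustrates the resulting values and how they line up for a time-interval range query.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Sketch of the datetime-to-milliseconds conversion, assuming UTC.
// The service does this automatically; this only shows the values.
public class TimestampMillis {
    public static long toMillis(LocalDateTime dt) {
        return dt.toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    public static void main(String[] args) {
        long start = toMillis(LocalDateTime.of(2023, 1, 1, 0, 0));
        long end   = toMillis(LocalDateTime.of(2023, 1, 2, 0, 0));
        // A range query over [start, end) then covers exactly one day.
        System.out.println(end - start); // 86400000
    }
}
```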
Supported data source field types
Data source | Supported field types |
RDS | TINYINT,SMALLINT,INTEGER,BIGINT,FLOAT,REAL,DOUBLE,NUMERIC,DECIMAL,TIME,DATE,TIMESTAMP,VARCHAR |
PolarDB | TINYINT,SMALLINT,INTEGER,BIGINT,FLOAT,REAL,DOUBLE,NUMERIC,DECIMAL,TIME,DATE,TIMESTAMP,VARCHAR |
MaxCompute (formerly known as ODPS) | BIGINT,DOUBLE,BOOLEAN,DATETIME,STRING,DECIMAL,MAP,ARRAY,TINYINT,SMALLINT,INT,FLOAT,CHAR,VARCHAR,DATE,TIMESTAMP,BINARY,INTERVAL_DAY_TIME,INTERVAL_YEAR_MONTH,STRUCT |
Mappings between the field types of Industry Algorithm Edition tables and database tables
Industry Algorithm Edition table | RDS table | PolarDB table | MaxCompute table |
INT | BIGINT,TINYINT,SMALLINT,INTEGER | BIGINT,TINYINT,SMALLINT,INTEGER | BIGINT,TINYINT,SMALLINT,INT |
INT_ARRAY | String types, such as VARCHAR and STRING. You must use the MultiValueSpliter data processing plugin to transform the data. | String types, such as VARCHAR and STRING. You must use the MultiValueSpliter data processing plugin to transform the data. | String types, such as VARCHAR and STRING. You must use the MultiValueSpliter data processing plugin to transform the data. |
FLOAT | FLOAT,NUMERIC,DECIMAL | FLOAT,NUMERIC,DECIMAL | FLOAT,DECIMAL |
FLOAT_ARRAY | String types, such as VARCHAR and STRING. You must use the MultiValueSpliter data processing plugin to transform the data. | String types, such as VARCHAR and STRING. You must use the MultiValueSpliter data processing plugin to transform the data. | String types, such as VARCHAR and STRING. You must use the MultiValueSpliter data processing plugin to transform the data. |
DOUBLE | DOUBLE,NUMERIC,DECIMAL | DOUBLE,NUMERIC,DECIMAL | DOUBLE,DECIMAL |
DOUBLE_ARRAY | String types, such as VARCHAR. You must use the MultiValueSpliter data processing plugin to transform the data. | String types, such as VARCHAR. You must use the MultiValueSpliter data processing plugin to transform the data. | String types, such as VARCHAR and STRING. You must use the MultiValueSpliter data processing plugin to transform the data. |
LITERAL | String types, such as VARCHAR. | String types, such as VARCHAR. | String types, such as VARCHAR and STRING. |
LITERAL_ARRAY | String types, such as VARCHAR. You must use the MultiValueSpliter data processing plugin to transform the data. | String types, such as VARCHAR. You must use the MultiValueSpliter data processing plugin to transform the data. | String types, such as VARCHAR and STRING. You must use the MultiValueSpliter data processing plugin to transform the data. |
SHORT_TEXT | String types, such as VARCHAR. | String types, such as VARCHAR. | String types, such as VARCHAR and STRING. |
TEXT | String types, such as VARCHAR. | String types, such as VARCHAR. | String types, such as VARCHAR and STRING. |
TIMESTAMP | datetime or timestamp type. | datetime or timestamp type. | datetime or timestamp type. |
GEO_POINT | String types, such as VARCHAR. | String types, such as VARCHAR. | String types, such as VARCHAR and STRING, in the `lon lat` format. `lon` represents the longitude and `lat` represents the latitude. Both values must be of the double type and are separated by a space. The value range for `lon` is [-180, 180], and the value range for `lat` is [-90, 90]. |
Note:
If a data source field is of the FLOAT or DOUBLE type, change its type to DECIMAL. Otherwise, precision issues may occur.
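The GEO_POINT format and value ranges described in the table above can be checked before import with a sketch like the one below. The helper name is illustrative; it simply encodes the `lon lat` format rules: two space-separated double values, with `lon` in [-180, 180] and `lat` in [-90, 90].

```java
// Illustrative check of the GEO_POINT "lon lat" format: two
// space-separated doubles, lon in [-180, 180], lat in [-90, 90].
public class GeoPointCheck {
    public static boolean isValid(String value) {
        String[] parts = value.split(" ");
        if (parts.length != 2) {
            return false;
        }
        try {
            double lon = Double.parseDouble(parts[0]);
            double lat = Double.parseDouble(parts[1]);
            return lon >= -180 && lon <= 180 && lat >= -90 && lat <= 90;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("120.026 30.279")); // true
        System.out.println(isValid("200.0 30.279"));   // false: longitude out of range
    }
}
```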
Methods to create an application schema
Industry Algorithm Edition provides the following methods for creating an application schema (the table schema for Industry Algorithm Edition):
Create from a data source (RDS, MaxCompute, or PolarDB).
Create manually (see the Configure table joins section below).
Configure table joins
This section describes how to configure table joins by manually creating an application schema. This example uses two tables: main (the primary table) and test_tb_1 (the secondary table).
Log on to the console and click Configure:

Select the primary table and set its primary key.

Set the primary key of the secondary table.

Set the association between the primary and secondary tables. This is configured in the primary table.

For more information about the primary-secondary table data associations supported by Industry Algorithm Edition, see Create table joins.
Only fields of the int or literal type can be used as foreign key fields.
When you join a primary table and a secondary table, the join fields must have the same data type. For example, if one field is an int type, the other must also be an int type. If one field is a literal type, the other must also be a literal type.
When you join a secondary table to a primary table, you must use the primary key of the secondary table to join with a field in the primary table. You cannot use a non-primary key field from the secondary table for the join.
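The three join constraints above can be summarized in a small sketch. The method and parameter names are illustrative; the console enforces these rules during configuration, so this only restates them as code.

```java
// Illustrative restatement of the join constraints: the foreign key
// must be INT or LITERAL, both sides must share the same type, and the
// secondary-table side must be that table's primary key.
public class JoinCheck {
    public static boolean canJoin(String primaryFieldType,
                                  String secondaryFieldType,
                                  boolean secondaryFieldIsPrimaryKey) {
        boolean allowedType = primaryFieldType.equals("INT")
                || primaryFieldType.equals("LITERAL");
        return allowedType
                && primaryFieldType.equals(secondaryFieldType)
                && secondaryFieldIsPrimaryKey;
    }

    public static void main(String[] args) {
        System.out.println(canJoin("INT", "INT", true));      // true
        System.out.println(canJoin("INT", "LITERAL", true));  // false: types differ
        System.out.println(canJoin("INT", "INT", false));     // false: not the secondary table's primary key
    }
}
```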