This topic describes how Open Search Writer works, its features, data types, and parameters, and how to configure it by using the code editor.

Notice Open Search Writer supports only exclusive resource groups for Data Integration, but not the default resource group or custom resource groups. For more information, see Use exclusive resource groups for data integration, Use the default resource group, and Add a custom resource group.

How it works

Open Search Writer inserts data into Open Search or updates existing data in Open Search. It is intended for developers who need to import data to Open Search so that the data can be searched.

Specifically, Open Search Writer uses the search API that is provided by Open Search to import data.

Note
  • Open Search V3 relies on an internal dependency library. Its POM coordinates are com.aliyun.opensearch:aliyun-sdk-opensearch:2.1.3.
  • To use Open Search Writer, you must install JDK 1.6-32 or later. You can run the java -version command to view the JDK version.
  • A sync node that is run on the default resource group may fail to connect to Open Search that is deployed in a virtual private cloud (VPC).

Features

The columns in Open Search are unordered. Open Search Writer writes data in strict accordance with the order of the specified columns. If fewer columns are specified than exist in Open Search, the remaining columns in Open Search are set to the default value or null.

Assume that an Open Search table contains columns a, b, and c, and you only need to write data to columns b and c. You can set the column parameter to ["c","b"]. In this case, Open Search Writer imports the first and second columns of the source data that is obtained from a reader to columns c and b in the Open Search table. Column a in the Open Search table is set to the default value or null.
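The positional mapping in this example can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the writer's actual implementation: the function name map_record is hypothetical, and None stands in for "the default value or null".

```python
from typing import Any, Dict, Optional, Sequence


def map_record(
    column: Sequence[str],
    table_columns: Sequence[str],
    record: Sequence[Any],
) -> Dict[str, Optional[Any]]:
    """Map source fields, by position, onto the configured destination columns."""
    # Every table column starts unset; None stands in for "default value or null".
    row: Dict[str, Optional[Any]] = {name: None for name in table_columns}
    # The i-th source field is written to the i-th configured column.
    for name, value in zip(column, record):
        row[name] = value
    return row


# Table has columns a, b, c; the column parameter is ["c", "b"].
row = map_record(["c", "b"], ["a", "b", "c"], ["first", "second"])
print(row)  # {'a': None, 'b': 'second', 'c': 'first'}
```

The first source field lands in column c and the second in column b, while column a is left unset.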

Additional instructions:
  • Handling of column configuration errors

    To avoid losing the data of redundant columns and ensure high data reliability, Open Search Writer returns an error message if the number of columns to be written is more than that in the destination Open Search table. For example, if an Open Search table contains columns a, b, and c, Open Search Writer returns an error if more than three columns are to be written to the table.

  • Table configuration

    Open Search Writer can write data to only one table at a time.

  • Node rerunning

    After a node is rerun, data is overwritten based on IDs. Therefore, the data written to Open Search must contain an ID column. An ID is a unique identifier of a row in Open Search. The existing data with the same ID as the new data will be overwritten.

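The column-count check and the rerun behavior from the list above can be modeled with a small in-memory sketch. It assumes the ID column is named id and uses a plain dictionary in place of an Open Search table; both function names are hypothetical and this is not how the writer is implemented internally.

```python
from typing import Any, Dict, List, Sequence


def check_column_count(column: Sequence[str], table_columns: Sequence[str]) -> None:
    """Reject a configuration that lists more columns than the table has."""
    if len(column) > len(table_columns):
        raise ValueError(
            f"{len(column)} columns configured, "
            f"but the table has only {len(table_columns)}"
        )


def write_rows(
    index: Dict[Any, Dict[str, Any]],
    rows: List[Dict[str, Any]],
    id_column: str = "id",  # assumed name of the ID column
) -> None:
    """Write rows keyed by ID; a row with an existing ID overwrites the old one."""
    for row in rows:
        if id_column not in row:
            raise KeyError(f"row is missing the ID column {id_column!r}")
        index[row[id_column]] = dict(row)


index: Dict[Any, Dict[str, Any]] = {}
rows = [{"id": 1, "title": "first"}, {"id": 2, "title": "second"}]
write_rows(index, rows)
write_rows(index, rows)  # simulated rerun: same IDs, so nothing is duplicated
print(len(index))  # 2
```

Because rows are keyed by ID, running the same write twice leaves the same two rows, which is why the data must contain an ID column.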

Data types

Open Search Writer supports most Open Search data types. Make sure that your data types are supported.

The following table describes the data types that Open Search Writer supports.
Category | Open Search data type
--- | ---
Integer | INT
Floating point | DOUBLE and FLOAT
String | TEXT, LITERAL, and SHORT_TEXT
Date and time | INT
Boolean | LITERAL
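Because date and time values map to INT and Boolean values map to LITERAL, source values may need to be converted before they are written. The following Python sketch shows one possible conversion. The choices it makes are assumptions that the table above does not specify: dates become epoch milliseconds, naive datetimes are treated as UTC, and Booleans become the strings "true" and "false". The function name is hypothetical.

```python
from datetime import datetime, timezone
from typing import Any


def to_opensearch_value(value: Any) -> Any:
    """Convert a source value to a form that fits the supported data types."""
    # bool must be checked before any numeric handling: bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"  # assumed LITERAL representation
    if isinstance(value, datetime):
        if value.tzinfo is None:
            value = value.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
        return int(value.timestamp() * 1000)  # assumed unit: epoch milliseconds
    # Integers, floats, and strings are passed through unchanged.
    return value


print(to_opensearch_value(True))
print(to_opensearch_value(datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc)))
```

Check the field definitions of your own Open Search application before you rely on any of these representations.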

Parameters

Parameter | Description | Required | Default value
--- | --- | --- | ---
accessId | The AccessKey ID of the account that is used to connect to the Open Search project. | Yes | N/A
accessKey | The AccessKey secret of the account that is used to connect to the Open Search project. | Yes | N/A
host | The endpoint of Open Search. You can view the endpoint in the Alibaba Cloud Management Console. | Yes | N/A
indexName | The name of the Open Search project. | Yes | N/A
table | The name of the table to which data is written. You can specify only one table because Data Integration cannot import data to multiple tables at a time. | Yes | N/A
column | The columns in the destination table to which data is written. To write data to all the columns in the destination table, set the value to an asterisk (*), for example, "column":["*"]. To write data to only specific columns, list those columns and separate them with commas (,), for example, "column":["id","name"]. Open Search Writer can filter columns and change the order of columns. For example, an Open Search table has three columns: a, b, and c. If you want to write data only to columns c and b, set the column parameter to "column":["c","b"]. During data synchronization, column a is automatically set to null. | Yes | N/A
batchSize | The number of data records to write at a time. Data is written to Open Search in batches. Open Search is optimized for queries, and its write throughput in transactions per second (TPS) is generally not high. Set this parameter based on the resources available to the account that is used to connect to Open Search. Generally, the size of a single data record must be less than 1 MB, and the total size of the data records written at a time must be less than 2 MB. | Required only for writing data to a partitioned table | 300
writeMode | The write mode. To ensure the idempotence of write operations, set the writeMode parameter to add or update when you configure Open Search Writer. add: deletes the existing data record and inserts the new data record to Open Search, which is an atomic operation. update: updates the existing data record based on the new data record, which is an atomic operation. Note: Writing multiple data records to Open Search at a time is not an atomic operation, so part of the data may fail to be written. Exercise caution when you set the writeMode parameter. Open Search V3 does not support the update mode. | Yes | N/A
ignoreWriteError | Specifies whether to ignore failed write operations, for example, "ignoreWriteError":true. If multiple data records are written to Open Search at a time, this parameter specifies whether to ignore failed write operations in the current batch. If you set the parameter to true, Open Search Writer continues to perform other write operations. If you set the parameter to false, the sync node ends and an error message is returned. We recommend that you use the default value. | No | false
version | The version of Open Search, for example, "version":"v3". We recommend that you use Open Search V3 because the push operation faces many constraints in Open Search V2. | No | v2
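The interaction between batchSize, the size limits, and ignoreWriteError can be sketched as follows. This is a simplified Python model under stated assumptions, not the writer's code: record size is estimated from the JSON-encoded length, 1 MB and 2 MB are taken as 1,048,576 and 2,097,152 bytes, and push is a hypothetical callable that stands in for the actual write to Open Search.

```python
import json
from typing import Any, Callable, Dict, Iterable, Iterator, List

Record = Dict[str, Any]
MAX_RECORD_BYTES = 1 * 1024 * 1024  # a single record must stay below 1 MB
MAX_BATCH_BYTES = 2 * 1024 * 1024   # a batch must stay below 2 MB


def split_batches(records: Iterable[Record], batch_size: int = 300) -> Iterator[List[Record]]:
    """Group records into batches limited by both count and estimated size."""
    batch: List[Record] = []
    batch_bytes = 0
    for record in records:
        size = len(json.dumps(record).encode("utf-8"))  # assumed size estimate
        if size >= MAX_RECORD_BYTES:
            raise ValueError("record is 1 MB or larger")
        # Close the current batch when adding this record would break a limit.
        if batch and (len(batch) >= batch_size or batch_bytes + size >= MAX_BATCH_BYTES):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(record)
        batch_bytes += size
    if batch:
        yield batch


def write_all(
    batches: Iterable[List[Record]],
    push: Callable[[List[Record]], None],  # hypothetical write function
    ignore_write_error: bool = False,
) -> int:
    """Push each batch; return the number of failed batches that were ignored."""
    failed = 0
    for batch in batches:
        try:
            push(batch)
        except Exception:
            if not ignore_write_error:
                raise  # corresponds to the sync node ending with an error
            failed += 1  # corresponds to skipping the failure and continuing
    return failed


sizes = [len(b) for b in split_batches([{"id": i} for i in range(700)])]
print(sizes)  # [300, 300, 100]
```

With the default batch size of 300, 700 small records are split into three batches. Larger records cause a batch to close earlier, before it reaches the 2 MB limit.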

Configure Open Search Writer by using the code editor

The following example shows how to configure a sync node to write data to Open Search. For more information, see Create a sync node by using the code editor.
{
    "type": "job",
    "version": "1.0",
    "configuration": {
        "reader": {},
        "writer": {
            "plugin": "opensearch",
            "parameter": {
                "accessId": "*********",
                "accessKey": "********",
                "host": "http://yyyy.aliyuncs.com",
                "indexName": "datax_xxx",
                "table": "datax_yyy",
                "column": [
                    "appkey",
                    "id",
                    "title",
                    "gmt_create",
                    "pic_default"
                ],
                "batchSize": 500,
                "writeMode": "add",
                "version":"v2",
                "ignoreWriteError": false
            }
        }
    }
}
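If you generate this configuration from a script, a small helper can catch missing required parameters or an unsupported write mode before the node runs. The following Python sketch is illustrative only: build_writer is a hypothetical helper, the checks mirror the parameter descriptions in this topic, and the credential, host, and name values are placeholders.

```python
import json
from typing import Any, Dict

REQUIRED = ("accessId", "accessKey", "host", "indexName", "table", "column", "writeMode")
DEFAULTS = {"batchSize": 300, "ignoreWriteError": False, "version": "v2"}


def build_writer(parameter: Dict[str, Any]) -> Dict[str, Any]:
    """Validate writer parameters and fill in the documented default values."""
    missing = [key for key in REQUIRED if key not in parameter]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    params = {**DEFAULTS, **parameter}
    if params["writeMode"] not in ("add", "update"):
        raise ValueError("writeMode must be add or update")
    if params["version"] == "v3" and params["writeMode"] == "update":
        raise ValueError("Open Search V3 does not support the update mode")
    return {"plugin": "opensearch", "parameter": params}


writer = build_writer({
    "accessId": "<your-access-key-id>",      # placeholder
    "accessKey": "<your-access-key-secret>",  # placeholder
    "host": "http://yyyy.aliyuncs.com",
    "indexName": "datax_xxx",
    "table": "datax_yyy",
    "column": ["appkey", "id", "title", "gmt_create", "pic_default"],
    "writeMode": "add",
})
job = {"type": "job", "version": "1.0", "configuration": {"reader": {}, "writer": writer}}
print(json.dumps(job, indent=4))
```

Serializing with json.dumps also guarantees that string values such as the write mode are quoted correctly in the resulting JSON.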