DataWorks: OpenSearch data source

Last Updated: Nov 14, 2025

You can use the OpenSearch Writer plugin in DataWorks Data Integration to write data to OpenSearch. This topic describes how to write data to OpenSearch in offline mode.

Supported versions

  • Version 3 uses a second-party package. The pom dependency is com.aliyun.opensearch:aliyun-sdk-opensearch:2.1.3.

  • To use the OpenSearch Writer plugin, you must have JDK 1.6-32 or a later version. You can run the java -version command to check your Java version number.

  • The following commercial editions of Alibaba Cloud OpenSearch are supported: Industry Algorithm Edition, LLM-based AI Chat Edition, High-performance Search Edition, Vector Search Edition, and Retrieval Engine Edition.

Limitations

  • OpenSearch Writer supports serverless resource groups (recommended) and exclusive resource groups for Data Integration but does not support custom resource groups.

  • Columns in OpenSearch are unordered. Therefore, OpenSearch Writer requires you to specify the order of columns for writing data. If you specify fewer columns than the number of columns in the destination OpenSearch table, the unspecified columns are set to their default values or null.

    For example, if an OpenSearch table contains columns a, b, and c, and you want to import data into columns b and c, you can set the "column":["c","b"] parameter. This configuration imports the first and second columns from the reader into columns c and b of the OpenSearch table, respectively. Column a is set to its default value or null.

  • You can write offline data to OpenSearch only in code editor mode.
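The positional column mapping described in the second limitation above can be sketched in Python. This is a minimal illustration, not the plugin's implementation: `map_record` and its arguments are hypothetical names, and `None` stands in for "default value or null".

```python
from typing import Any, Dict, List, Optional, Sequence


def map_record(
    reader_row: Sequence[Any],
    writer_columns: List[str],
    table_columns: List[str],
) -> Dict[str, Optional[Any]]:
    """Pair reader values with the configured writer columns by position."""
    # Every destination column starts out unset (None models default/null).
    doc: Dict[str, Optional[Any]] = {name: None for name in table_columns}
    # The i-th reader value goes to the i-th name in the "column" list.
    for name, value in zip(writer_columns, reader_row):
        doc[name] = value
    return doc


# Table has columns a, b, c; the writer is configured with "column": ["c", "b"].
doc = map_record(["first", "second"], ["c", "b"], ["a", "b", "c"])
print(doc)  # {'a': None, 'b': 'second', 'c': 'first'}
```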

Supported field types

OpenSearch Writer supports most OpenSearch data types. The following table lists the supported data type mappings.

| Category | OpenSearch data type |
| --- | --- |
| Integer | INT |
| Floating-point | DOUBLE and FLOAT |
| String | TEXT, LITERAL, and SHORT_TEXT |
| Date and time | INT |
| Boolean | LITERAL |
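As a hedged illustration of the date/time and Boolean mappings, the sketch below converts a Python `datetime` to an integer and a `bool` to a string before the values are handed to the writer. The specific encodings (epoch seconds in UTC, and the strings "true" and "false") are assumptions made for this example. The mapping table does not specify them, so check what your OpenSearch schema expects.

```python
from datetime import datetime, timezone
from typing import Any


def to_opensearch_value(value: Any) -> Any:
    """Convert date/time and Boolean values to INT- and LITERAL-compatible values."""
    # bool must be tested before any int handling: bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"  # assumed LITERAL encoding
    if isinstance(value, datetime):
        return int(value.timestamp())  # assumed INT encoding: epoch seconds
    return value  # integers, floats, and strings pass through unchanged


one_day = datetime(1970, 1, 2, tzinfo=timezone.utc)
print(to_opensearch_value(one_day))  # 86400
print(to_opensearch_value(True))     # true
```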

Develop a data synchronization task

For the entry point and the procedure for configuring a synchronization task, see the following configuration guides.

FAQ

Handle column configuration errors

To ensure data reliability, OpenSearch Writer validates the number of columns. The writer reports an error if you try to write more columns than exist in the destination table. For example, if an OpenSearch table has columns a, b, and c, OpenSearch Writer reports an error if you try to write more than three columns.
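The column-count check can be pictured with a short Python sketch. It only mirrors the rule stated above; `check_column_count` is a hypothetical helper, and the real plugin's error message and exception type are not shown in this topic.

```python
from typing import List


def check_column_count(writer_columns: List[str], table_columns: List[str]) -> None:
    """Raise an error if more columns are configured than the table contains."""
    if len(writer_columns) > len(table_columns):
        raise ValueError(
            f"{len(writer_columns)} columns configured, "
            f"but the destination table has only {len(table_columns)}"
        )


table_columns = ["a", "b", "c"]
check_column_count(["c", "b"], table_columns)  # fewer columns: accepted

try:
    check_column_count(["a", "b", "c", "d"], table_columns)
except ValueError as err:
    print(f"configuration rejected: {err}")
```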

Notes on table configuration

OpenSearch Writer can write data to only one table at a time.

Task reruns and failover

When a task is rerun, existing data is overwritten based on the document ID. Therefore, the columns that you write to OpenSearch must include an ID column, which serves as the unique identifier for a row. Data with a matching ID is overwritten.
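To see why the ID column makes reruns safe, consider the following minimal Python sketch. A plain dictionary stands in for the OpenSearch table, and `write_documents` is a hypothetical helper, not part of the plugin. Writing the same ID twice replaces the earlier document instead of duplicating it.

```python
from typing import Any, Dict, List


def write_documents(
    table: Dict[Any, Dict[str, Any]],
    rows: List[Dict[str, Any]],
    id_column: str = "id",
) -> None:
    """Store each row under its ID; a matching ID overwrites the old row."""
    for row in rows:
        if id_column not in row:
            raise KeyError(f"row is missing the '{id_column}' column: {row}")
        table[row[id_column]] = dict(row)


table: Dict[Any, Dict[str, Any]] = {}
write_documents(table, [{"id": 1, "title": "old"}, {"id": 2, "title": "kept"}])
# Rerun: the row with id 1 is written again with new content.
write_documents(table, [{"id": 1, "title": "new"}])
print(table)  # {1: {'id': 1, 'title': 'new'}, 2: {'id': 2, 'title': 'kept'}}
```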

Appendix: Code sample and parameters

Configure a batch synchronization task by using the code editor

To configure a batch synchronization task in the code editor, set the relevant parameters in the script according to the unified script format requirements. For more information, see Configuration in the code editor. The following sections describe the data source parameters that you configure in the script.

Code sample for Writer (Industry Algorithm Edition, LLM-based AI Chat Edition, and High-performance Search Edition)

{
    "type": "job",
    "version": "1.0",
    "configuration": {
        "reader": {},
        "writer": {
            "plugin": "opensearch",
            "parameter": {
                "accessId": "*********",
                "accessKey": "********",
                "host": "http://yyyy.aliyuncs.com",
                "endpoint": "http://yyyy.aliyuncs.com",
                "indexName": "datax_xxx",
                "table": "datax_yyy",
                "column": [
                    "appkey",
                    "id",
                    "title",
                    "gmt_create",
                    "pic_default"
                ],
                "batchSize": 500,
                "writeMode": "add",
                "version": "v2",
                "ignoreWriteError": false
            }
        }
    }
}
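Before submitting a script, it can help to confirm that it is valid JSON and that every parameter marked as required in the parameter table of this section is present. The Python sketch below is one way to do that locally. `missing_parameters` is a hypothetical helper, and the placeholder values in `script_text` are invented for the example.

```python
import json
from typing import List

# Parameters listed with "Required: Yes" in the parameter table.
REQUIRED = (
    "accessId", "accessKey", "host", "endpoint",
    "indexName", "table", "column", "writeMode",
)


def missing_parameters(script_text: str) -> List[str]:
    """Return the required writer parameters that the script does not set."""
    # json.loads raises json.JSONDecodeError on invalid JSON,
    # for example when a string value is left unquoted.
    script = json.loads(script_text)
    parameter = script["configuration"]["writer"]["parameter"]
    return [name for name in REQUIRED if name not in parameter]


script_text = """
{
    "type": "job",
    "version": "1.0",
    "configuration": {
        "reader": {},
        "writer": {
            "plugin": "opensearch",
            "parameter": {
                "accessId": "placeholder-id",
                "accessKey": "placeholder-secret",
                "host": "http://yyyy.aliyuncs.com",
                "endpoint": "http://yyyy.aliyuncs.com",
                "indexName": "datax_xxx",
                "table": "datax_yyy",
                "column": ["id", "title"],
                "writeMode": "add"
            }
        }
    }
}
"""

print(missing_parameters(script_text))  # []
```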

Parameters for Writer (Industry Algorithm Edition, LLM-based AI Chat Edition, and High-performance Search Edition)

| Parameter | Description | Required | Default value |
| --- | --- | --- | --- |
| accessId | The AccessKey ID of your AccessKey pair. | Yes | N/A |
| accessKey | The AccessKey secret of your AccessKey pair. It functions like a logon password. | Yes | N/A |
| host | The traffic domain name of OpenSearch. To obtain the domain name, log on to the OpenSearch console and go to the instance details page. | Yes | N/A |
| endpoint | The control endpoint of OpenSearch. You can obtain the endpoint from the official website of the corresponding OpenSearch edition. For example, for the Industry Algorithm Edition, see Service endpoints. | Yes | N/A |
| indexName | The name of the OpenSearch project. | Yes | N/A |
| table | The name of the destination table. You can specify only one table because DataX does not support writing data to multiple tables at the same time. | Yes | N/A |
| column | The columns to which you want to write data. To write data to all columns, set this parameter to "column":["*"]. To write data to specific columns, specify the column names, such as "column":["id","name"]. OpenSearch supports column filtering and reordering. For example, a table has columns a, b, and c. If you want to synchronize data only to columns c and b, you can set this parameter to ["c","b"]. During the import, column a is automatically set to null. | Yes | N/A |
| batchSize | The number of data records to write in each batch. OpenSearch performs batch writes. The main strength of OpenSearch is in queries, and its write throughput in transactions per second (TPS) is not high. Set this parameter based on the resources allocated to your account. Typically, a single data record is smaller than 1 MB, and a single batch write is smaller than 2 MB. | This parameter is required for partitioned tables. Do not specify this parameter for non-partitioned tables. | 300 |
| writeMode | The write mode. Set this parameter to "add" or "update" to ensure write idempotence. "add": When a write failure occurs and the task is rerun, OpenSearch Writer clears the data and imports new data. This is an atomic operation. "update": The data is inserted as an update. This is an atomic operation. Note: Batch inserts in OpenSearch are not atomic operations. Some records may be successfully inserted while others fail, so choose the writeMode value carefully. The update operation is not supported in Version 3. | Yes | N/A |
| ignoreWriteError | Specifies whether to ignore write failures for the current batch. Example: "ignoreWriteError":true. OpenSearch performs batch writes. If you ignore the failures, other write operations continue. If you do not ignore the failures, the current task stops and returns an error. We recommend that you use the default value. | No | false |
| version | The version of OpenSearch, such as "version":"v3". We recommend that you use Version 3 because the push operation has many limits in Version 2. | No | v2 |
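The way batchSize and ignoreWriteError interact can be sketched in Python. This is an illustration under stated assumptions rather than the plugin's code: `push_batch` is a hypothetical callable that stands in for the OpenSearch push request and raises an exception when a batch fails.

```python
from typing import Callable, Dict, Iterator, List, Sequence

Record = Dict[str, object]


def chunk(records: Sequence[Record], batch_size: int = 300) -> Iterator[List[Record]]:
    """Split the records into consecutive batches of at most batch_size."""
    for start in range(0, len(records), batch_size):
        yield list(records[start:start + batch_size])


def write_all(
    records: Sequence[Record],
    push_batch: Callable[[List[Record]], None],
    batch_size: int = 300,
    ignore_write_error: bool = False,
) -> int:
    """Push every batch and return the number of batches that failed."""
    failed = 0
    for batch in chunk(records, batch_size):
        try:
            push_batch(batch)
        except Exception:
            if not ignore_write_error:
                raise  # default behavior: the task stops with an error
            failed += 1  # failure ignored: continue with the next batch
    return failed
```

With the default `ignore_write_error=False`, the first failed batch ends the run, which matches the recommended setting. Setting it to `True` trades completeness for progress, so the returned failure count is worth logging.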

Code sample for Writer (Vector Search Edition and Retrieval Engine Edition)

{
  "stepType": "opensearch",
  "parameter": {
    "indexName": "",
    "column": [
      {
        "name": "col3double",
        "type": "DOUBLE"
      },
      {
        "name": "col2vector",
        "type": "MULTI_FLOAT"
      }
    ],
    "datasource": "zm_test_vector_01",
    "batchSize": "500",
    "table": "demotable"
  },
  "name": "Writer",
  "category": "writer"
}

Parameters for Writer (Vector Search Edition and Retrieval Engine Edition)

| Parameter | Description | Required | Default value |
| --- | --- | --- | --- |
| table | The name of the destination table. You can specify only one table because DataX does not support writing data to multiple tables at the same time. | Yes | N/A |
| column | The columns to which you want to write data. To write data to all columns, set this parameter to "column":["*"]. To write data to specific columns, specify the column names, such as "column":["id","name"]. OpenSearch supports column filtering and reordering. For example, a table has columns a, b, and c. If you want to synchronize data only to columns c and b, you can set this parameter to ["c","b"]. During the import, column a is automatically set to null. | Yes | N/A |
| batchSize | The number of data records to write in each batch. OpenSearch performs batch writes. The main strength of OpenSearch is in queries, and its write throughput in transactions per second (TPS) is not high. Set this parameter based on the resources allocated to your account. Typically, a single data record is smaller than 1 MB, and a single batch write is smaller than 2 MB. | This parameter is required for partitioned tables. Do not specify this parameter for non-partitioned tables. | 300 |