
Elasticsearch: Use the logstash-input-maxcompute plug-in

Last Updated: Mar 26, 2026

The logstash-input-maxcompute plug-in reads data from the offline tables of MaxCompute and transfers it to a destination data source.

Important

The plug-in performs a full read every time it runs. Incremental reads are not supported.

Prerequisites

Before you begin, ensure that you have:

  • Created an Alibaba Cloud Logstash cluster.

  • Created a MaxCompute project and a table that contains the data you want to read.

Configure the pipeline

Use a configuration file to define a pipeline that reads from MaxCompute. The following example reads all data from a partitioned MaxCompute table and prints it to stdout for verification:

input {
    maxcompute {
        access_id => "Your accessId"
        access_key => "Your accessKey"
        endpoint => "maxcompute service endpoint"
        project_name => "Your project"
        table_name => "Your table name"
        partition => "pt='p1',dt='d1'"
        thread_num => 1
        dirty_data_file => "/ssd/1/<Logstash cluster ID>/logstash/data/XXXXX.txt"
    }
}

output {
    stdout {
        codec => rubydebug
    }
}

After configuring the parameters, save and deploy the pipeline. For instructions, see Use configuration files to manage pipelines.
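In production, you typically ship the data to a destination such as Elasticsearch instead of printing it to stdout. The following sketch replaces the stdout output with an elasticsearch output; the host, index name, and credentials are placeholders, not values from this topic:

```
output {
    elasticsearch {
        hosts => ["http://<Elasticsearch cluster endpoint>:9200"]
        index => "<your index name>"
        user => "elastic"
        password => "<your password>"
    }
}
```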

Important

By default, Alibaba Cloud Logstash transmits data only within the same virtual private cloud (VPC). If your MaxCompute data source is accessible over the Internet, configure a Network Address Translation (NAT) gateway for your Logstash cluster first. See Configure a NAT gateway for data transmission over the Internet.

Parameters

The following table lists all parameters supported by logstash-input-maxcompute.

Parameter       | Type   | Required | Description
--------------- | ------ | -------- | -----------
endpoint        | string | Yes      | The endpoint used to access MaxCompute.
access_id       | string | Yes      | The AccessKey ID of your Alibaba Cloud account.
access_key      | string | Yes      | The AccessKey secret of your Alibaba Cloud account.
project_name    | string | Yes      | The name of the MaxCompute project.
table_name      | string | Yes      | The name of the MaxCompute table.
partition       | string | Yes      | The partition field that the MaxCompute table is partitioned by.
thread_num      | number | Yes      | The number of threads used to read data. Default value: 1.
dirty_data_file | string | Yes      | The path of the file that records logs about processing failures.
retry_interval  | number | No       | The interval between retries, in seconds.

endpoint

  • Required

  • Type: string

  • Default value: none

The endpoint used to access MaxCompute. For endpoints by region, see Endpoints in different regions (Internet).
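For illustration, regional public endpoints follow a common pattern; the region ID below is a placeholder, so confirm the exact value in the endpoint list referenced above:

```
endpoint => "http://service.<region ID>.maxcompute.aliyun.com/api"
```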

access_id

  • Required

  • Type: string

  • Default value: none

The AccessKey ID of your Alibaba Cloud account.

access_key

  • Required

  • Type: string

  • Default value: none

The AccessKey secret of your Alibaba Cloud account.

project_name

  • Required

  • Type: string

  • Default value: none

The name of the MaxCompute project.

table_name

  • Required

  • Type: string

  • Default value: none

The name of the MaxCompute table.

partition

  • Required

  • Type: string

  • Default value: none

The partition field that the MaxCompute table is partitioned by. Example: sale_date='201911' and region='hangzhou'.
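Expressed in the pipeline configuration, using the comma-separated form shown in the earlier example (the column names here assume a table partitioned by sale_date and region):

```
partition => "sale_date='201911',region='hangzhou'"
```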

thread_num

  • Required

  • Type: number

  • Default value: 1

The number of threads used to read data.

dirty_data_file

  • Required

  • Type: string

  • Default value: none

The path of the file that records logs about processing failures. Set the path to /ssd/1/<Logstash cluster ID>/logstash/data/.

retry_interval

  • Optional

  • Type: number

  • Default value: none

The interval between retries, in seconds.
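For reference, a sketch of how the optional retry_interval parameter fits into the input block alongside the required parameters; all values are placeholders, and the 10-second interval is an assumed example:

```
input {
    maxcompute {
        access_id => "Your accessId"
        access_key => "Your accessKey"
        endpoint => "maxcompute service endpoint"
        project_name => "Your project"
        table_name => "Your table name"
        partition => "pt='p1',dt='d1'"
        thread_num => 1
        dirty_data_file => "/ssd/1/<Logstash cluster ID>/logstash/data/XXXXX.txt"
        retry_interval => 10
    }
}
```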