Elasticsearch: Use the logstash-input-maxcompute plug-in

Last Updated: Jul 18, 2023

The logstash-input-maxcompute plug-in allows you to read data from the offline tables of MaxCompute.

Prerequisites

  • The logstash-input-maxcompute plug-in is installed.

    For more information, see Install and remove a plug-in.

  • Alibaba Cloud MaxCompute is activated, a project is created, a table is created for the project, and data is imported to the table.

    For more information, see Prepare and Getting Started.

Use logstash-input-maxcompute

After the prerequisites are met, create a pipeline by following the instructions in Use configuration files to manage pipelines. When you create the pipeline, configure its parameters as described in the Parameters section. Then save the settings and deploy the pipeline. Once the pipeline is deployed, Logstash reads data from MaxCompute and transfers it to the destination that is specified in the output section.

The following code provides a pipeline configuration example. For more information about the parameters, see Parameters.

input {
    maxcompute {
        access_id => "Your accessId"
        access_key => "Your accessKey"
        endpoint => "maxcompute service endpoint"
        project_name => "Your project"
        table_name => "Your table name"
        partition => "pt='p1',dt='d1'"
        thread_num => 1
        dirty_data_file => "/ssd/1/<Logstash cluster ID>/logstash/data/XXXXX.txt"
    }
}

output {
    stdout {
        codec => rubydebug
    }
}
Important
  • By default, Alibaba Cloud Logstash can transmit data only within the same virtual private cloud (VPC). If the source data is on the Internet, configure a Network Address Translation (NAT) gateway so that your Logstash cluster can access the Internet. For more information, see Configure a NAT gateway for data transmission over the Internet.

  • logstash-input-maxcompute performs full reads: it reads all of the data in the specified MaxCompute table and partition.
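
The pipeline configuration example in this section prints events to stdout, which is useful for checking that data is being read. To send the data to an Elasticsearch cluster instead, the output section could be replaced with something like the following minimal sketch. It assumes that the standard elasticsearch output plug-in is available on the Logstash cluster. The endpoint, index name, and credentials are placeholders chosen for illustration and do not come from this topic.

```
output {
    elasticsearch {
        # Placeholder endpoint; replace with the address of your own cluster.
        hosts => ["http://<Elasticsearch endpoint>:9200"]
        # Hypothetical index name used only for this sketch.
        index => "maxcompute-demo"
        # Placeholder credentials.
        user => "Your username"
        password => "Your password"
    }
}
```

The input section stays the same as in the example. If the Elasticsearch cluster is not in the same VPC as the Logstash cluster, the network note above applies to the destination as well.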

Parameters

The following table describes the parameters supported by logstash-input-maxcompute.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| endpoint | string | Yes | The endpoint that is used to access MaxCompute. For more information, see Endpoints in different regions (Internet). |
| access_id | string | Yes | The AccessKey ID of your Alibaba Cloud account. |
| access_key | string | Yes | The AccessKey secret of your Alibaba Cloud account. |
| project_name | string | Yes | The name of the MaxCompute project. |
| table_name | string | Yes | The name of the MaxCompute table. |
| partition | string | Yes | The partition to read, expressed as partition field values. The MaxCompute table is partitioned by these fields. Example: sale_date='201911' and region='hangzhou'. The pipeline configuration example in this topic writes multiple fields in a comma-separated form, such as pt='p1',dt='d1'. |
| thread_num | number | Yes | The number of threads. Default value: 1. |
| retry_interval | number | No | The interval between retries. Unit: seconds. |
| dirty_data_file | string | Yes | The path of the file that records logs about processing failures. Note: Set the path to /ssd/1/<Logstash cluster ID>/logstash/data/. |
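
The pipeline configuration example earlier in this topic does not set the optional retry_interval parameter. The following sketch shows where it would go in the input section. The value of 5 seconds is an assumption chosen for illustration, not a recommended or documented default, and all other values are the same placeholders as in the earlier example.

```
input {
    maxcompute {
        access_id => "Your accessId"
        access_key => "Your accessKey"
        endpoint => "maxcompute service endpoint"
        project_name => "Your project"
        table_name => "Your table name"
        partition => "pt='p1',dt='d1'"
        thread_num => 1
        # Optional. Interval between retries, in seconds (illustrative value).
        retry_interval => 5
        # The file must be under /ssd/1/<Logstash cluster ID>/logstash/data/.
        dirty_data_file => "/ssd/1/<Logstash cluster ID>/logstash/data/XXXXX.txt"
    }
}
```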