
Elasticsearch: Use the logstash-input-sls plug-in

Last Updated: Mar 27, 2026

logstash-input-sls is a built-in input plug-in for Alibaba Cloud Logstash that pulls log data from Simple Log Service (SLS). It is open source and maintained by Alibaba Cloud.

The plug-in handles distributed consumption, checkpointing, and shard allocation automatically, so you can focus on building the pipeline configuration rather than managing the consumer infrastructure.

Key capabilities

  • Distributed consumption: Deploy one pipeline per server across multiple servers and have them consume the same Logstore in parallel. All servers share the same consumer_group and consumer_name, with consumer_name_with_ip set to true to make each consumer uniquely identifiable.

  • High throughput: A single-core CPU can sustain 20 MB/s using the Java consumer group implementation.

  • Checkpoint-based reliability: The plug-in saves the consumption progress on each server. If a server restarts after a failure, consumption resumes from the last checkpoint.

  • Automatic shard rebalancing: Shards are redistributed across active consumers whenever a consumer joins or leaves the group.
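The even-split outcome of shard rebalancing can be illustrated with a minimal round-robin sketch. This is only a model of the result: the real SLS consumer-group protocol negotiates shard leases server-side rather than computing assignments locally.

```python
def allocate_shards(shards, consumers):
    """Round-robin split of shards across consumers (illustrative only;
    the actual SLS consumer-group protocol manages leases server-side)."""
    assignment = {c: [] for c in consumers}
    for i, shard in enumerate(shards):
        assignment[consumers[i % len(consumers)]].append(shard)
    return assignment

shards = list(range(10))
# Two consumers: 5 shards each.
print(allocate_shards(shards, ["consumer_10.0.0.1", "consumer_10.0.0.2"]))
# A third consumer joins: the same shards are redistributed across all three.
print(allocate_shards(shards, ["consumer_10.0.0.1", "consumer_10.0.0.2",
                               "consumer_10.0.0.3"]))
```

When a consumer leaves the group, the same redistribution runs in reverse: its shards are reassigned to the remaining consumers.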

Prerequisites

Before you begin, ensure that you have:

  • An Alibaba Cloud Logstash cluster.

  • A Simple Log Service project and a Logstore that contain the data to consume.

  • An AccessKey pair that has consumer group permissions on the Logstore.

  • An Alibaba Cloud Elasticsearch cluster, if you use Elasticsearch as the output destination.

Configure the pipeline

Create a pipeline configuration file following the instructions in Use configuration files to manage pipelines. After saving and deploying the pipeline, Alibaba Cloud Logstash starts retrieving data from SLS.

The following example pulls log data from an SLS Logstore and writes it to Alibaba Cloud Elasticsearch:

input {
  logservice {
    endpoint => "your project endpoint"
    access_id => "your access id"
    access_key => "your access key"
    project => "your project name"
    logstore => "your logstore name"
    consumer_group => "consumer group name"
    consumer_name => "consumer name"
    position => "end"
    checkpoint_second => 30
    include_meta => true
    consumer_name_with_ip => true
  }
}

output {
  elasticsearch {
    hosts => ["http://es-cn-***.elasticsearch.aliyuncs.com:9200"]
    index => "<your_index>"
    user => "elastic"
    password => "changeme"
  }
}

Parameters

endpoint (String, required)
The VPC endpoint of the SLS project. See Internal Simple Log Service endpoints.

access_id (String, required)
The AccessKey ID with consumer group permissions. See Use consumer groups to consume data.

access_key (String, required)
The AccessKey secret with consumer group permissions. See Use consumer groups to consume data.

project (String, required)
The name of the SLS project.

logstore (String, required)
The name of the Logstore.

consumer_group (String, required)
The name of the consumer group. You can specify a custom name.

consumer_name (String, required)
The consumer name. Must be unique within the consumer group; duplicate names cause undefined behavior.

position (String, required)
The start position for consumption. Valid values: begin (the first log entry ever written), end (the current time), or a timestamp in yyyy-MM-dd HH:mm:ss format.

checkpoint_second (Number, optional, default: 30)
The checkpoint interval in seconds. Recommended range: 10–60. Minimum: 10.

include_meta (Boolean, optional, default: true)
Whether to include log metadata (source, time, tag, and topic) in the input.

consumer_name_with_ip (Boolean, optional, default: true)
Whether to append the server's IP address to the consumer name. Set to true for distributed consumption.
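The effect of consumer_name_with_ip can be sketched as follows. The exact separator and name format the plug-in uses are assumptions here; the point is only that appending the server's IP turns one shared consumer_name into a unique identity per server.

```python
def effective_consumer_name(base_name, ip=None):
    """Illustrative sketch: with consumer_name_with_ip enabled, the plug-in
    derives a per-server name by appending the local IP address (the
    underscore separator is an assumption, not a documented format)."""
    return f"{base_name}_{ip}" if ip else base_name

print(effective_consumer_name("consumer1"))              # consumer1
print(effective_consumer_name("consumer1", "10.0.0.5"))  # consumer1_10.0.0.5
```

Because every server ends up with a distinct name, the consumer group can track each server's checkpoint independently even though all pipelines share the same configuration file.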

Best practices

Distributed consumption

Deploy exactly one pipeline with logstash-input-sls on each server. If multiple pipelines on the same server consume the same Logstore, duplicate data will appear in the output.

For all servers to participate in the same consumer group, configure them with identical consumer_group and consumer_name values, and set consumer_name_with_ip to true. The plug-in appends each server's IP to the consumer name, making every consumer unique and allowing the group to track each server's consumption position separately.

Example: A Logstore has 10 shards, each receiving 1 MB/s, and is consumed by 5 servers with 3 MB/s of capacity each. With one pipeline per server and consumer_name_with_ip set to true, the plug-in allocates 2 shards to each server, so each server processes 2 MB/s, well within its capacity.
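The arithmetic behind this sizing example can be checked directly (a sketch assuming an even shard split and a steady per-shard rate, as in the example above):

```python
# Worked example: 10 shards at 1 MB/s each, 5 servers with 3 MB/s capacity.
shard_count, shard_rate_mbps = 10, 1.0
server_count, server_capacity_mbps = 5, 3.0

shards_per_server = shard_count // server_count          # 10 // 5 = 2
per_server_rate = shards_per_server * shard_rate_mbps    # 2 * 1.0 = 2.0 MB/s

# Each server stays within its 3 MB/s capacity.
assert per_server_rate <= server_capacity_mbps
print(shards_per_server, per_server_rate)  # 2 2.0
```

The same calculation tells you when to add servers: once per_server_rate approaches the per-server capacity, the group needs more consumers (up to one per shard).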

Checkpoint interval

Set checkpoint_second to a value between 10 and 60 seconds. A shorter interval reduces the amount of data re-processed after a server restart, but increases the frequency of checkpoint writes. The default of 30 seconds is appropriate for most workloads.
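The trade-off can be made concrete with a rough upper bound: after a crash, a server re-reads at most everything consumed since its last checkpoint, which at a steady ingest rate is one full interval's worth of data. This is a back-of-the-envelope sketch, not a guarantee from the plug-in.

```python
def max_reprocessed_mb(checkpoint_second, ingest_rate_mbps):
    """Worst-case data re-read after a restart: up to one full checkpoint
    interval at the given ingest rate (rough bound, steady-rate assumption)."""
    return checkpoint_second * ingest_rate_mbps

# Default 30 s interval at 2 MB/s risks re-reading up to 60 MB per server.
print(max_reprocessed_mb(30, 2.0))  # 60.0
# Dropping to the 10 s minimum cuts that to 20 MB, at 3x the checkpoint writes.
print(max_reprocessed_mb(10, 2.0))  # 20.0
```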

Performance benchmark

The following test results show throughput and resource usage on a 4-core Intel Xeon Platinum 8163 @ 2.50 GHz with 8 GB memory running Linux.

Test setup: A Java producer sends log entries (10 key-value pairs, ~500 bytes each) at increasing rates. Logstash consumes from the Logstore using the pipeline configuration below, writing output to Elasticsearch. The test verifies that consumption latency does not increase and consumption speed keeps pace with the ingest rate.

input {
  logservice {
    endpoint => "cn-hangzhou-intranet.log.aliyuncs.com"
    access_id => "***"
    access_key => "***"
    project => "test-project"
    logstore => "logstore1"
    consumer_group => "consumer_group1"
    consumer_name => "consumer1"
    position => "end"
    checkpoint_second => 30
    include_meta => true
    consumer_name_with_ip => true
  }
}
output {
  elasticsearch {
    hosts => ["http://es-cn-***.elasticsearch.aliyuncs.com:9200"]
    index => "myindex"
    user => "elastic"
    password => "changeme"
  }
}

Results:

Traffic (MB/s)    CPU utilization (%)    Memory usage (GB)
2                 11.3                   1.3
4                 21.0                   1.3
8                 41.5                   1.3
16                83.3                   1.3
32                170.3                  1.3

CPU usage scales linearly with traffic: the CPU cost per MB/s stays roughly constant across the whole range. Utilization is reported as a total across the 4 cores, so values above 100% indicate that more than one core is in use. Memory usage remains constant at 1.3 GB across all traffic levels.
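The linearity claim can be verified directly from the table: dividing CPU utilization by traffic yields a per-MB/s cost that stays in a narrow band (about 5.2 to 5.7 percentage points per MB/s).

```python
# (traffic in MB/s, total CPU utilization in %) from the benchmark table above.
results = [(2, 11.3), (4, 21.0), (8, 41.5), (16, 83.3), (32, 170.3)]

# Cost per MB/s is nearly flat, which is what "scales linearly" means here.
for mbps, cpu in results:
    print(f"{mbps:>2} MB/s -> {cpu / mbps:.2f} % per MB/s")
```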

What's next