Deploy a consumer group application to pull logs from SLS and forward them to your on-premises SIEM through Splunk HEC or Syslog.
Background
SIEM platforms such as Splunk and QRadar typically run in on-premises data centers without public-facing endpoints. After you migrate to the cloud, you need a secure pipeline to ship cloud resource logs to your on-premises SIEM for monitoring, auditing, and threat analysis.
How it works
A consumer group application pulls logs from SLS in real time and forwards them to your SIEM through Splunk HEC or Syslog over TCP/TLS.
Core logic
- Log pulling: A consumer group application pulls data from SLS with built-in concurrency and failover.
  - Concurrency and throughput
    - For higher throughput, run multiple consumer instances in the same group. Each instance must have a unique name, for example one that ends with the process ID.
    - Each shard is processed by one consumer at a time, so maximum concurrency equals the shard count. For example, 10 shards support up to 10 parallel consumers.
    - Under ideal network conditions:
      - A single consumer (~20% of one CPU core) can consume raw logs at 10 MB/s.
      - Ten consumers can process up to 100 MB/s.
  - High availability
    - The consumer group stores each consumer's progress as a server-side checkpoint.
    - If a consumer fails, another instance takes over its shards and resumes from the last checkpoint. Deploy instances on separate machines for failover.
    - Extra instances beyond the shard count act as standbys for immediate failover.
- Data forwarding: The application formats and forwards pulled logs to your on-premises SIEM.
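The concurrency rules above can be turned into a rough sizing check. The following is a minimal sketch, not part of the sample scripts: the function names plan_consumers and unique_consumer_name are hypothetical, and the 10 MB/s per-consumer figure is the ideal-network number quoted above, so real throughput may be lower.

```python
import math
import os
import socket

# Ideal-network figure quoted in this document; treat it as an upper bound.
PER_CONSUMER_MB_S = 10


def plan_consumers(shard_count: int, target_mb_s: float) -> dict:
    """Estimate how many active consumers a target throughput needs.

    Active consumers are capped at the shard count because each shard
    is processed by one consumer at a time.
    """
    needed = max(1, math.ceil(target_mb_s / PER_CONSUMER_MB_S))
    active = min(needed, shard_count)
    capacity = active * PER_CONSUMER_MB_S
    return {
        "active": active,
        "max_throughput_mb_s": capacity,
        "meets_target": capacity >= target_mb_s,
    }


def unique_consumer_name(prefix: str = "sync_data") -> str:
    """Build a per-process consumer name (hypothetical naming scheme)."""
    return f"{prefix}-{socket.gethostname()}-{os.getpid()}"
```

If meets_target is False, the Logstore has too few shards for the target rate under these assumptions, and adding more consumers only adds standbys.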
Prerequisites
- Create a RAM user and grant permissions. The RAM user must have the AliyunLogFullAccess policy.
- Network: The machine running the application must reach both the SLS endpoint and the SIEM.
  - To obtain the endpoint:
    - Log on to the SLS console. In the project list, click the target project.
    - Click the icon to the right of the project name to go to the project overview page.
    - In the Endpoint section, copy the public endpoint. The full endpoint is https:// followed by the public endpoint.
- Python 3 runtime with the SLS Python SDK installed.
  - Install the SLS Python SDK:

        pip install -U aliyun-log-python-sdk

  - Verify the installation:

        pip show aliyun-log-python-sdk

    Expected output:

        Name: aliyun-log-python-sdk
        Version: 0.9.12
        Summary: Aliyun log service Python client SDK
        Home-page: https://github.com/aliyun/aliyun-log-python-sdk
        Author: Aliyun
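The network prerequisite above can be checked before you deploy. The following is a minimal sketch using only the Python standard library; the helper names endpoint_host_port and can_reach are hypothetical, and a successful TCP connection only shows that the port is reachable, not that credentials or TLS settings are correct.

```python
import socket
from urllib.parse import urlparse


def endpoint_host_port(url: str) -> tuple:
    """Split an endpoint URL into (host, port), defaulting to 443 for https."""
    parsed = urlparse(url)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return parsed.hostname, port


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage (values are placeholders, not real endpoints):
#   host, port = endpoint_host_port(os.environ["SLS_ENDPOINT"])
#   print("SLS reachable:", can_reach(host, port))
#   print("SIEM reachable:", can_reach("siem.internal.example", 8088))
```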
Procedure
Step 1: Prepare the application
SLS provides sample scripts for two shipping methods. Select the one that matches your SIEM:
- Splunk HEC: A token-based HTTP collector for sending data to Splunk.
- Syslog: A standard logging protocol compatible with most SIEM systems.
Splunk HEC
Configure the sync_data.py script. It has three main parts:
- main() method: The main program control logic.
- get_option() method: Defines consumption configuration options.
  - Basic configuration: Connection settings for SLS and the consumer group.
  - Advanced consumer group options: Performance-tuning parameters. Do not modify unless necessary.
  - SIEM (Splunk) parameters and options.
  - Add an SPL query to filter or transform data during shipping for tasks like row filtering, column trimming, or data normalization. Example:

        # SPL query
        query = "* | where instance_id in ('instance-1', 'instance-2')"
        # Create a consumer with the filter rule. The 'query' parameter is added to the configuration.
        option = LogHubConfig(endpoint, accessKeyId, accessKey, project, logstore,
                              consumer_group, consumer_name,
                              cursor_position=CursorPosition.SPECIAL_TIMER_CURSOR,
                              cursor_start_time=cursor_start_time,
                              heartbeat_interval=heartbeat_interval,
                              data_fetch_interval=data_fetch_interval,
                              query=query)

- SyncData(ConsumerProcessorBase) class: Fetches data from SLS and ships it to Splunk. Review the code comments and adjust as needed.
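To illustrate the shipping step, the following is a minimal sketch of how one log record might be wrapped for Splunk HEC. It is not the sample script: build_hec_request is a hypothetical helper, the sourcetype value is a placeholder, and it assumes the standard HEC event endpoint (/services/collector/event) with token authentication in the Authorization header.

```python
import json
import urllib.request


def build_hec_request(base_url, token, log, sourcetype="sls:log", timestamp=None):
    """Build (but do not send) an HTTP POST carrying one event to Splunk HEC.

    base_url:  scheme, host, and port of the HEC listener (placeholder).
    token:     HEC token issued by your Splunk administrator.
    log:       one log record as a dict of field names to values.
    """
    payload = {"event": log, "sourcetype": sourcetype}
    if timestamp is not None:
        payload["time"] = timestamp  # epoch seconds
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Example usage (placeholder host and token):
#   req = build_hec_request("https://splunk.example:8088", "my-token", {"k": "v"})
#   with urllib.request.urlopen(req, timeout=10) as resp:
#       print(resp.status)
```

Building the request separately from sending it keeps the formatting logic easy to test without a running Splunk instance.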
Complete script:
Syslog
Syslog supports RFC 5424 and RFC 3164 formats. Use RFC 5424 with TCP or TLS transport for reliable, secure delivery.
Configure the sync_data.py script. It has three main parts:
- main() method: The main program control logic.
- get_monitor_option() method: Defines consumption configuration options.
  - Basic configuration: Connection settings for SLS and the consumer group.
  - Advanced consumer group options: Performance-tuning parameters. Do not modify unless necessary.
  - SIEM Syslog server parameters and options.
    - Syslog facility: The program component that generated the log. This example uses syslogclient.FAC_USER as the default.
    - Syslog severity: The log level. Customize based on log content. This example uses syslogclient.SEV_INFO.
    - If your SIEM supports Syslog over TCP or TLS, set the proto parameter to TLS and provide the path to a valid SSL certificate.
- SyncData(ConsumerProcessorBase) class: Fetches data from SLS and delivers it to a Syslog server. Review the code comments and adjust as needed.
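To show how facility and severity end up in a message, the following is a minimal sketch of RFC 5424 formatting. It does not use the sample script's syslogclient module: the constants FAC_USER and SEV_INFO here are local stand-ins carrying the standard numeric codes (1 and 6), and format_rfc5424 and frame_for_tcp are hypothetical helpers. The octet-counting framing shown is one common choice for TCP; some receivers expect newline-delimited messages instead.

```python
from datetime import datetime, timezone

FAC_USER = 1  # standard syslog facility code for user-level messages
SEV_INFO = 6  # standard syslog severity code for informational messages


def format_rfc5424(msg, hostname, app_name,
                   facility=FAC_USER, severity=SEV_INFO, when=None):
    """Format one RFC 5424 message with empty PROCID, MSGID, and structured data."""
    when = when or datetime.now(timezone.utc)
    pri = facility * 8 + severity  # PRI value defined by the RFC
    ts = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    return f"<{pri}>1 {ts} {hostname} {app_name} - - - {msg}"


def frame_for_tcp(message: str) -> bytes:
    """Prefix the message with its byte length (octet-counting framing)."""
    body = message.encode("utf-8")
    return str(len(body)).encode("ascii") + b" " + body
```

With the defaults, the priority value is 1 * 8 + 6 = 14, so each message starts with <14>1.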
Complete script:
Step 2: Configure environment variables
Set the following environment variables:
| Parameter | Value | Example |
|---|---|---|
| SLS_ENDPOINT | The SLS endpoint: https:// followed by the public endpoint of your project. | |
| SLS_PROJECT | The name of your project in the SLS console. | my-sls-project-one |
| SLS_LOGSTORE | The name of your Logstore in the SLS console. | my-sls-logstore-a1 |
| SLS_AK_ID | AccessKey ID of your RAM user. | L***ky |
| SLS_AK_KEY | AccessKey Secret of your RAM user. | x***Xl |
| SLS_CG | Consumer group name. Auto-created if it does not exist. | sync_data |
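Because a missing variable typically surfaces only as a connection or authentication failure later, it can help to validate the environment at startup. The following is a minimal sketch; load_settings is a hypothetical helper and is not part of the sample scripts, which may read these variables differently.

```python
import os

# Variable names from the table above.
REQUIRED_VARS = (
    "SLS_ENDPOINT",
    "SLS_PROJECT",
    "SLS_LOGSTORE",
    "SLS_AK_ID",
    "SLS_AK_KEY",
    "SLS_CG",
)


def load_settings(env=None) -> dict:
    """Return the required settings, or raise ValueError naming what is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Passing a plain dict instead of os.environ makes the check easy to exercise in tests without touching the real environment.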
Step 3: Start and verify
- Start multiple consumer processes for concurrent processing. Maximum concurrency equals the shard count of your Logstore.

      # Start the first consumer process
      nohup python3 sync_data.py &
      # Start the second consumer process
      nohup python3 sync_data.py &

- View the status of the consumer group in the SLS console.
  - In the project list, click your target project. Go to the tab. Click the icon next to your target Logstore, and then click the icon next to Data Consumption.
  - Click your consumer group. On the Consumer Group Status tab, view the consumer client and progress for each shard.
FAQ
ConsumerGroupQuotaExceed error
Each Logstore supports up to 30 consumer groups. Delete unused groups in the SLS console to free quota.