Ship logs from Simple Log Service (SLS) to an on-premises Security Information and Event Management (SIEM) platform, such as Splunk or QRadar. This lets you consolidate cloud logs with your existing on-premises security analytics platform for unified monitoring, auditing, and threat analysis.
How it works
A consumer application, deployed within your on-premises network, is responsible for fetching log data. It uses an SLS consumer group to pull logs in real time and then forwards them to your SIEM using protocols such as the Splunk HTTP Event Collector (HEC) or Syslog over TCP/TLS.
The process is built on a pull-based architecture, which provides the following key benefits by design:
Security: The pull model ensures all connections are initiated from your secure network. You do not need to open any inbound firewall ports, which preserves your on-premises security posture.
High throughput and scalability: High throughput is achieved by scaling horizontally. You can run multiple consumer instances concurrently, and the consumer group automatically balances the workload across all active instances.
Reliability: The consumer group provides at-least-once delivery guarantees. If a consumer instance fails, the shards it was processing are automatically reassigned to other healthy instances in the group. Consumption resumes from the last saved checkpoint, preventing data loss.
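The checkpoint behavior described above can be illustrated with a small toy model (plain Python, not the SLS SDK): the checkpoint advances only after a record is forwarded, so when an instance crashes mid-record, a healthy instance resumes from the last saved checkpoint and the in-flight record is delivered again rather than lost.

```python
def consume(records, checkpoint, forward):
    """Process records starting at `checkpoint`; return the new checkpoint.
    The checkpoint advances only after a record is forwarded, so a crash
    re-delivers the in-flight record (at-least-once, never data loss)."""
    cp = checkpoint
    for i in range(cp, len(records)):
        forward(records[i])  # may raise -> simulated consumer crash
        cp = i + 1           # checkpoint saved only after success
    return cp

delivered = []
records = ["r0", "r1", "r2", "r3", "r4"]

def flaky(rec):
    delivered.append(rec)
    # Crash the first time r2 is forwarded.
    if rec == "r2" and delivered.count("r2") == 1:
        raise RuntimeError("consumer crashed")

try:
    cp = consume(records, 0, flaky)
except RuntimeError:
    cp = 2  # last saved checkpoint: only r0 and r1 completed

# A healthy instance takes over the shard from the checkpoint.
cp = consume(records, cp, flaky)
# cp is now 5; r2 was delivered twice, but nothing was lost.
```

This is exactly the trade-off stated above: at-least-once delivery means duplicates are possible after a failover, so the SIEM side should tolerate (or deduplicate) repeated events.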
Prerequisites
Permissions: Create a Resource Access Management (RAM) user and attach the AliyunLogFullAccess policy to the user. For more information, see the related document.
Network requirements: The machine that runs the program must be able to access the SLS endpoint and must be on the same network as the SIEM.
To obtain the endpoint:
Log on to the SLS console. In the project list, click the target project.
Click the icon to the right of the project name to go to the project overview page.
In the Endpoint section, view the public endpoint. The full endpoint is https:// followed by the public endpoint.
Environment requirements: A Python 3 runtime environment and the SLS Python SDK.
Install the SLS Python SDK:
pip install -U aliyun-log-python-sdk
Verify the installation:
pip show aliyun-log-python-sdk
If output similar to the following is returned, the installation is successful:
Name: aliyun-log-python-sdk
Version: 0.9.12
Summary: Aliyun log service Python client SDK
Home-page: https://github.com/aliyun/aliyun-log-python-sdk
Author: Aliyun
Procedure
Step 1: Prepare the application
SLS provides two shipping methods:
Splunk HEC: HEC is a token-based mechanism that lets you send logs in various data formats directly to Splunk over HTTP securely and efficiently.
Syslog: A common log channel that is compatible with most SIEMs and supports text-format logs.
Splunk HEC
To ship log data to Splunk, configure sync_data.py. The code has three main parts:
main() method: Main program entrypoint.
get_option() method: Consumption configuration options.
Basic configuration options: Includes connection settings for SLS and consumer group settings.
Advanced options for the consumer group: Performance-tuning parameters. Do not modify them unless necessary.
SIEM (Splunk)-related parameters and options.
To perform data cleansing during the shipping process (such as row filtering, column trimming, or data normalization), add rules using SPL queries. For example:
# SPL statement
query = "* | where instance_id in ('instance-1', 'instance-2')"

# Create a consumer based on rules. Compared with normal consumption,
# the query parameter is appended to the end of the parameter list.
option = LogHubConfig(endpoint, accessKeyId, accessKey, project, logstore,
                      consumer_group, consumer_name,
                      cursor_position=CursorPosition.SPECIAL_TIMER_CURSOR,
                      cursor_start_time=cursor_start_time,
                      heartbeat_interval=heartbeat_interval,
                      data_fetch_interval=data_fetch_interval,
                      query=query)
SyncData(ConsumerProcessorBase): Contains the logic for fetching data from SLS and shipping it to Splunk. Carefully review the comments in the code and make adjustments as needed.
The complete code is as follows:
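Independent of the complete script, the Splunk-specific core of the SyncData logic, mapping a fetched SLS log to a HEC event envelope and posting a batch over HTTP, can be sketched as follows. The helper names (`to_hec_event`, `post_events`) are illustrative, not SDK APIs, and the HEC URL and token are placeholders you must replace with your own.

```python
import json
import urllib.request

def to_hec_event(log, source="sls", sourcetype="_json", index="main"):
    """Wrap one SLS log (a dict of field -> value) in the Splunk HEC
    event envelope. `__time__` is the log's timestamp field in SLS;
    fields starting with `__` are SLS metadata and are dropped from
    the event body."""
    return {
        "time": int(log.get("__time__", 0)) or None,
        "source": source,
        "sourcetype": sourcetype,
        "index": index,
        "event": {k: v for k, v in log.items() if not k.startswith("__")},
    }

def post_events(events, hec_url, token):
    """POST a batch of events to Splunk HEC. HEC accepts multiple JSON
    objects concatenated in a single request body."""
    body = "".join(json.dumps(e) for e in events).encode("utf-8")
    req = urllib.request.Request(
        hec_url,  # e.g. https://splunk.example.com:8088/services/collector
        data=body,
        headers={"Authorization": "Splunk " + token,
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

In the real script, a batch of events built this way would be sent inside the processor's process() callback, and the checkpoint saved only after the POST succeeds, which preserves the at-least-once guarantee.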
Syslog
Syslog log formats are defined primarily by RFC 5424 and RFC 3164; we recommend the RFC 5424 protocol. Although Syslog can be transported over both TCP and UDP, TCP provides more reliable delivery than UDP, and a secure transport over TLS is also defined (RFC 5425). If your SIEM supports a TCP or TLS channel for Syslog, we recommend using that channel.
To ship log data to a SIEM by using Syslog, configure sync_data.py. The code has three main parts:
main() method: Main program entrypoint.
get_monitor_option() method: Consumption configuration options.
Basic configuration options: Includes connection settings for SLS and consumer group settings.
Advanced options for the consumer group: Performance-tuning parameters. Do not modify them unless necessary.
Parameters and options related to the SIEM's Syslog server.
Syslog facility: The program component. The example uses syslogclient.FAC_USER as the default.
Syslog severity: The log level. Set the severity for specific content as needed. The example uses syslogclient.SEV_INFO.
If the SIEM supports Syslog channels based on TCP or TLS, set proto to TLS and configure the correct SSL certificate.
SyncData(ConsumerProcessorBase): Contains the logic for how to retrieve data from SLS and deliver it to the SIEM Syslog server. Read the comments in the code carefully and make adjustments as needed.
The complete code is as follows:
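As a minimal sketch of the Syslog-specific step, the following shows RFC 5424 message formatting (PRI = facility × 8 + severity, so user-level INFO is 1 × 8 + 6 = 14) and TCP delivery. The helper names (`format_rfc5424`, `send_tcp`) are illustrative, not part of the SDK's syslogclient module, though the facility and severity values mirror FAC_USER and SEV_INFO from the text.

```python
import socket
from datetime import datetime, timezone

FAC_USER = 1   # RFC 5424 facility: user-level messages
SEV_INFO = 6   # RFC 5424 severity: informational

def format_rfc5424(msg, hostname="sls-consumer", app="sync_data",
                   facility=FAC_USER, severity=SEV_INFO):
    """Build an RFC 5424 syslog line: <PRI>1 TIMESTAMP HOST APP - - - MSG.
    PRI is facility * 8 + severity."""
    pri = facility * 8 + severity
    ts = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    return "<{}>1 {} {} {} - - - {}".format(pri, ts, hostname, app, msg)

def send_tcp(line, host, port=514):
    """Ship one syslog line over TCP. RFC 6587 octet-counting framing
    ('LEN line') keeps message boundaries intact on a stream transport."""
    data = line.encode("utf-8")
    frame = str(len(data)).encode() + b" " + data
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(frame)
```

For a TLS channel, the TCP socket would additionally be wrapped with ssl.SSLContext.wrap_socket() and the SIEM's certificate configured, as noted in the parameter description above.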
Step 2: Configure environment variables
After configuring the application, set the system environment variables as described in the following table.
| Environment variable | Value | Example |
| --- | --- | --- |
| SLS_ENDPOINT | The SLS endpoint obtained in Prerequisites, prefixed with https://. | |
| SLS_PROJECT | In the SLS console, copy the target project name. | my-sls-project-one |
| SLS_LOGSTORE | In the SLS console, copy the target logstore name. | my-sls-logstore-a1 |
| SLS_AK_ID | The AccessKey ID of a RAM user. | L***ky |
| SLS_AK_KEY | The AccessKey secret of a RAM user. | x***Xl |
| SLS_CG | The consumer group name. If the specified group does not exist, the application creates it automatically. | syc_data |
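For example, on Linux the variables can be exported in the shell before starting the program. All values below are placeholders; the endpoint shown is one possible region endpoint, so substitute the one you obtained in Prerequisites.

```shell
# Placeholders only -- substitute your own project, logstore, and AccessKey pair.
export SLS_ENDPOINT="https://cn-hangzhou.log.aliyuncs.com"  # https:// + public endpoint
export SLS_PROJECT="my-sls-project-one"
export SLS_LOGSTORE="my-sls-logstore-a1"
export SLS_AK_ID="<RAM user AccessKey ID>"
export SLS_AK_KEY="<RAM user AccessKey secret>"
export SLS_CG="syc_data"
```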
Step 3: Start and verify
Start multiple consumers for concurrent consumption. The maximum number of consumers equals the total number of shards.
# Start the first consumer process
nohup python3 sync_data.py &
# Start the second consumer process
nohup python3 sync_data.py &
Check the status of the consumer group in the SLS console.
In the project list, click the target project. In the logstore list, click the icon next to the target logstore, and then click the icon next to Data Consumption.
In the consumer group list, click the target consumer group. On the Consumer Group Status page, view the consumption client and the latest consumption time for each shard.
FAQ
ConsumerGroupQuotaExceed error occurs
This error indicates that the consumer group quota is exceeded. A single logstore can have a maximum of 30 consumer groups. Delete unused consumer groups in the SLS console.