Simple Log Service provides four data processing methods: processing plugins, ingest processors, data transformation, and consumer processors. This topic compares the features and applicable scenarios of these methods to help you select the most suitable one.
Background information
Processing plugin configuration: Simple Log Service data collectors provide various processing configurations. These configurations support processing plugins and client-side data processing with SLS Processing Language (SPL) statements.
Ingest processor: An ingest processor can be associated with a Logstore. By default, data written to the Logstore is processed by the ingest processor on the server side.
Data transformation: Data is first written to a source Logstore and then processed based on transformation rules. The processed data is written to a destination Logstore.
Consumer processor: A consumer processor uses SPL to process data in real time as the data is consumed from a Logstore. Consumer processors can be used with consumption channels such as SDKs, Flink, and DataWorks.
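All four methods run the same SPL syntax: a pipeline of instructions chained with `|`. The following is a minimal sketch of a single-line SPL statement that keeps only error logs and masks the middle digits of a phone number. The field names `level` and `phone` are hypothetical and depend on your log schema, and `regexp_replace` is the SQL function from Simple Log Service's query syntax, which SPL expressions can call.

```
*
| where level = 'ERROR'
| extend phone_masked = regexp_replace(phone, '(\d{3})\d{4}(\d{4})', '$1****$2')
| project-away phone
```

Because every instruction here maps one input line to at most one output line, a statement like this can run in a processing plugin, an ingest processor, a data transformation job, or a consumer processor.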
Comparison
Processing plugins, ingest processors, data transformation, and consumer processors cover the entire data lifecycle: before storage (at collection), during storage (at write), and after storage (after write). All four methods process data and support SPL, but they differ in their applicable scenarios and capabilities.
| Comparison dimension | Processing plugin | Ingest processor | Data transformation | Consumer processor |
| --- | --- | --- | --- | --- |
| Data processing stage | Before storage (during data collection). | During storage (at write). | After storage. | After storage. |
| Write to multiple Logstores | Not supported in a single collection configuration. You can use multiple collection configurations with processing plugins. | Not supported. | Supported. | Not supported. |
| SPL support | Supported. | Supported. | Supported. | Supported. |
| Supported SPL instructions | Single-line processing instructions, which take one line of data as input and produce zero or one line of output. | Single-line processing instructions, which take one line of data as input and produce zero or one line of output. | The complete set of SPL instructions. | The complete set of SPL instructions. |
| Prevents sensitive data from being written to disk | Supported. | Supported. | Not supported. Data passes through the source Logstore. | Not supported. Data passes through the source Logstore. |
| Resource usage | Consumes some client resources. | Server-side resources are scaled automatically and transparently to users. | Server-side resources are scaled automatically and transparently to users. | Server-side resources are scaled automatically and transparently to users. |
| Performance impact | Collection performance is slightly affected by the number of plugins and the complexity of the configuration. Write performance is not affected. | Write performance is slightly affected by the complexity of the data and the SPL statements. The latency of a single request can increase by several to tens of milliseconds, depending on the size of the request packet and the complexity of the SPL statements. | The write performance of the source Logstore is not affected. | The write performance of the source Logstore is not affected. |
| Scenario coverage | Broad. | Moderate. | Broad. | Broad. |
| Cost | No Simple Log Service data processing fees are charged, but some client resources are consumed. | Data processing fees are charged. In data filtering scenarios, this fee is usually lower than the savings from reduced data traffic and storage costs. | Source Logstore fees and data processing fees are charged. You can reduce source Logstore costs by setting its data retention period to one day and disabling indexing. | Source Logstore fees and data processing fees are charged. You can reduce source Logstore costs by setting its data retention period to one day and disabling indexing. |
| Fault tolerance | You can configure the plugin to retain the original fields if processing fails. | You can configure the processor to retain the raw data if processing fails. | Because the source data is already stored, it can be reprocessed if a transformation rule fails. You can also create multiple data transformation jobs that process the data independently. | Because the source data is already stored, consumer groups that apply SPL rules through Flink, DataWorks, or SDKs automatically retry when errors occur. |
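To make the "Supported SPL instructions" row concrete, the sketch below uses only single-line instructions (`parse-regexp`, `where`, `project`), so each input line yields at most one output line and the statement is valid in all four methods, including processing plugins and ingest processors. The `content` field, the access-log layout, and the use of a SQL-style `cast` in the filter expression are assumptions for illustration.

```
*
| parse-regexp content, '^(\S+) "(\S+)" (\d{3})' as client_ip, request_uri, status
| where cast(status as bigint) >= 500
| project client_ip, request_uri, status
```

Pipelines that need windowed aggregation or that emit multiple output lines per input line fall under the complete instruction set and are therefore available only in data transformation and consumer processors.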
The following table compares the capabilities of processing plugins (Logtail processing configurations), ingest processors, data transformation, and consumer processors in typical scenarios. Acceptable indicates that the method works in the scenario but is not the preferred choice.
| Scenario | Processing plugin (Logtail) | Ingest processor | Data transformation | Consumer processor |
| --- | --- | --- | --- | --- |
| Simple data processing, such as single-line processing without complex computational logic. | Recommended | Recommended | Recommended | Recommended |
| Complex data processing, such as tasks that involve complex computational logic, multiple conditions, window aggregation, or dimension table enrichment. | Acceptable | Acceptable | Recommended | Recommended |
| Limited client resources, such as when the compute resources available to Logtail are limited. | Acceptable | Recommended | Recommended | Recommended |
| Limited client-side control, such as no permission to modify Logtail configurations or SDK write logic on the client. | Not recommended | Recommended | Recommended | Recommended |
| Limited server-side control, such as no permission to modify Logstore or data transformation configurations. | Recommended | Not recommended | Not recommended | Not recommended |
| Sensitivity to write latency and performance, such as when raw data must be collected as soon as possible. | Acceptable | Acceptable | Recommended | Recommended |
| Data masking where sensitive data may be written to disk. | Recommended | Recommended | Recommended | Recommended |
| Data masking where sensitive data must not be written to disk. | Recommended | Recommended | Not recommended | Not recommended |
| Data enrichment that does not depend on external data sources, such as adding a field whose value is static or extracted from an existing field. | Acceptable | Recommended | Recommended | Recommended |
| Data enrichment that depends on external data sources, such as querying enrichment data from a MySQL table based on a log field. | Not recommended | Not recommended | Recommended | Recommended |
| Data distribution, which writes data to different Logstores based on conditions. | Acceptable | Not recommended | Recommended | Not recommended |
| Data filtering to save costs when the raw data is not required. | Acceptable | Recommended | Acceptable | Acceptable |
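For the data filtering row, a hedged sketch of an ingest processor statement: because filtering happens during storage, only the retained lines incur storage and indexing fees. The `level` and `debug_context` field names are hypothetical.

```
*
| where level != 'DEBUG'
| project-away debug_context
```

The data processing fee still applies to the full incoming stream, which is why the Cost row notes that in filtering scenarios this fee is usually lower than the savings from reduced traffic and storage.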