Simple Log Service provides four data processing solutions that cover the full data lifecycle: processing plugins (at collection), ingest processors (at write), data transformation (post-storage), and consumer processors (post-storage consumption). Use this comparison to select the right solution based on your performance, cost, and capability requirements.
Background information
Each solution operates at a different stage of the data pipeline:
Processing plugin: The Simple Log Service data collector supports processing plugins and SPL statements that process data on the client side before ingestion.
Ingest processor: An ingest processor is associated with a logstore. By default, all data written to the logstore is processed server-side by the ingest processor at write time.
Data transformation: Data is first written to a source logstore, then processed based on transformation rules. The processed data is written to a destination logstore.
Consumer processor: A consumer processor uses SPL to process logstore data in real time during consumption. Consumer processors work with consumers such as SDKs, Flink, and DataWorks.
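All four solutions express their logic in SPL. As a minimal sketch, the following statement (the field names `level` and `service` are hypothetical) filters out everything except error logs and adds a static field. Because each input row yields zero or one output row, a statement like this would be valid at every stage, including the single-row-only stages (processing plugins and ingest processors):

```
* | where level = 'ERROR' | extend service = 'frontend'
```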
Capability comparison
Processing plugins, ingest processors, data transformation, and consumer processors span the full data lifecycle: before storage (at collection), during storage (at write time), and after storage. All four solutions support data processing and the SPL language, but they differ in scope, resource usage, and supported scenarios.
| Dimension | Processing plugin | Ingest processor | Data transformation | Consumer processor |
| --- | --- | --- | --- | --- |
| Processing stage | Before storage (at collection) | During storage (at write) | After storage | After storage |
| Write to multiple logstores | Not supported by a single collection configuration. Use multiple collection configurations with processing plugins to achieve this. | Not supported | Supported | Not supported |
| SPL support | Supported | Supported | Supported | Supported |
| Supported SPL instructions | Single-row instructions only. Input: one row; output: zero or one row. | Single-row instructions only. Input: one row; output: zero or one row. | Full SPL instruction set | Full SPL instruction set |
| Prevent sensitive data from being written to disk | Supported | Supported | Not supported. Data passes through the source logstore. | Not supported. Data passes through the source logstore. |
| Resource usage | Consumes client-side resources | Server-side auto-scaling, transparent to users | Server-side auto-scaling, transparent to users | Server-side auto-scaling, transparent to users |
| Performance impact | Collection performance varies slightly with the number and complexity of plugins. Write performance is unaffected. | Write latency increases by several milliseconds to tens of milliseconds, depending on the data packet size and SPL statement complexity. | Source logstore write performance is unaffected. | Source logstore write performance is unaffected. |
| Scenario coverage | Broad | Moderate | Broad | Broad |
| Cost | No Simple Log Service data processing fees. Client resources are consumed. | Data processing fees apply. In data filtering scenarios, these fees are typically lower than the savings from reduced traffic and storage. | Source logstore fees plus data processing fees. To reduce source logstore costs, set the data retention period to one day and disable indexing. | Source logstore fees plus data processing fees. To reduce source logstore costs, set the data retention period to one day and disable indexing. |
| Fault tolerance | Configure whether to retain the original fields when processing fails. | Configure whether to retain the original data when processing fails. | Source data is already stored, so you can reprocess it if a transformation rule fails. Create multiple transformation jobs to process the data separately. | Source data is already stored. Flink, DataWorks, and SDK consumer groups with SPL consumption rules automatically retry on errors. |
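The "prevent sensitive data" row is the key structural difference: only the two pre-storage stages can remove a field before it reaches disk. As a sketch, a single-row SPL statement such as the following (assuming a hypothetical `password` field and the `project-away` instruction) would drop the field in a processing plugin or ingest processor; in data transformation or consumer processing, the raw field would already be persisted in the source logstore:

```
* | project-away password
```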
The following table compares the four solutions across typical scenarios. Use the recommendation levels to identify the best option for each use case.
| Scenario | Processing plugin | Ingest processor | Data transformation | Consumer processor |
| --- | --- | --- | --- | --- |
| Simple data processing that involves single-row operations without complex computational logic | Recommended | Recommended | Recommended | Recommended |
| Complex data processing that involves multi-condition logic, window aggregation, or dimension table enrichment | Adequate | Adequate | Recommended | Recommended |
| Limited client resources, such as when Logtail has restricted compute capacity | Adequate | Recommended | Recommended | Recommended |
| Limited client-side control, such as no permission to modify Logtail configurations or SDK write logic | Not recommended | Recommended | Recommended | Recommended |
| Limited server-side control, such as no permission to modify logstore or transformation configurations | Recommended | Not recommended | Not recommended | Not recommended |
| Latency-sensitive writes that require raw data to be collected as quickly as possible | Adequate | Adequate | Recommended | Recommended |
| Data masking when sensitive data is allowed to be written to disk | Recommended | Recommended | Recommended | Recommended |
| Data masking when sensitive data must not be written to disk | Recommended | Recommended | Not recommended | Not recommended |
| Data enrichment from internal sources, such as adding a field with a static value or a value extracted from an existing field | Adequate | Recommended | Recommended | Recommended |
| Data enrichment from external sources, such as querying a MySQL table for additional data based on a log field | Not recommended | Not recommended | Recommended | Recommended |
| Data distribution that routes data to different logstores based on conditions | Adequate | Not recommended | Recommended | Not recommended |
| Data filtering to discard raw data and reduce costs | Adequate | Recommended | Adequate | Adequate |
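For the data masking scenarios above, the masking itself is a single-row operation, so any of the four solutions can run it; the choice depends only on whether the unmasked value may touch disk. A sketch of such a statement, assuming a hypothetical `phone` field and Presto-style `regexp_replace` capture-group semantics:

```
* | extend phone = regexp_replace(phone, '(\d{3})\d{4}(\d{4})', '$1****$2')
```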