Simple Log Service provides the following data processing methods: ingest processor, Logtail plug-in configuration, and data transformation. This topic compares and analyzes the features and application scenarios of the preceding data processing methods to help you select an appropriate method based on your business requirements.
Background information
Logtail plug-in configuration: Logtail provides various configurations for data processing. You can use Logtail plug-ins and Simple Log Service Processing Language (SPL) to process data on clients. For more information, see Overview of Logtail plug-ins for data processing and Use Logtail SPL to parse logs.
Ingest processor: An ingest processor can be associated with a Logstore. By default, data written to the Logstore is processed by the ingest processor on the server.
Data transformation: Data is written to a source Logstore and then processed based on data transformation rules. The processed data is written to the destination Logstore.
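As a common reference for the methods described above, the following is a minimal SPL statement of the kind that each method can run. This is an illustrative sketch: the field names `content` and `status` are hypothetical and not defined in this topic.

```spl
* | parse-json content
  | where status = '500'
  | project-away content
```

The statement expands the JSON in the `content` field into top-level fields, keeps only entries whose `status` field is `500`, and then drops the raw `content` field.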
Comparison
Ingest processors, Logtail plug-in configurations, and data transformation are three key methods for processing data throughout its lifecycle. The methods are applied at different stages: before storage (during data collection), during storage (when data is written), and after storage. All three methods support SPL, but they differ in features and are suitable for different scenarios.
| Comparison dimension | Logtail plug-in configuration | Ingest processor | Data transformation |
| --- | --- | --- | --- |
| Stage in data processing | Before storage (during data collection). | During storage. | After storage. |
| Write to multiple Logstores | Not supported within a single Logtail configuration. You can use multiple Logtail configurations together with Logtail plug-ins to achieve this. | Not supported. | Supported. |
| SPL support | Supported. | Supported. | Supported. |
| Supported SPL instructions | Only SPL instructions that process one line of input at a time and return zero or one line of output are supported. | Only SPL instructions that process one line of input at a time and return zero or one line of output are supported. | All SPL instructions are supported. |
| No sensitive data written to disks | Supported. | Supported. | Not supported. Raw data is written to the source Logstore. |
| Resource usage | Specific client resources are consumed. | Resources are automatically scaled. This process is transparent to users. | Resources are automatically scaled. This process is transparent to users. |
| Performance | Collection performance is slightly affected, depending on the number of plug-ins and the complexity of the configurations. Write performance is not affected. | Write performance is slightly affected, depending on the complexity of the data and the SPL statements. The latency of a single request can increase by several milliseconds to tens of milliseconds, depending on the size of the request packet and the complexity of the SPL statements. | Write performance of the source Logstore is not affected. |
| Scenario coverage | Moderate. | Standard. | High. |
| Cost | You are not charged data processing fees, but client resources are consumed. | You are charged data processing fees. In most data filtering scenarios, the data processing cost is lower than the savings from the reduced amount of data that is transferred and stored. | You are charged for the source Logstore and for data processing. To reduce the cost of the source Logstore, you can set its data retention period to one day and disable indexing. |
| Fault tolerance | You can specify whether to retain the original field when data fails to be processed. | You can specify whether to retain the original data when data fails to be processed. | The source data is retained. You can specify whether to reprocess data that fails to be transformed based on the specified rules. You can also create multiple data transformation jobs to process data separately. |
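To illustrate the "one line of input, at most one line of output" constraint in the table above: the following hedged SPL sketch handles each log entry independently and emits at most one entry, so it fits all three methods. The `latency` field name is a hypothetical example.

```spl
* | extend latency_ms = cast(latency as bigint)
  | where latency_ms > 500
```

By contrast, processing that requires window aggregation or dimension table enrichment across entries falls outside this constraint and is only covered by data transformation.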
The following table compares Logtail plug-in configuration, ingest processors, and data transformation in common scenarios.
| Scenario | Logtail plug-in configuration | Ingest processor | Data transformation |
| --- | --- | --- | --- |
| Simple data processing tasks, such as single-line data processing that does not involve complex computational logic. | Recommended | Recommended | Recommended |
| Complex data processing tasks that involve complex computational logic or require multiple conditions, window aggregation, or dimension table enrichment. | Moderately recommended | Moderately recommended | Recommended |
| Limited client resources. For example, the computing resources available to Logtail are limited. | Moderately recommended | Recommended | Recommended |
| Limited permissions on clients. For example, you do not have permissions to modify the Logtail configurations or the SDK write logic on the client from which data is collected. | Not recommended | Recommended | Recommended |
| Limited permissions on servers. For example, you do not have permissions to modify Logstores or data transformation configurations. | Recommended | Not recommended | Not recommended |
| Sensitivity to data write latency and performance. For example, you want raw data to be collected as soon as possible. | Moderately recommended | Moderately recommended | Recommended |
| Data masking when sensitive data can be written to disks. | Recommended | Recommended | Recommended |
| Data masking when no sensitive data can be written to disks. | Recommended | Recommended | Not recommended |
| Data enrichment that does not depend on external data sources. For example, a new field whose value is fixed or is extracted from an existing field. | Moderately recommended | Recommended | Recommended |
| Data enrichment that depends on external data sources. For example, you query additional enrichment data in a MySQL table based on log fields. | Not recommended | Not recommended | Recommended |
| Data distribution. Data is written to different Logstores based on conditions. | Moderately recommended | Not recommended | Recommended |
| Data filtering. You need to filter data and do not need to store the raw data, which reduces costs. | Moderately recommended | Recommended | Moderately recommended |
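As a sketch of the masking scenarios in the table above, the following hedged SPL statement masks the middle digits of a hypothetical `phone` field (assuming an 11-digit number format) with the `regexp_replace` function:

```spl
* | extend phone = regexp_replace(phone, '(\d{3})\d{4}(\d{4})', '$1****$2')
```

When this runs in a Logtail plug-in configuration or an ingest processor, the unmasked value is never written to disk; when it runs in data transformation, the unmasked value still lands in the source Logstore first, which is why data transformation is not recommended when no sensitive data can be written to disks.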