Data transformation is a fully managed, highly available, and scalable feature of Log Service. You can use it to standardize, enrich, transfer, mask, and filter data.
Transformation process
- A consumer group reads data from a source Logstore.
- Log Service transforms each data entry based on a transformation rule.
- Log Service writes transformed data to a destination Logstore.
After the data is transformed, you can view the data in the destination Logstore.
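The three steps above can be sketched in plain Python. This is only an illustration of the read, transform, and write flow, not the Log Service implementation: the in-memory lists stand in for Logstores, and `run_transformation` and `tag_level` are hypothetical names introduced here.

```python
from typing import Callable, Dict, Iterable, List, Optional

LogEntry = Dict[str, str]
# A rule returns the transformed entry, or None to discard it.
Rule = Callable[[LogEntry], Optional[LogEntry]]


def run_transformation(source: Iterable[LogEntry], rule: Rule) -> List[LogEntry]:
    """Read each entry from the source, apply the rule, collect the results."""
    destination: List[LogEntry] = []
    for entry in source:
        result = rule(dict(entry))  # copy, so the source data is left unchanged
        if result is not None:
            destination.append(result)
    return destination


def tag_level(entry: LogEntry) -> LogEntry:
    """Hypothetical rule: normalize the 'level' field to upper case."""
    entry["level"] = entry.get("level", "info").upper()
    return entry


# In-memory stand-ins for a source and a destination Logstore (assumption).
source_logstore = [
    {"msg": "started", "level": "info"},
    {"msg": "disk full", "level": "error"},
]
destination_logstore = run_transformation(source_logstore, tag_level)
```

The rule is applied to a copy of each entry, which mirrors the fact that the source Logstore keeps its original data while the transformed data lands in the destination.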
Features
- Data standardization: You can extract fields from logs in different formats and convert the logs into structured data for stream processing and for computing in data warehouses.
- Data enrichment: You can join the fields of logs and the fields of dimension tables to add dimensions for data analysis. For example, you can join order logs and user information tables to analyze data.
- Data transfer: You can transfer logs from regions outside the Chinese mainland to a single region by using the global acceleration feature. This way, logs from all regions can be managed centrally.
- Data masking: You can mask sensitive information in data, such as passwords, mobile phone numbers, and addresses.
- Data filtering: You can filter the logs of specific services for further analysis.
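To make the features above concrete, here is a minimal Python sketch of standardization, enrichment, masking, and filtering applied to two raw log lines. It does not use the Log Service DSL. The log format, the `USERS` dimension table, and all function names are assumptions made for illustration.

```python
import re

# Hypothetical dimension table keyed by user ID (assumption).
USERS = {"u1": {"user_name": "Alice", "user_city": "Hangzhou"}}

# Assumed raw log format: "<time> user=<id> phone=<digits> service=<name>"
RAW_PATTERN = re.compile(
    r"(?P<time>\S+) user=(?P<user_id>\S+) phone=(?P<phone>\d+) service=(?P<service>\S+)"
)


def standardize(raw):
    """Standardization: extract named fields from a raw text line."""
    match = RAW_PATTERN.match(raw)
    return match.groupdict() if match else None


def enrich(entry):
    """Enrichment: join the entry with the dimension table on user_id."""
    return {**entry, **USERS.get(entry["user_id"], {})}


def mask(entry):
    """Masking: replace every digit except the last four with '*'."""
    return {**entry, "phone": re.sub(r"\d(?=\d{4})", "*", entry["phone"])}


def keep_service(entry, service):
    """Filtering: keep only the logs of one service."""
    return entry["service"] == service


raw_logs = [
    "2024-01-01T10:00:00 user=u1 phone=13812345678 service=order",
    "2024-01-01T10:00:05 user=u2 phone=13987654321 service=payment",
]

structured = [e for e in map(standardize, raw_logs) if e is not None]
order_logs = [mask(enrich(e)) for e in structured if keep_service(e, "order")]
```

After this runs, `order_logs` holds a single structured entry for user `u1`, with the user fields joined in and the phone number reduced to its last four digits.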
Scenarios
- Data standardization: Log data is read from a source Logstore, transformed, and then written to a destination Logstore.
- Data transfer: Log data is read from a source Logstore, transformed, and then written to multiple destination Logstores.
- Multi-source data aggregation: Log data is read from multiple source Logstores, transformed, and then written to a destination Logstore.
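The one-to-many and many-to-one scenarios above differ only in how entries are routed. The following Python sketch shows both shapes with in-memory dictionaries standing in for Logstores. The function names, the Logstore names, and the added `source` field are hypothetical.

```python
from collections import defaultdict


def distribute(entries, route):
    """One source, multiple destinations: route() names the target Logstore."""
    destinations = defaultdict(list)
    for entry in entries:
        destinations[route(entry)].append(entry)
    return dict(destinations)


def aggregate(sources):
    """Multiple sources, one destination: record where each entry came from."""
    merged = []
    for name, entries in sources.items():
        for entry in entries:
            merged.append({**entry, "source": name})
    return merged


# Fan-out: split one source by log level (hypothetical routing rule).
source = [{"level": "error", "msg": "timeout"}, {"level": "info", "msg": "ok"}]
by_level = distribute(source, lambda e: "logstore-" + e["level"])

# Fan-in: merge two sources into one destination.
combined = aggregate({
    "web": [{"msg": "GET /"}],
    "app": [{"msg": "job done"}],
})
```

Tagging each aggregated entry with its origin is a design choice of this sketch. It keeps the merged data traceable once everything sits in a single destination.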
Transformation syntax
Log Service provides more than 200 built-in functions and more than 400 regular expressions. You can also use the domain-specific language (DSL) for Log Service to create user-defined functions (UDFs) based on your business requirements. For more information, see Syntax overview.
Benefits
- Allows you to use the DSL for Log Service to orchestrate functions as needed. You can use the orchestrated functions to filter, standardize, enrich, transfer, and mask data.
- Processes data in real time, so transformed data is available to view within seconds. The feature scales its computing capacity out or in based on data volume and provides high throughput.
- Applies to log analysis scenarios and provides out-of-the-box functions.
- Integrates with dashboards, exception logs, and alerts in real time.
- Offers a fully managed and maintenance-free service that can be integrated with the big data services of Alibaba Cloud and open source ecosystems.
Billing
- You are charged for the server and network resources that are consumed when you use the data transformation feature. For more information, see Billable items.
- To reduce costs, you can disable the indexing feature for a source Logstore and set a short data retention period. For more information, see Performance guide and Cost optimization guide.