This topic describes the data transformation process by using Server Load Balancer (SLB) logs as an example.
Prerequisites
- SLB logs are uploaded to a Logstore. In this example, the Logstore is named slb-log.
- Destination Logstores are created. In this example, the following two destination Logstores are created:
- One of the Logstores is named 200_log. Logs are retained in this Logstore for 90 days. The Target Name parameter is set to 200-target.
- The other Logstore is named 400_log. Logs are retained in this Logstore for 180 days. The Target Name parameter is set to 400-target.
Configure the source and destination Logstores based on your business requirements. For more information, see Performance guide and Cost optimization guide.
- The indexing feature is enabled for the source and destination Logstores. For more information, see Enable and configure the index feature for a Logstore.
Background information
- Data transformation requirements
- Set a different topic for each log entry based on the value of the status field, and distribute the log entries to different Logstores.
- Split the key-value pairs in the http_host field into separate fields.
- Sample log entries
"""
Log entry 1
"""
__source__: log_service
__tag__:__receive_time__: 1559799897
__topic__:
body_bytes_sent: 740
client_ip: 1.2.3.4
host: m.abcd.com
http_host: m.abcd.com/s?q=asd&a=1&b=2
status: 200
"""
Log entry 2
"""
__source__: log_service
__tag__:__receive_time__: 1559799897
__topic__:
body_bytes_sent: 740
client_ip: 1.2.3.4
host: m.abcd.com
http_host: m.abcd.com/s?q=asd&a=1&b=2
status: 400
...
Create a data transformation task
Use functions
- Basic usage of functions
- Use the e_set function.
The e_set function updates log fields. For more information, see e_set.
- The following example shows how to add a new field named province to a log entry:
e_set("province","Shanghai")
When you preview the log entry, the province field is displayed.
__source__: log_service
__tag__:__receive_time__: 1559799897
__topic__:
body_bytes_sent: 740
client_ip: 1.2.3.4
host: m.abcd.com
http_host: m.abcd.com/s?q=asd&a=1&b=2
status: 200
province: Shanghai
- The following example shows how to use the v function to copy the value of the status field to a new field named result:
e_set("result",v("status"))
When you preview the log entry, the result field is displayed and its value is the same as that of the status field.
__source__: log_service
__tag__:__receive_time__: 1559799897
__topic__:
body_bytes_sent: 740
client_ip: 1.2.3.4
host: m.abcd.com
http_host: m.abcd.com/s?q=asd&a=1&b=2
status: 200
result: 200
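The behavior of the two e_set examples above can be modeled in plain Python. This is a minimal sketch, not the Log Service implementation: the e_set and v functions below are hypothetical stand-ins that operate on a dictionary, and the event contains only a subset of the sample fields.

```python
# Toy stand-ins for the DSL functions. They are NOT the Log Service
# implementation; they only mimic the behavior described above.

def v(field):
    """Return a getter that reads `field` from an event when the rule runs."""
    return lambda event: event.get(field)

def e_set(field, value):
    """Return a step that writes `value` (a constant or a getter) into `field`."""
    def step(event):
        out = dict(event)  # copy, so that the raw event is left untouched
        out[field] = value(event) if callable(value) else value
        return out
    return step

event = {"client_ip": "1.2.3.4", "host": "m.abcd.com", "status": "200"}

with_province = e_set("province", "Shanghai")(event)
with_result = e_set("result", v("status"))(event)

print(with_province)
print(with_result)
```

The getter returned by v is evaluated only when the step runs, which is why the result field receives the status value of the event that is being processed.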
- Use the e_rename function.
The e_rename function renames a field. For more information, see e_rename.
- The following example shows how to rename the client_ip field to the ip field:
e_rename("client_ip","ip")
When you preview the log entry, the ip field is displayed.
__source__: log_service
__tag__:__receive_time__: 1559799897
__topic__:
body_bytes_sent: 740
ip: 1.2.3.4
host: m.abcd.com
http_host: m.abcd.com/s?q=asd&a=1&b=2
status: 200
- The following example shows how to rename the client_ip field to the ip field and rename the body_bytes_sent field to the bytes_content field:
e_rename("client_ip","ip","body_bytes_sent","bytes_content")
When you preview the log entry, the ip field and the bytes_content field are displayed.
__source__: log_service
__tag__:__receive_time__: 1559799897
__topic__:
bytes_content: 740
ip: 1.2.3.4
host: m.abcd.com
http_host: m.abcd.com/s?q=asd&a=1&b=2
status: 200
- Use the e_keep_fields and e_drop_fields functions.
The e_keep_fields function retains the fields that meet a specified condition. For more information, see e_keep_fields.
The e_drop_fields function deletes the fields that meet a specified condition. For more information, see e_drop_fields.
- The following example shows how to use the e_keep_fields function to retain the status field:
Note The __source__ field or the __topic__ field cannot be deleted.
e_keep_fields("status")
When you preview the log entry, only the status field is retained in addition to the __source__ and __topic__ fields.
__source__:
__topic__:
status: 200
- The following example shows how to use the e_drop_fields function to delete the http_host field:
e_drop_fields("http_host")
When you preview the log entry, the http_host field is deleted.
__source__: log_service
__tag__:__receive_time__: 1559799897
__topic__:
body_bytes_sent: 740
client_ip: 1.2.3.4
host: m.abcd.com
status: 200
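The difference between retaining and deleting fields can be sketched in plain Python. The functions below are hypothetical stand-ins that match exact field names only, and the PROTECTED set is an assumption that models the note about the __source__ and __topic__ fields. The sketch does not reproduce how the real functions match field names.

```python
# Assumption for this sketch: these fields always survive, as the note
# above says that they cannot be deleted.
PROTECTED = {"__source__", "__topic__"}

def e_keep_fields(*names):
    """Toy stand-in: keep only the named fields plus the protected fields."""
    def step(event):
        return {k: val for k, val in event.items()
                if k in names or k in PROTECTED}
    return step

def e_drop_fields(*names):
    """Toy stand-in: remove the named fields, except the protected fields."""
    def step(event):
        return {k: val for k, val in event.items()
                if k not in names or k in PROTECTED}
    return step

event = {
    "__source__": "log_service",
    "__topic__": "",
    "client_ip": "1.2.3.4",
    "http_host": "m.abcd.com/s?q=asd&a=1&b=2",
    "status": "200",
}

kept = e_keep_fields("status")(event)
dropped = e_drop_fields("http_host")(event)

print(sorted(kept))
print(sorted(dropped))
```

e_keep_fields works as an allowlist and e_drop_fields as a denylist, so the former is usually the shorter rule when only a few fields are needed downstream.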
- Intermediate usage of functions
- Use the e_if and e_search functions.
The condition and operation parameters of the e_if function must be specified in pairs. The e_if function performs an operation on log fields based on the corresponding condition. If a log field meets the condition, the operation is performed. Otherwise, the operation is not performed and the function continues to check the next condition. If an operation is performed to delete a log entry, no subsequent operations are performed on the log entry.
The e_search function uses a query string to check whether a field value in a log entry meets a specified condition. True or False is returned.
If a field value meets the condition, the corresponding operation is performed. The following example checks whether the value of the status field is 200. If so, the value of the __topic__ field is set to login_success_event.
e_if(e_search("status==200"), e_set("__topic__", "login_success_event"))
When you preview the log entry, the value of the __topic__ field is set to login_success_event.
__source__: log_service
__tag__:__receive_time__: 1559799897
__topic__: login_success_event
body_bytes_sent: 740
client_ip: 1.2.3.4
host: m.abcd.com
http_host: m.abcd.com/s?q=asd&a=1&b=2
status: 200
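The pairing of conditions and operations described above can be modeled in plain Python. This is a sketch under stated assumptions: e_if, e_search, and e_set below are hypothetical stand-ins, the toy e_search supports only queries of the form field==value, and a dropped log entry is represented by None.

```python
def e_search(query):
    """Toy matcher: supports only the form 'field==value'."""
    field, _, expected = query.partition("==")
    return lambda event: str(event.get(field.strip())) == expected.strip()

def e_set(field, value):
    """Toy stand-in: return a copy of the event with `field` set to `value`."""
    return lambda event: {**event, field: value}

def e_if(*pairs):
    """Take condition/operation arguments in pairs and check every pair in order."""
    if len(pairs) % 2:
        raise ValueError("conditions and operations must be specified in pairs")
    def step(event):
        for cond, op in zip(pairs[0::2], pairs[1::2]):
            if event is None:   # the entry was dropped: skip the remaining pairs
                break
            if cond(event):     # otherwise fall through to the next pair
                event = op(event)
        return event
    return step

rule = e_if(e_search("status==200"), e_set("__topic__", "login_success_event"))

success = rule({"__topic__": "", "status": "200"})
failure = rule({"__topic__": "", "status": "400"})

print(success)
print(failure)
```

Only the entry whose status is 200 receives the new topic. The entry whose status is 400 passes through unchanged because its condition returns False.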
- Use the e_kv function.
The e_kv function extracts key-value pairs from one or more source fields. Parameters such as quote control how the values are parsed. For more information, see e_kv.
The following example shows how to use the e_kv function to extract key-value pairs from the http_host field:
e_kv("http_host")
When you preview the log entry, the key-value pairs are extracted from the http_host field.
__source__: log_service
__tag__:__receive_time__: 1559799897
__topic__:
body_bytes_sent: 740
client_ip: 1.2.3.4
host: m.abcd.com
http_host: m.abcd.com/s?q=asd&a=1&b=2
status: 200
q: asd
a: 1
b: 2
The e_regex, e_kv_delimit, and e_json functions can also extract log fields. For more information, see Value extraction functions.
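The key-value extraction shown for the http_host field can be approximated in plain Python with a regular expression. This is a rough sketch, not the Log Service implementation: the e_kv function below is a hypothetical stand-in, the pattern is an assumption chosen to fit the sample value, and quoting and name conflicts are not modeled.

```python
import re

# Assumed pattern for this sketch: a word-like key, an equal sign, and a
# value that runs up to the next "&" or whitespace character.
KV_PATTERN = re.compile(r"(\w+)=([^&\s]+)")

def e_kv(field):
    """Toy stand-in: copy every key=value pair found in `field` into the event."""
    def step(event):
        out = dict(event)
        for key, value in KV_PATTERN.findall(event.get(field, "")):
            out[key] = value
        return out
    return step

event = {"http_host": "m.abcd.com/s?q=asd&a=1&b=2", "status": "200"}
extracted = e_kv("http_host")(event)

print(extracted)
```

The source field is left in place and the extracted pairs are added next to it, which matches the preview shown above.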
- Advanced usage of functions
Combine the e_if, e_search, e_drop, and e_keep functions.
The following example shows how to retain the log entries whose status field value is 200 and drop the log entries whose status field value is 400:
- Raw log entries:
"""
Log entry 1
"""
__source__: log_service
__tag__:__receive_time__: 1559799897
__topic__:
body_bytes_sent: 740
client_ip: 1.2.3.4
host: m.abcd.com
http_host: m.abcd.com/s?q=asd&a=1&b=2
status: 200
"""
Log entry 2
"""
__source__: log_service
__tag__:__receive_time__: 1559799897
__topic__:
body_bytes_sent: 740
client_ip: 1.2.3.4
host: m.abcd.com
http_host: m.abcd.com/s?q=asd&a=1&b=2
status: 400
- Transformation rule:
e_if(e_search("status==400"),e_drop())
e_if(e_search("status==200"),e_keep())
- Transformation result:
"""
Log entry 1
"""
__source__: log_service
__tag__:__receive_time__: 1559799897
__topic__:
body_bytes_sent: 740
client_ip: 1.2.3.4
host: m.abcd.com
http_host: m.abcd.com/s?q=asd&a=1&b=2
status: 200
"""
Log entry 2 is dropped.
"""
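The way the two rules act on the two raw log entries can be traced with a small Python model. This is a sketch under stated assumptions: all functions are hypothetical stand-ins, the run helper is not part of the DSL, rules are assumed to be applied in the order in which they are written, and a dropped entry is represented by None.

```python
def e_search(query):
    """Toy matcher: supports only the form 'field==value'."""
    field, _, expected = query.partition("==")
    return lambda event: str(event.get(field.strip())) == expected.strip()

def e_drop():
    """Toy stand-in: discard the entry."""
    return lambda event: None

def e_keep():
    """Toy stand-in: pass the entry through unchanged."""
    return lambda event: event

def e_if(cond, op):
    """Toy stand-in with a single condition/operation pair."""
    return lambda event: op(event) if cond(event) else event

def run(rules, events):
    """Hypothetical helper: apply each rule in order and stop at the first drop."""
    results = []
    for event in events:
        for rule in rules:
            event = rule(event)
            if event is None:
                break
        if event is not None:
            results.append(event)
    return results

rules = [
    e_if(e_search("status==400"), e_drop()),
    e_if(e_search("status==200"), e_keep()),
]
raw = [
    {"client_ip": "1.2.3.4", "status": "200"},
    {"client_ip": "1.2.3.4", "status": "400"},
]

result = run(rules, raw)
print(result)  # only the entry whose status is 200 remains
```

In this model, the entry whose status is 400 is dropped by the first rule, so the second rule is never evaluated for it.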