This topic uses Alibaba Cloud Server Load Balancer (SLB) logs as an example to describe the data transformation procedure.

Prerequisites

Background information

The source Logstore named slb-log stores the network logs of SLB. The data transformation requirements and sample log entries are as follows:
  • Data transformation requirements
    • Set different topics for logs based on the value of the status field in the logs and distribute the logs to different Logstores.
    • Split the key-value pairs in the http_host field.
  • Sample log entries
    """
    Log entry 1
    """
    __source__:  log_service
    __tag__:__receive_time__:  1559799897
    __topic__:  
    body_bytes_sent:  740
    client_ip:  1.2.3.4
    host:  m.abcd.com
    http_host:  m.abcd.com/s? q=asd&a=1&b=2
    status:  200
    
    """
    Log entry 2
    """
    __source__:  log_service
    __tag__:__receive_time__:  1559799897
    __topic__:  
    body_bytes_sent:  740
    client_ip:  1.2.3.4
    host:  m.abcd.com
    http_host:  m.abcd.com/s? q=asd&a=1&b=2
    status:  400
    
    ...

Create a data transformation task

  1. Log on to the Log Service console.
  2. In the Projects section, click the target project.
  3. Go to the data transformation page.
    1. Choose Log Management > Logstores. On the Logstores tab, find the Logstore named slb-log, and click the > icon of the Logstore. Choose Data Transformation > Data Transformation.
    2. Click the plus sign (+) next to Data Transformation. The data transformation page appears.
  4. Select the time range for received logs on the page.
    Make sure that data is available on the data transformation page.
  5. Create a transformation rule.
    1. Enter the following rule in the text box:
      e_kv("http_host")
      e_if(e_search("status==200"),e_compose(e_set("__topic__","login_success_event"),e_output(name="200-target", project="project1", logstore="200_log")))
      e_if(e_search("status==400"),e_compose(e_set("__topic__","login_fail_event"),e_output(name="400-target", project="project1", logstore="400_log")))
      Rule description
      • The e_kv function extracts key-value pairs from the http_host field. For example, key-value pairs are extracted from the field value m.abcd.com/s? q=asd&a=1&b=2 in the following format:
        http_host: m.abcd.com/s? q=asd&a=1&b=2
        q: asd
        a: 1
        b: 2
      • The e_if(condition, operation) function performs an operation when the specified condition is met.
        • The e_search("status==200") function checks whether the value of the status field is 200.
        • The e_compose(operation1,operation2) function performs the two specified operations in sequence.

        The e_set function is called first to set the __topic__ field to login_success_event. Then, the e_output function is called to export the log entry to the specified destination Logstore. The e_output function also specifies the topics, sources, and tags of the exported logs. In a preview, the transformed data is written to a temporary Logstore instead of the destination Logstores, and the information about the destination Logstores is displayed.

      For more information about data transformation functions, see Data transformation syntax.

    2. Click Preview Data.
    3. On the Add Preview Configuration page, enter the AccessKey pair of the Alibaba Cloud account or RAM user that is authorized to access the source Logstore.
    4. View the transformation result on the data transformation page.
      • All log entries whose status field value is 200 are exported to the 200_log Logstore, and the __topic__ field in these log entries is set to login_success_event.
      • All log entries whose status field value is 400 are exported to the 400_log Logstore, and the __topic__ field in these log entries is set to login_fail_event.
  6. Save the transformation rule.
    1. Click Save as Transformation Rule.
    2. In the Create Data Transformation Rule dialog box, set the parameters as follows.
      Save transformation rule configurations
      • Set Target Name to the value of the name parameter in the e_output function. Set Target Project to the value of the project parameter in the e_output function. Set Target Logstore to the value of the logstore parameter in the e_output function. For more information about other parameters, see Create a data transformation task.
      • The e_output function writes transformed data to the destination Logstores.
      • You can include multiple operations in the e_compose function. For more information, see e_compose.
  7. View the data transformation task.
    1. In the left-side navigation pane, click Data Transformation to go to the Data Transformation page.
    2. Click the target data transformation task.
    3. On the Data Transformation Overview page, perform the following operations on the task based on your business requirements.
      • View, modify, stop, restart, or delete the task.
      • View the dashboard of data transformation tasks. For more information, see Data transformation dashboard.
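
The routing logic of the rule created in step 5 can be approximated locally. The following is a minimal sketch in plain Python, not the Log Service implementation: the ROUTES table and the route helper are hypothetical names, log entries are modelled as dictionaries, and key-value extraction is left out. The "(unrouted)" key is also an assumption of the sketch; how the service handles entries that match neither condition is not modelled here.

```python
from collections import defaultdict

# Hypothetical routing table: status value -> (topic, destination Logstore).
# The names mirror the rule in step 5; nothing here calls Log Service.
ROUTES = {
    "200": ("login_success_event", "200_log"),
    "400": ("login_fail_event", "400_log"),
}


def route(entries):
    """Group log entries by destination Logstore and set __topic__ on each."""
    outputs = defaultdict(list)
    for entry in entries:
        match = ROUTES.get(entry.get("status"))
        if match is None:
            # Assumption of this sketch: unmatched entries go under a placeholder key.
            outputs["(unrouted)"].append(dict(entry))
            continue
        topic, logstore = match
        routed = dict(entry)  # copy so that the input entry is not modified
        routed["__topic__"] = topic
        outputs[logstore].append(routed)
    return dict(outputs)


logs = [
    {"__topic__": "", "client_ip": "1.2.3.4", "status": "200"},
    {"__topic__": "", "client_ip": "1.2.3.4", "status": "400"},
]
result = route(logs)
for logstore, items in result.items():
    print(logstore, [item["__topic__"] for item in items])
```

Keeping the status-to-destination mapping in one table makes it easy to see that each status value maps to exactly one topic and one Logstore, which is what the two e_if lines in the rule express.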

Functions

  • Basic usage of functions
    • Use the e_set function.
      The e_set function adds a field to a log entry or updates the value of an existing field. For more information, see e_set.
      • The following example shows how to add a new field named province to a log entry:
        e_set("province","Shanghai")
        The province field is included in the log entry, as shown in the following preview:
        __source__:  log_service
        __tag__:__receive_time__:  1559799897
        __topic__:  
        body_bytes_sent:  740
        client_ip:  1.2.3.4
        host:  m.abcd.com
        http_host:  m.abcd.com/s? q=asd&a=1&b=2
        status:  200
        province: Shanghai
      • The following example shows how to use the v function to copy the value of the status field and assign the value to a new field named result:
        e_set("result",v("status"))
        The result field is included in the log entry and its value is the same as that of the status field, as shown in the following preview:
        __source__:  log_service
        __tag__:__receive_time__:  1559799897
        __topic__:  
        body_bytes_sent:  740
        client_ip:  1.2.3.4
        host:  m.abcd.com
        http_host:  m.abcd.com/s? q=asd&a=1&b=2
        status:  200
        result: 200
    • Use the e_rename function.
      The e_rename function renames a field. For more information, see e_rename.
      • The following example shows how to rename the client_ip field to ip:
        e_rename("client_ip","ip")
        The client_ip field is renamed to ip, as shown in the following preview:
        __source__:  log_service
        __tag__:__receive_time__:  1559799897
        __topic__:  
        body_bytes_sent:  740
        ip:  1.2.3.4
        host:  m.abcd.com
        http_host:  m.abcd.com/s? q=asd&a=1&b=2
        status:  200
      • The following example shows how to rename the client_ip field to ip and rename the body_bytes_sent field to bytes_content:
        e_rename("client_ip","ip","body_bytes_sent","bytes_content")
        The client_ip field is renamed to ip and the body_bytes_sent field is renamed to bytes_content, as shown in the following preview:
        __source__:  log_service
        __tag__:__receive_time__:  1559799897
        __topic__:  
        bytes_content:  740
        ip:  1.2.3.4
        host:  m.abcd.com
        http_host:  m.abcd.com/s? q=asd&a=1&b=2
        status:  200
    • Use the e_keep_fields and e_drop_fields functions.

      The e_keep_fields function retains the fields that meet the specified conditions. For more information, see e_keep_fields.

      The e_drop_fields function deletes the fields that meet the specified conditions. For more information, see e_drop_fields.

      • The following example shows how to use the e_keep_fields function to retain the status field:
        Note: You cannot drop the __source__ field or the __topic__ field.
        e_keep_fields("status")
        The status field is retained in the log entry, as shown in the following preview:
        __source__:  
        __topic__:  
        status:  200
      • The following example shows how to use the e_drop_fields function to drop the http_host field:
        e_drop_fields("http_host")
        The http_host field is dropped from the log entry, as shown in the following preview:
        __source__:  log_service
        __tag__:__receive_time__:  1559799897
        __topic__:  
        body_bytes_sent:  740
        client_ip:  1.2.3.4
        host:  m.abcd.com
        status:  200
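
The effect of these basic functions can be reproduced on a plain dictionary to check what a rule should produce before it is previewed in the console. The following is a minimal sketch that assumes a log entry is a dictionary of strings. The helpers set_field, rename_fields, keep_fields, and drop_fields are hypothetical stand-ins written for this illustration, not the Log Service functions themselves. The protected parameter models the note about __source__ and __topic__ and keeps their values unchanged, which is a simplification.

```python
def set_field(entry, name, value):
    """Return a copy of entry with the field added or overwritten."""
    updated = dict(entry)
    updated[name] = value
    return updated


def rename_fields(entry, *pairs):
    """Rename fields given as old1, new1, old2, new2, ...; absent fields are ignored."""
    if len(pairs) % 2:
        raise ValueError("expected pairs of old and new field names")
    mapping = dict(zip(pairs[0::2], pairs[1::2]))
    # Rebuilding the dictionary keeps each renamed field in its original position.
    return {mapping.get(key, key): value for key, value in entry.items()}


def keep_fields(entry, *names, protected=("__source__", "__topic__")):
    """Keep only the named fields plus the protected ones."""
    wanted = set(names) | set(protected)
    return {key: value for key, value in entry.items() if key in wanted}


def drop_fields(entry, *names):
    """Remove the named fields; all other fields are left untouched."""
    return {key: value for key, value in entry.items() if key not in names}


entry = {
    "__source__": "log_service",
    "__topic__": "",
    "body_bytes_sent": "740",
    "client_ip": "1.2.3.4",
    "host": "m.abcd.com",
    "http_host": "m.abcd.com/s? q=asd&a=1&b=2",
    "status": "200",
}

print(set_field(entry, "province", "Shanghai"))
print(set_field(entry, "result", entry["status"]))  # plays the role of v("status")
print(rename_fields(entry, "client_ip", "ip", "body_bytes_sent", "bytes_content"))
print(keep_fields(entry, "status"))
print(drop_fields(entry, "http_host"))
```

Each helper returns a new dictionary instead of modifying its input, so the same sample entry can be reused for every example.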
  • Intermediate usage of functions
    • Use the e_if and e_search functions.

      The e_if function checks the specified conditions in sequence and performs the operation that is paired with each condition that is met. If a condition is not met, the corresponding operation is skipped, and the function continues to check the next condition. If an operation deletes a log entry, no subsequent operations are performed on the log entry.

      The e_search function uses a query string to check whether a field value in a log entry meets a specified condition. True or False is returned.

      If the field value meets the specified condition, the corresponding operation is performed. The following function checks whether the value of the status field is 200. If so, the __topic__ field is set to login_success_event:
      e_if(e_search("status==200"), e_set("__topic__", "login_success_event"))
      The __topic__ field is set to login_success_event, as shown in the following preview:
      __source__:  log_service
      __tag__:__receive_time__:  1559799897
      __topic__:  login_success_event
      body_bytes_sent:  740
      client_ip:  1.2.3.4
      host:  m.abcd.com
      http_host:  m.abcd.com/s? q=asd&a=1&b=2
      status:  200
    • Use the e_kv function.

      The e_kv function extracts key-value pairs from one or more source fields and can be adjusted by using parameters such as quote. For more information, see e_kv.

      The following example shows how to use the e_kv function to extract key-value pairs from the http_host field:
      e_kv("http_host")
      Key-value pairs are extracted from the http_host field, as shown in the following preview:
      __source__:  log_service
      __tag__:__receive_time__:  1559799897
      __topic__:  
      body_bytes_sent:  740
      client_ip:  1.2.3.4
      host:  m.abcd.com
      http_host:  m.abcd.com/s? q=asd&a=1&b=2
      status:  200
      q: asd
      a: 1
      b: 2

      In addition to the e_kv function, the e_regex, e_kv_delimit, and e_json functions also extract log fields. For more information, see Value extraction functions.
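
Conditional execution and key-value extraction can be illustrated with a small local model. The following is a minimal sketch under stated assumptions: extract_kv, matches, and apply_if are hypothetical helpers, the regular expression only recognizes pairs made of word characters, and matches only understands queries of the form field==value. The real functions support far more syntax than this.

```python
import re

# Assumption: a key and a value are both runs of word characters joined by "=".
KV_PATTERN = re.compile(r"(\w+)=(\w+)")


def extract_kv(entry, field):
    """Return a copy of entry with each key=value pair in entry[field] added as a field."""
    updated = dict(entry)
    for key, value in KV_PATTERN.findall(entry.get(field, "")):
        updated[key] = value
    return updated


def matches(entry, query):
    """Evaluate a query of the form 'field==value' (the only form this sketch supports)."""
    field, separator, expected = query.partition("==")
    if not separator:
        raise ValueError("only 'field==value' queries are supported in this sketch")
    return entry.get(field.strip()) == expected.strip()


def apply_if(entry, query, operation):
    """Apply operation to entry when the query matches; otherwise return entry unchanged."""
    return operation(entry) if matches(entry, query) else entry


entry = {
    "__topic__": "",
    "http_host": "m.abcd.com/s? q=asd&a=1&b=2",
    "status": "200",
}
with_kv = extract_kv(entry, "http_host")
tagged = apply_if(
    with_kv,
    "status==200",
    lambda e: {**e, "__topic__": "login_success_event"},
)
print(tagged)
```

Passing the operation as a function mirrors the way e_if receives an operation as its second argument: the operation runs only when the condition holds.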

  • Advanced usage of functions

    Combine the e_if, e_search, e_drop, and e_keep functions.

    The following example shows how to retain the log entries whose status value is 200 and drop the log entries whose status value is 400:

    • Raw log entries
      """
      Log entry 1
      """
      __source__:  log_service
      __tag__:__receive_time__:  1559799897
      __topic__:  
      body_bytes_sent:  740
      client_ip:  1.2.3.4
      host:  m.abcd.com
      http_host:  m.abcd.com/s? q=asd&a=1&b=2
      status:  200
      
      
      """
      Log entry 2
      """
      __source__:  log_service
      __tag__:__receive_time__:  1559799897
      __topic__:  
      body_bytes_sent:  740
      client_ip:  1.2.3.4
      host:  m.abcd.com
      http_host:  m.abcd.com/s? q=asd&a=1&b=2
      status:  400
    • Transformation rule:
      e_if(e_search("status==400"),e_drop())
      e_if(e_search("status==200"),e_keep())
    • Result:
      """
      Log entry 1
      """
      __source__:  log_service
      __tag__:__receive_time__:  1559799897
      __topic__:  
      body_bytes_sent:  740
      client_ip:  1.2.3.4
      host:  m.abcd.com
      http_host:  m.abcd.com/s? q=asd&a=1&b=2
      status:  200
      
      """
      The second log entry is dropped.
      """
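
The filtering behavior of the two rules can be checked with a short local model. The following is a minimal sketch in plain Python, assuming log entries are dictionaries. The transform generator is a hypothetical helper, and passing through entries that match neither rule is an assumption of the sketch.

```python
def transform(entries):
    """Yield the entries that survive the two rules, applied in order."""
    for entry in entries:
        status = entry.get("status")
        if status == "400":
            # Models e_drop(): the entry is discarded and no later rule is checked.
            continue
        if status == "200":
            # Models e_keep(): the entry is retained.
            yield entry
            continue
        # Assumption: entries that match neither rule pass through unchanged.
        yield entry


logs = [
    {"client_ip": "1.2.3.4", "status": "200"},
    {"client_ip": "1.2.3.4", "status": "400"},
]
kept = list(transform(logs))
print(kept)
```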