This topic walks you through a complete data transformation process, from creating a task to viewing it, and introduces the related operations. Website access logs are used as the example.

Prerequisites

  • A project named web-project is created. For more information, see Create a project.
  • A Logstore named website_log is created in the web-project project, and the Logstore is used as the source Logstore. For more information, see Create a Logstore.
  • Website access logs are collected and stored in the website_log Logstore. For more information, see Data collection methods.
  • Destination Logstores are created in the web-project project. The following table lists the details about the destination Logstores.
    Destination Logstore   Description
    website-success        Logs for successful access are stored in the website-success Logstore, which is configured in the target-success storage destination.
    website-fail           Logs for failed access are stored in the website-fail Logstore, which is configured in the target-fail storage destination.
    website-etl            Other access logs are stored in the website-etl Logstore, which is configured in the target0 storage destination.
  • If you use a Resource Access Management (RAM) user, the user is granted the permissions to perform related operations for data transformation. For more information, see Authorize a RAM user to manage a data transformation task.
  • Indexes are configured for the source and destination Logstores. For more information, see Configure indexes.

    Data transformation does not require indexes. However, if you do not configure indexes, you cannot perform query or analysis operations.

Background information

All access logs of a website are stored in one Logstore. In this example, you assign different topics to the logs to distinguish logs for successful access from logs for failed access, and you distribute the two types of logs to different Logstores for analysis. The following is a sample log:
body_bytes_sent:1061
http_user_agent:Mozilla/5.0 (Windows; U; Windows NT 5.1; ru-RU) AppleWebKit/533.18.1 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5
remote_addr:192.0.2.2
remote_user:vd_yw
request_method:DELETE
request_uri:/request/path-1/file-5
status:207
time_local:10/Jun/2021:19:10:59
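
If you want to experiment with the sample outside the console, you can load it into a dictionary. The following Python sketch is for illustration only. The helper name parse_sample_log is hypothetical and is not part of Log Service. The sketch assumes that each line has the form key:value and splits each line at the first colon.

```python
SAMPLE_LOG = """\
body_bytes_sent:1061
http_user_agent:Mozilla/5.0 (Windows; U; Windows NT 5.1; ru-RU) AppleWebKit/533.18.1 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5
remote_addr:192.0.2.2
remote_user:vd_yw
request_method:DELETE
request_uri:/request/path-1/file-5
status:207
time_local:10/Jun/2021:19:10:59
"""


def parse_sample_log(text: str) -> dict:
    """Hypothetical helper: split each non-empty line at its first colon."""
    event = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # partition() splits only at the first colon, so values that
        # contain colons (such as time_local) stay intact.
        key, _, value = line.partition(":")
        event[key.strip()] = value.strip()
    return event


event = parse_sample_log(SAMPLE_LOG)
print(event["status"])
print(event["time_local"])
```

All values are kept as strings, so a field such as status must be converted before it is compared with a number.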

Step 1: Create a data transformation task

  1. Log on to the Log Service console.
  2. Go to the data transformation page.
    1. In the Projects section, click the name of the project that you want to view.
    2. Choose Log Storage > Logstores. On the Logstores tab, click the Logstore that you want to view.
    3. On the query and analysis page, click Data Transformation.
  3. In the upper-right corner of the page, select a time range for the required log data.
    Make sure that the Raw Logs tab displays log data.
  4. In the editor, enter transformation statements.
    e_if(e_search("status:[200,299]"),e_compose(e_set("__topic__","access_success_log"),e_output(name="target-success")))
    e_if(e_search("status:[400,499]"),e_compose(e_set("__topic__","access_fail_log"),e_output(name="target-fail")))
    The e_if function performs the specified operations if the condition is met. For more information, see e_if. The following items describe the first statement. The second statement works in the same way for logs whose status value is in the range of 400 to 499.
    • Condition: e_search("status:[200,299]")

      If the value of the status field meets the condition, Operations 1 and 2 are performed. For more information, see e_search.

    • Operation 1: e_set("__topic__","access_success_log")

      The function adds the __topic__ field and assigns the value access_success_log to the field. For more information, see e_set.

    • Operation 2: e_output(name="target-success")

      The function sends the transformed data to the target-success storage destination. When you save the transformation rule, you map this storage destination to the website-success Logstore in the web-project project. For more information, see Event processing functions.

  5. Preview transformation results.
    1. Select Quick.
      You can select either Quick or Advanced. For more information, see Configure preview modes.
    2. Click Preview Data.
      View the results.
      Note During the preview, logs are written to a Logstore named internal-etl-log instead of the destination Logstores. The first time that you preview transformation results, Log Service automatically creates the internal-etl-log Logstore in the current project. This is a dedicated Logstore: you cannot modify its configurations or write other data to it. You are not charged for this Logstore.
  6. Create a data transformation task.
    1. Click Save as Transformation Rule.
    2. In the Create Data Transformation Rule panel, configure the following parameters.
      Parameter Description
      Rule Name The name of the transformation rule.
      Authorization Method The method used to authorize the data transformation task to read data from the source Logstore. Valid values:
      • Default Role: authorizes the data transformation task to assume the system role AliyunLogETLRole to read data from the source Logstore.

        Click "You must authorize the system role AliyunLogETLRole", and then configure other parameters as prompted to complete the authorization. For more information, see Authorize Log Service to assume a system role.

        Note
        • If you use a RAM user, you must use an Alibaba Cloud account to assign the AliyunLogETLRole role to the user.
        • If you use an Alibaba Cloud account that has assumed the role, you can skip this operation.
      • Custom Role: authorizes the data transformation task to assume a custom role to read data from the source Logstore.

        You must grant the custom role the permissions to read from the source Logstore. Then, you must enter the Alibaba Cloud Resource Name (ARN) of the custom role in the Role ARN field. For more information about authorization, see Authorize Log Service to assume a custom role.

      • AccessKey Pair: authorizes the data transformation task to use the AccessKey pair of an Alibaba Cloud account or a RAM user to read data from the source Logstore.
        • Alibaba Cloud account: The AccessKey pair of an Alibaba Cloud account has permissions to read from the source Logstore. You can directly enter the AccessKey ID and AccessKey secret of the Alibaba Cloud account in the AccessKey ID and AccessKey Secret fields. For more information about how to obtain an AccessKey pair, see AccessKey pair.
        • RAM user: You must grant the RAM user the permissions to read from the source Logstore. Then, you can enter the AccessKey ID and AccessKey secret of the RAM user in the AccessKey ID and AccessKey Secret fields. For more information about authorization, see Configure an AccessKey pair for a RAM user to access a source Logstore and a destination Logstore.
      Storage Target
      Target Name The name of the storage destination. Storage Target includes Target Project and Target Logstore.
      Make sure that the value of this parameter is the same as the value of the name parameter that you specified for e_output in substep 4.
      Note By default, Log Service uses the storage destination that is numbered 1 to store the logs that do not meet the specified conditions. In this example, set Target Name of that storage destination to target0.
      Target Region The region of the project to which the destination Logstore belongs.

      If you want to perform data transformation across regions, we recommend that you use HTTPS for data transmission. This ensures the privacy of log data.

      For cross-region data transformation, the data is transmitted over the Internet. If the Internet connections are unstable, data transformation latency may occur. You can select DCDN Acceleration to accelerate the cross-region data transmission. Before you can select DCDN Acceleration, make sure that the global acceleration feature is enabled for the project. For more information, see Enable the global acceleration feature.

      Note You are charged for the Internet traffic that is generated when compressed data is transmitted across regions. For more information, see Billable items.
      Target Project The name of the project to which the destination Logstore belongs.
      Target Logstore The name of the destination Logstore.
      Authorization Method The method used to authorize the data transformation task to write transformed data to the destination Logstore. Valid values:
      • Default Role: authorizes the data transformation task to assume the system role AliyunLogETLRole to write transformed data to the destination Logstore.
        Click "You must authorize the system role AliyunLogETLRole", and then configure other parameters as prompted to complete the authorization. For more information, see Authorize Log Service to assume a system role.
        Note
        • If you use a RAM user, you must use an Alibaba Cloud account to assign the AliyunLogETLRole role to the user.
        • If you use an Alibaba Cloud account that has assumed the role, you can skip this operation.
      • Custom Role: authorizes the data transformation task to assume a custom role to write transformed data to the destination Logstore.

        You must grant the custom role the permissions to write to the destination Logstore. Then, you must enter the ARN of the custom role in the Role ARN field. For more information about authorization, see Authorize Log Service to assume a custom role.

      • AccessKey Pair: authorizes the data transformation task to use the AccessKey pair of an Alibaba Cloud account or a RAM user to write transformed data to the destination Logstore.
        • Alibaba Cloud account: The AccessKey pair of an Alibaba Cloud account has permissions to write to the destination Logstore. You can directly enter the AccessKey ID and AccessKey secret of the Alibaba Cloud account in the AccessKey ID and AccessKey Secret fields. For more information about how to obtain an AccessKey pair, see AccessKey pair.
        • RAM user: You must grant the RAM user the permissions to write to the destination Logstore. Then, you can enter the AccessKey ID and AccessKey secret of the RAM user in the AccessKey ID and AccessKey Secret fields. For more information about authorization, see Configure an AccessKey pair for a RAM user to access a source Logstore and a destination Logstore.
      Processing Range
      Time Range The time range within which the data is transformed. The time range is based on the time when logs are received. Valid values:
      • All: transforms data in the source Logstore from the first log entry until the task is manually stopped.
      • From Specific Time: transforms data in the source Logstore from the log entry that is received at the specified start time until the task is manually stopped.
      • Within Specific Period: transforms data in the source Logstore from the log entry that is received at the specified start time to the log entry that is received at the specified end time.
    3. Click OK.

After logs are distributed to the destination Logstores, you can perform query and analysis operations on the destination Logstores. For more information, see Query logs.
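
To reason about which destination Logstore a given log should appear in, you can sketch the routing outside Log Service in plain Python. The following sketch is not the data transformation syntax and does not call any Log Service API. The function route_event and the TARGETS mapping are hypothetical names introduced for illustration. The sketch assumes that status:[200,299] and status:[400,499] are inclusive numeric ranges, that a log sent by e_output is not processed by later statements, and that logs matching neither condition reach the default storage destination target0 unchanged.

```python
# Storage destinations from the prerequisites: target name -> destination Logstore.
TARGETS = {
    "target-success": "website-success",
    "target-fail": "website-fail",
    "target0": "website-etl",  # default destination for all other logs
}


def route_event(event: dict) -> tuple:
    """Return (target name, transformed event) for one log.

    Mimics the two e_if statements: set __topic__ and choose a target
    when the status code falls in the matching range (assumed inclusive).
    """
    out = dict(event)  # copy so that the caller's event is not modified
    try:
        status = int(out.get("status", ""))
    except ValueError:
        return "target0", out  # missing or non-numeric status matches neither condition

    if 200 <= status <= 299:
        out["__topic__"] = "access_success_log"
        return "target-success", out
    if 400 <= status <= 499:
        out["__topic__"] = "access_fail_log"
        return "target-fail", out
    return "target0", out


for status in ("207", "404", "500"):
    target, transformed = route_event({"status": status})
    print(status, "->", TARGETS[target], transformed.get("__topic__"))
```

Under these assumptions, the sample log in the Background information section (status 207) goes to the website-success Logstore with the topic access_success_log. A log with status 500 has no topic set and goes to website-etl.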

Step 2: View the data transformation task

  1. In the left-side navigation pane, choose Jobs > Data Transformation.
  2. In the data transformation task list, find and click the task.
  3. On the Data Transformation Overview page, view the details of the task.

    You can view the details and status of the task. You can also modify, start, stop, or delete the task. For more information, see Manage a data transformation task.
