
Simple Log Service: Get started with data transformation

Last Updated: Nov 14, 2023

This topic walks you through the data transformation feature by describing a complete transformation process. Website access logs are used as the example throughout.

Prerequisites

  • A project named web-project is created. For more information, see Create a project.

  • A Logstore named website_log is created in the web-project project, and the Logstore is used as the source Logstore. For more information, see Create a Logstore.

  • Website access logs are collected and stored in the website_log Logstore. For more information, see Data collection overview.

  • Destination Logstores are created in the web-project project. The following table describes the destination Logstores.

    Destination Logstore   Description
    website-success        Stores logs for successful access. Configured in the target-success storage destination.
    website-fail           Stores logs for failed access. Configured in the target-fail storage destination.
    website-etl            Stores all other access logs. Configured in the target0 storage destination.

  • If you use a Resource Access Management (RAM) user, make sure that the user is granted the permissions on data transformation. For more information, see Grant a RAM user the permissions to manage a data transformation job.

  • Indexes are configured for the source and destination Logstores. For more information, see Create indexes.

    Important

    Data transformation does not require indexes. However, if you do not configure indexes, you cannot perform query or analysis operations.

Background information

All access logs of a website are stored in a Logstore. You need to specify different topics for the logs to distinguish between logs for successful access and logs for failed access. In addition, you need to distribute the two types of logs to different Logstores for analysis. Sample log:

body_bytes_sent:1061
http_user_agent:Mozilla/5.0 (Windows; U; Windows NT 5.1; ru-RU) AppleWebKit/533.18.1 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5
remote_addr:192.0.2.2
remote_user:vd_yw
request_method:DELETE
request_uri:/request/path-1/file-5
status:207
time_local:10/Jun/2021:19:10:59
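Each record in the sample above is a set of key:value pairs, one field per line. The following Python sketch (illustrative only, not Simple Log Service code) shows how such a record can be parsed into a dictionary; note that only the first colon on each line separates the key from the value, because values such as time_local contain colons themselves:

```python
# Parse a raw "key:value" access-log record (as shown above) into a dict.
# Illustrative sketch only; field names and values come from the sample log.
raw_log = """\
body_bytes_sent:1061
http_user_agent:Mozilla/5.0 (Windows; U; Windows NT 5.1; ru-RU) AppleWebKit/533.18.1 (KHTML, like Gecko) Version/5.0.2 Safari/533.18.5
remote_addr:192.0.2.2
remote_user:vd_yw
request_method:DELETE
request_uri:/request/path-1/file-5
status:207
time_local:10/Jun/2021:19:10:59"""

def parse_log(raw: str) -> dict:
    fields = {}
    for line in raw.splitlines():
        key, _, value = line.partition(":")  # split on the first colon only
        fields[key] = value
    return fields

record = parse_log(raw_log)
print(record["status"])      # 207
print(record["time_local"])  # 10/Jun/2021:19:10:59
```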

Step 1: Create a data transformation job

  1. Log on to the Simple Log Service console.

  2. Go to the data transformation page.

    1. In the Projects section, click the project that you want to manage.

    2. On the Log Storage > Logstores tab, click the Logstore that you want to manage.

    3. On the query and analysis page, click Data Transformation.

  3. In the upper-right corner of the page, select a time range for the required log data.

    Make sure that the Raw Logs tab displays log data.

  4. In the code editor, enter transformation statements.

    e_if(e_search("status:[200,299]"),e_compose(e_set("__topic__","access_success_log"),e_output(name="target-success")))
    e_if(e_search("status:[400,499]"),e_compose(e_set("__topic__","access_fail_log"),e_output(name="target-fail")))

    The e_if function indicates that the specified operations are performed if the condition is met. For more information, see e_if.

    • Condition: e_search("status:[200,299]")

      If the value of the status field meets the condition, Operations 1 and 2 are performed. For more information, see e_search.

    • Operation 1: e_set("__topic__","access_success_log")

      The function adds the __topic__ field and assigns the value access_success_log to the field. For more information, see e_set.

    • Operation 2: e_output(name="target-success")

      The function sends the transformed data to the target-success storage destination, which is mapped to the website-success Logstore in the web-project project when you create the job. For more information, see e_output.
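The routing behavior of the two transformation statements can be approximated in plain Python. This is only a conceptual sketch of what the DSL does, not the actual Simple Log Service runtime: logs with a status in [200, 299] are tagged and routed to target-success, logs in [400, 499] go to target-fail, and everything else falls through to the default storage destination (target0 in this example):

```python
# Conceptual sketch of the two e_if statements above (not the actual DSL runtime).

def route(log: dict) -> str:
    status = int(log.get("status", 0))
    if 200 <= status <= 299:                        # e_search("status:[200,299]")
        log["__topic__"] = "access_success_log"     # e_set("__topic__", ...)
        return "target-success"                     # e_output(name="target-success")
    if 400 <= status <= 499:                        # e_search("status:[400,499]")
        log["__topic__"] = "access_fail_log"
        return "target-fail"
    return "target0"  # logs that match neither condition go to the default destination

print(route({"status": "207"}))  # target-success (207 is within [200, 299])
print(route({"status": "404"}))  # target-fail
print(route({"status": "503"}))  # target0
```

This also illustrates why the sample log shown earlier (status 207) would be tagged access_success_log and stored in the website-success Logstore.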

  5. Preview transformation results.

    1. Select Quick.

      You can select Quick or Advanced. For more information, see Preview mode overview.

    2. Click Preview Data.

      View the transformation results.

      Important

      During the preview, logs are written to a Logstore named internal-etl-log instead of the destination Logstores. The first time that you preview transformation results, Simple Log Service automatically creates the internal-etl-log Logstore in the current project. This is a dedicated Logstore: you cannot modify its configuration or write other data to it. You are not charged for this Logstore.

  6. Create a data transformation job.

    1. Click Save as Transformation Job.

    2. In the Create Data Transformation Job panel, configure the following parameters.

      Job Name

      The name of the data transformation job.

      Authorization Method

      The method used to authorize the data transformation job to read data from the source Logstore. Valid values:

      • Default Role: authorizes the data transformation job to assume the system role AliyunLogETLRole to read data from the source Logstore.

        Click You must authorize the system role AliyunLogETLRole, and then configure the parameters as prompted to complete the authorization. For more information, see Access data by using a default role.

        Note
        • If the authorization is already complete within your Alibaba Cloud account, or your account has already been granted the role, you can skip this operation.

      • Custom Role: authorizes the data transformation job to assume a custom role to read data from the source Logstore.

        You must grant the custom role the permissions to read from the source Logstore. Then, you must enter the Alibaba Cloud Resource Name (ARN) of the custom role in the Role ARN field. For more information about authorization, see Access data by using a custom role.

      • AccessKey Pair: authorizes the data transformation job to use the AccessKey pair of an Alibaba Cloud account or a RAM user to read data from the source Logstore.

        • Alibaba Cloud account: The AccessKey pair of an Alibaba Cloud account has permissions to read from the source Logstore. You can directly enter the AccessKey ID and AccessKey secret of the Alibaba Cloud account in the AccessKey ID and AccessKey Secret fields. For more information about how to obtain an AccessKey pair, see AccessKey pair.

        • RAM user: You must grant the RAM user the permissions to read from the source Logstore. Then, you can enter the AccessKey ID and AccessKey secret of the RAM user in the AccessKey ID and AccessKey Secret fields. For more information about authorization, see Access data by using AccessKey pairs.

      Storage Target

      Target Name

      The name of the storage destination. A storage destination consists of a Target Project and a Target Store.

      Make sure that the value of this parameter is the same as the value of the name parameter that is specified in the e_output function in Step 4.

      Note

      By default, Simple Log Service uses the storage destination that is numbered 1 to store the logs that do not meet the specified conditions. In this example, the target0 storage destination is used.

      Target Region

      The region of the project to which the destination Logstore belongs.

      If you want to perform data transformation across regions, we recommend that you use HTTPS for data transmission. This ensures the privacy of log data.

      For cross-region data transformation, data is transmitted over the Internet. If Internet connections are unstable, data transformation may be delayed. You can select DCDN Acceleration to accelerate cross-region data transmission. Before you select DCDN Acceleration, make sure that the global acceleration feature is enabled for the project. For more information, see Enable the global acceleration feature.

      Important

      If data is pulled over a public Simple Log Service endpoint, you are charged for read traffic over the Internet. The traffic is calculated based on the size of data after compression. For more information, see Billable items of pay-by-feature.

      Target Project

      The name of the project to which the destination Logstore belongs.

      Target Store

      The name of the destination Logstore.

      Authorization Method

      The method used to authorize the data transformation job to write transformed data to the destination Logstore. Valid values:

      • Default Role: authorizes the data transformation job to assume the system role AliyunLogETLRole to write transformed data to the destination Logstore.

        Click You must authorize the system role AliyunLogETLRole, and then configure the parameters as prompted to complete the authorization. For more information, see Access data by using a default role.

        Note
        • If you use a RAM user, you must use an Alibaba Cloud account to assign the AliyunLogETLRole role to the user.

        • If your Alibaba Cloud account has already been granted the role, you can skip this operation.

      • Custom Role: authorizes the data transformation job to assume a custom role to write transformed data to the destination Logstore.

        You must grant the custom role the permissions to write to the destination Logstore. Then, you must enter the ARN of the custom role in the Role ARN field. For more information about authorization, see Access data by using a custom role.

      • AccessKey Pair: authorizes the data transformation job to use the AccessKey pair of an Alibaba Cloud account or a RAM user to write transformed data to the destination Logstore.

        • Alibaba Cloud account: The AccessKey pair of an Alibaba Cloud account has permissions to write to the destination Logstore. You can directly enter the AccessKey ID and AccessKey secret of the Alibaba Cloud account in the AccessKey ID and AccessKey Secret fields. For more information about how to obtain an AccessKey pair, see AccessKey pair.

        • RAM user: You must grant the RAM user the permissions to write to the destination Logstore. Then, you can enter the AccessKey ID and AccessKey secret of the RAM user in the AccessKey ID and AccessKey Secret fields. For more information about authorization, see Access data by using AccessKey pairs.

      Processing Range

      Time Range

      The time range within which the data is transformed. Valid values:

      Note

      The value of Time Range is based on the time when logs are received.

      • All: transforms data in the source Logstore from the first log until the job is manually stopped.

      • From Specific Time: transforms data in the source Logstore from the log that is received at the specified start time until the job is manually stopped.

      • Within Specific Period: transforms data in the source Logstore from the log that is received at the specified start time to the log that is received at the specified end time.

    3. Click OK.
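The three Time Range options above can be thought of as filters on the receive time of each log. The following Python sketch (illustrative only; the timestamps and function are hypothetical, not a Simple Log Service API) summarizes the semantics:

```python
# Illustrative sketch of how the three Time Range options select logs by
# receive time (Unix timestamps). Not an actual Simple Log Service API.

def in_range(receive_time: int, mode: str, start: int = None, end: int = None) -> bool:
    if mode == "All":
        return True                    # from the first log until the job is stopped
    if mode == "From Specific Time":
        return receive_time >= start   # open-ended: runs until manually stopped
    if mode == "Within Specific Period":
        return start <= receive_time < end
    raise ValueError(f"unknown mode: {mode}")

# A log received at t = 1_623_300_000:
print(in_range(1_623_300_000, "All"))                                      # True
print(in_range(1_623_300_000, "From Specific Time", start=1_623_400_000))  # False
print(in_range(1_623_300_000, "Within Specific Period",
               start=1_623_200_000, end=1_623_350_000))                    # True
```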

After logs are distributed to the destination Logstores, you can perform query and analysis operations in the destination Logstores. For more information, see Query and analyze logs.

Step 2: View the data transformation job

  1. In the left-side navigation pane, choose Job Management > Data Transformation.

  2. In the list of data transformation jobs, find and click the job that you created.

  3. On the Data Transformation Overview page, view the details of the job.

    You can view the details and status of the job. You can also modify, start, stop, or delete the job. For more information, see Manage a data transformation job.
