You can use Log Service to collect log data and ship the log data to Object Storage Service (OSS) for storage and analysis. This topic describes how to ship log data from Log Service to OSS.

Background information

Log Service can automatically ship log data from a Logstore to an OSS bucket.
  • You can specify a custom retention period for the log data in the OSS bucket. This allows you to store the log data for a longer period than in the Logstore.
  • You can use data processing platforms, such as E-MapReduce and Data Lake Analytics (DLA), or use custom programs to consume log data from the OSS bucket.

Ship log data

  1. Log on to the Log Service console.
  2. In the Projects section, click the name of the project that you want to view.
  3. Choose Log Storage > Logstores. On the Logstores tab, click the > icon of the Logstore. Then, choose Data Transformation > Export > Object Storage Service (OSS).
  4. On the OSS Shipper page, click Enable.
  5. In the Shipping Notes dialog box, click Ship.
  6. In the OSS LogShipper panel, configure a shipping rule to ship log data to the OSS bucket. Click OK.
    The following list describes the parameters.
    • OSS Shipper Name: The name of the shipping rule. The name must be 2 to 128 characters in length, and can contain only lowercase letters, digits, hyphens (-), and underscores (_). The name must start and end with a lowercase letter or a digit.
    • OSS Bucket: The name of the OSS bucket to which you want to ship log data.
      Notice You must specify the name of an existing OSS bucket. The OSS bucket must reside in the same region as the Log Service project.
    • OSS Prefix: The directory in the OSS bucket to which log data is shipped.
    • Shard Format: The partition format of the directory for a shipping task. The directory is generated based on the time when the shipping task is created. The default format is %Y/%m/%d/%H/%M. The value of this parameter cannot start with a forward slash (/). For partition format examples, see Partition format. For information about the parameters of partition formats, see strptime API.
    • RAM Role: The Alibaba Cloud Resource Name (ARN) of the RAM role that the OSS bucket owner creates for access control. Example: acs:ram::45643:role/aliyunlogdefaultrole. For more information, see How do I obtain the ARN of a RAM role?
    • Shipping Size: The maximum size of raw log data that a shard can ship to the OSS bucket in a single shipping task. Valid values: 5 to 256. Unit: MB. If the size of the data to be shipped exceeds this value, another shipping task is automatically created.
    • Storage Format: The storage format of log data in OSS. The shipped data can be stored in the JSON, CSV, or Parquet format. For more information, see CSV format.
    • Compress: Specifies whether to compress log data that is shipped to OSS. Valid values:
      • No Compress: Log data that is shipped to OSS is not compressed.
      • Compress (snappy): Log data is compressed with Snappy before it is shipped to OSS, which reduces the storage space that is occupied in the OSS bucket.
    • Ship Tags: Specifies whether to ship log tags.
    • Shipping Time: The maximum period during which a single shipping task of a shard can collect data. Valid values: 300 to 900. Default value: 300. Unit: seconds. If this period elapses, another shipping task is automatically created.

    Notice
    • After you configure a shipping rule, multiple shipping tasks can concurrently run. If the size of data that needs to be shipped from a shard reaches the specified size or the specified period ends, another shipping task is created.
    • After you create a shipping task, you can check whether the shipping rule meets your business requirements based on the task status and the data that is shipped to OSS.
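The two conditions above, reaching the Shipping Size limit or the end of the Shipping Time period, can be sketched as a simple predicate. The function name and defaults here are illustrative, taken from the parameter descriptions (256 MB maximum size, 300-second default period); this is not Log Service's actual implementation:

```python
def should_cut_task(accumulated_mb, elapsed_seconds,
                    shipping_size_mb=256, shipping_time_s=300):
    """Return True when another shipping task should be created, that is,
    when the accumulated raw data reaches the size limit or the current
    task has been collecting data for the full shipping period."""
    return accumulated_mb >= shipping_size_mb or elapsed_seconds >= shipping_time_s
```

For example, `should_cut_task(256, 10)` returns True because the size limit is reached even though only 10 seconds have elapsed.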

View OSS data

After log data is shipped to OSS, you can view the log data in the OSS console. You can also view the log data by using other methods, such as the OSS API or SDK. For more information, see Manage objects.

The following script shows a sample OSS directory:
oss://OSS-BUCKET/OSS-PREFIX/PARTITION-FORMAT_RANDOM-ID
  • OSS-BUCKET: the name of the OSS bucket.
  • OSS-PREFIX: the name prefix of the directory in the OSS bucket.
  • PARTITION-FORMAT: the partition format of the directory for a shipping task, generated based on the time when the shipping task was created. For more information, see strptime API.
  • RANDOM-ID: the unique identifier of a shipping task.
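A minimal helper to split such a path into its bucket and key, assuming nothing beyond the oss:// URI shape shown above:

```python
def split_oss_uri(uri):
    """Split an oss://OSS-BUCKET/OSS-PREFIX/... URI into (bucket, key).
    Sketch only: the key of a shipped object is the OSS prefix followed
    by the partition-formatted directory and the _RANDOM-ID suffix."""
    if not uri.startswith("oss://"):
        raise ValueError("not an OSS URI: " + uri)
    bucket, _, key = uri[len("oss://"):].partition("/")
    return bucket, key
```

For example, splitting `oss://test-bucket/test-table/2017/01/20/19/50_1484913043351525351_2850008` yields the bucket `test-bucket` and the remainder as the object key.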
Note Directories in an OSS bucket are named after the time when each shipping task is created, not after the time when the log data is written to Log Service. For example, assume that log data is shipped at an interval of 5 minutes and a shipping task was created at 00:00:00 on June 23, 2016. This task ships the data that was written to Log Service after 23:55:00 on June 22, 2016. Therefore, to retrieve all logs that were written on June 22, 2016, you must check all the objects in the 2016/06/22 directory, and also check whether the 2016/06/23/00/ directory contains objects that were generated in the first 10 minutes of June 23, 2016.
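The check described in the note can be sketched as a helper that returns the prefixes to scan for one day's logs. The prefix name test-table and the one-hour spillover window are illustrative assumptions; adjust them to your Shard Format and shipping interval:

```python
from datetime import date, timedelta

def prefixes_to_scan(day, oss_prefix="test-table"):
    """Return the OSS directory prefixes that may hold logs of `day` when
    the Shard Format is %Y/%m/%d/%H/%M: the day's own directory plus the
    first hour of the next day, whose early tasks can still contain data
    that was written before midnight."""
    nxt = day + timedelta(days=1)
    return [f"{oss_prefix}/{day:%Y/%m/%d}/",
            f"{oss_prefix}/{nxt:%Y/%m/%d}/00/"]
```

For June 22, 2016, this returns the `test-table/2016/06/22/` prefix and the `test-table/2016/06/23/00/` prefix for the spillover objects.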

Partition format

For each shipping task, log data is written to a directory of an OSS bucket. The directory is in the oss://OSS-BUCKET/OSS-PREFIX/PARTITION-FORMAT_RANDOM-ID format. A partition format is generated by formatting the time when a shipping task is created. The following table describes the partition formats and directories that are generated when a shipping task was created at 19:50:43 on January 20, 2017.
In all of the following examples, the OSS bucket is test-bucket.
  • Partition format: %Y/%m/%d/%H/%M (OSS prefix: test-table)
    OSS directory: oss://test-bucket/test-table/2017/01/20/19/50_1484913043351525351_2850008
  • Partition format: year=%Y/mon=%m/day=%d/log_%H%M%S (OSS prefix: log_ship_oss_example)
    OSS directory: oss://test-bucket/log_ship_oss_example/year=2017/mon=01/day=20/log_195043_1484913043351525351_2850008.parquet
  • Partition format: ds=%Y%m%d/%H (OSS prefix: log_ship_oss_example)
    OSS directory: oss://test-bucket/log_ship_oss_example/ds=20170120/19_1484913043351525351_2850008.snappy
  • Partition format: %Y%m%d/ (OSS prefix: log_ship_oss_example)
    OSS directory: oss://test-bucket/log_ship_oss_example/20170120/_1484913043351525351_2850008
    Note If you use this format, platforms such as Hive may fail to parse the log data in the OSS bucket. We recommend that you do not use this format.
  • Partition format: %Y%m%d%H (OSS prefix: log_ship_oss_example)
    OSS directory: oss://test-bucket/log_ship_oss_example/2017012019_1484913043351525351_2850008
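The directory names in the table can be reproduced by applying strftime-style formatting to the task creation time. This sketch uses Python's datetime.strftime, which accepts the same format parameters:

```python
from datetime import datetime

# Task creation time from the examples: 19:50:43 on January 20, 2017.
created = datetime(2017, 1, 20, 19, 50, 43)

# Each partition format applied to the creation time yields the directory
# part of the OSS path, before the _RANDOM-ID suffix is appended.
formats = ["%Y/%m/%d/%H/%M",
           "year=%Y/mon=%m/day=%d/log_%H%M%S",
           "ds=%Y%m%d/%H",
           "%Y%m%d%H"]
directories = [created.strftime(f) for f in formats]
print(directories)
```

The list printed by this snippet matches the formatted directory segments shown in the examples above.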

You can use big data platforms such as Hive, MaxCompute, or DLA to analyze OSS data. If you want these platforms to use the partition information, set each level of PARTITION-FORMAT in the key=value format. For example, the partition format year=%Y/mon=%m/day=%d/log_%H%M%S generates a directory such as oss://test-bucket/log_ship_oss_example/year=2017/mon=01/day=20/log_195043_1484913043351525351_2850008.parquet, in which year, mon, and day are recognized as partition keys.

Related operations

After shipping tasks are created based on a shipping rule, you can modify the shipping rule on the OSS Shipper page. You can also disable the data shipping feature, view the statuses and error messages of the tasks, and retry failed tasks.
  • Modify the shipping rule.

    Click Settings to modify the shipping rule. For information about parameters, see Ship log data.

  • Disable the data shipping feature.

    Click Disable. The data in the Logstore is no longer shipped to OSS.

  • View the statuses and error messages of shipping tasks.

    You can view the shipping tasks of the previous two days and the statuses of the tasks.

    • Statuses of a shipping task
      • Succeeded: The shipping task succeeded.
      • Running: The shipping task is running. Check the task status again later.
      • Failed: The shipping task failed and cannot be restarted due to external causes. Troubleshoot the failure based on the error message and then retry the task.
    • Error messages
      If a shipping task fails, an error message is returned for the task.
      • UnAuthorized: The AliyunLogDefaultRole role does not have the required permissions. To fix the error, check the following configurations:
        • Check whether the AliyunLogDefaultRole role is created for the Alibaba Cloud account to which the destination OSS bucket belongs.
        • Check whether the Alibaba Cloud account ID that is configured in the permission policy of the role is valid.
        • Check whether the AliyunLogDefaultRole role is granted the write permissions on the destination OSS bucket.
        • Check whether the ARN of the AliyunLogDefaultRole role that you entered in the RAM Role field is valid.
      • ConfigNotExist: The shipping task does not exist. This error usually occurs because the data shipping feature is disabled. Enable the data shipping feature, configure a shipping rule, and then retry the task.
      • InvalidOssBucket: The specified OSS bucket does not exist. To fix the error, check the following configurations:
        • Check whether the destination OSS bucket resides in the same region as the Log Service project.
        • Check whether the specified bucket name is valid.
      • InternalServerError: An internal error occurred in Log Service. Retry the failed shipping task.
    • Retry a shipping task

      By default, if a shipping task fails, Log Service retries the task based on the retry policy. You can also manually retry the task. By default, Log Service retries all failed tasks of the previous two days. The minimum interval between two consecutive retries is 15 minutes: the first retry starts 15 minutes after the task fails, the second retry starts 30 minutes after the task fails again, the third retry starts 60 minutes after that, and the interval continues to double for subsequent retries.
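      The retry schedule described above (15, 30, 60 minutes, doubling each time) can be expressed as follows. Whether Log Service caps the interval at some maximum is not stated in this topic, so no cap is assumed here:

```python
def retry_delay_minutes(failure_count):
    """Minutes to wait before the next retry after the n-th consecutive
    failure (1-based): 15 minutes after the first failure, then doubling
    after each subsequent failure."""
    if failure_count < 1:
        raise ValueError("failure_count must be >= 1")
    return 15 * 2 ** (failure_count - 1)
```

      For example, after the third consecutive failure the next retry starts `retry_delay_minutes(3)` = 60 minutes later.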

      To immediately retry a failed task, you can click Retry All Failed Tasks or Retry on the right of the task. You can also call an API operation or use an SDK to retry a task.

FAQ

How do I obtain the ARN of a RAM role?

  1. Log on to the RAM console.
  2. In the left-side navigation pane, click RAM Roles.
  3. In the Role Name column, click the AliyunLogDefaultRole role.
  4. On the page that appears, obtain the ARN of the RAM role in the Basic Information section.