
Simple Log Service:Data shipping to OSS (new version)

Last Updated:Aug 29, 2023

This topic describes the stability and limits of the new version of the data shipping feature, which ships data from Simple Log Service to Object Storage Service (OSS).

Stability

Data reads from Simple Log Service


Availability

High availability is provided.

If an OSS data shipping job fails to read data from Simple Log Service due to an error in Simple Log Service, the job is retried at least 10 times. If the job still fails, an error is reported, and the job is restarted.
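The retry-then-restart behavior described above can be sketched as follows. This is a minimal illustration, not Simple Log Service's actual implementation; `read_shard` is a hypothetical callable, and the backoff intervals are assumptions.

```python
import time

MAX_RETRIES = 10  # the job retries at least 10 times before restarting


def read_with_retry(read_shard, cursor, max_retries=MAX_RETRIES):
    """Attempt to read a batch of logs; retry on transient errors.

    `read_shard` is a hypothetical callable that returns a batch of logs
    for the given cursor, or raises on a Simple Log Service error.
    """
    last_error = None
    for attempt in range(max_retries):
        try:
            return read_shard(cursor)
        except RuntimeError as err:  # stand-in for a service-side error
            last_error = err
            time.sleep(min(0.1 * 2 ** attempt, 5))  # short backoff, for illustration
    # All retries failed: report the error; the job itself would then restart.
    raise RuntimeError(f"read failed after {max_retries} retries") from last_error
```

In the real service, the restart after exhausted retries is handled by the job scheduler rather than by the reader itself.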

Data writes to OSS


Concurrency

Data shipping instances can be created based on shards, and the resources that are used for data shipping can be scaled out.

If shards in the source Logstore of a data shipping instance are split, the required resources can be scaled out within a few seconds to accelerate the data export process.

Data consistency

Consumer groups are used to coordinate shipping and ensure data consistency. An offset is committed only after the data is successfully shipped to OSS. This ensures that all data is shipped to OSS (at-least-once delivery).
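The commit-after-write ordering can be illustrated with a short sketch. `ship_to_oss` and `commit_offset` are hypothetical stand-ins, not real SDK calls; the point is only that the offset advances after the write succeeds, so a failed write replays the data instead of losing it.

```python
def ship_batch(batch, offset, ship_to_oss, commit_offset):
    """Ship one batch and commit its offset only after the write succeeds.

    If `ship_to_oss` raises, the offset is never committed, so the same
    data is read and shipped again on retry (at-least-once delivery).
    """
    ship_to_oss(batch)     # 1. write the data to OSS
    commit_offset(offset)  # 2. only then advance the consumer offset
    return offset
```

Committing in the reverse order would risk losing data: an offset committed before a failed write would never be replayed.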

Monitoring and alerting


You can monitor data shipping jobs in real time by using metrics such as shipping latency and traffic. You can also configure custom alert rules based on your business requirements to identify exceptions at the earliest opportunity, for example, when the concurrent instances that are used to export data are insufficient or the network quota is exceeded. For more information, see Configure alert monitoring rules to monitor data shipping jobs of the new version for OSS.
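As an illustration only, an alert check of the kind described above can be modeled as thresholds over the job's metrics. The metric names and thresholds below are assumptions for the sketch, not the console's actual alert rule format.

```python
def evaluate_alerts(metrics, rules):
    """Return the names of rules whose threshold is exceeded.

    `metrics` maps metric names (e.g. a hypothetical
    "shipping_latency_seconds") to current values; `rules` maps a rule
    name to a (metric_name, threshold) pair.
    """
    triggered = []
    for name, (metric, threshold) in rules.items():
        if metrics.get(metric, 0) > threshold:
            triggered.append(name)
    return triggered
```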

Limits

Network


Network type

Data is transmitted over the Alibaba Cloud internal network, which ensures stable and fast transmission.

Permission management


Authorization

You must grant the permissions that are required to ship data to OSS and to access the data. For more information, see Authorization overview.

Server-side encryption

If server-side encryption is enabled, you must grant additional permissions to the RAM role that is involved. For more information, see OSS documentation.

Read traffic


Simple Log Service sets upper limits on read traffic in a single project and a single shard. For more information, see Data read and write.

If a limit is exceeded, you must split shards or apply for a higher limit for your project. If a job fails to read data because a limit is exceeded, the read is retried at least 10 times. If all retries fail, an error is reported and the job is restarted.

Data writes to OSS


Concurrent instances

The number of concurrent instances must be the same as the number of shards. The shards include readwrite shards and readonly shards.

Data shipping

  • The shipping size specifies the amount of raw log data from a shard that is shipped to OSS and stored in a single OSS object. Valid values: 5 to 256. Unit: MB.

  • The interval between two consecutive shipping operations for the same shard is 300 to 900 seconds.

  • An OSS object is generated each time a concurrent instance ships data.

    Important

    When you create an OSS data shipping job, you can configure the Shipping Size and Shipping Time parameters to specify the frequency at which the data in a shard is shipped. If one of the conditions specified by the two parameters is met, data is shipped.
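The "ship when either condition is met" rule from the Important note can be sketched as a simple flush check. The parameter names mirror Shipping Size (MB) and Shipping Time (seconds); the function itself is an illustration, not the service's code.

```python
def should_ship(buffered_bytes, seconds_since_last_ship,
                shipping_size_mb=256, shipping_time_s=300):
    """Return True if either the size or the time condition is met.

    Shipping Size is 5-256 MB; Shipping Time is 300-900 seconds.
    Whichever condition is reached first triggers a shipping operation.
    """
    size_reached = buffered_bytes >= shipping_size_mb * 1024 * 1024
    time_reached = seconds_since_last_ship >= shipping_time_s
    return size_reached or time_reached
```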

Time partition

In a data shipping job, data is shipped to OSS in multiple shipping operations, each of which stores its data in a separate OSS object. The path of an OSS object is determined by the earliest point in time at which Simple Log Service received the data that is shipped to that object. This point in time is specified by receive_time.
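For illustration, a time-partitioned object path derived from receive_time might look like the following. The path template here is a hypothetical example of a time-based layout; the actual directory format is configured in the job settings.

```python
from datetime import datetime, timezone


def partition_path(prefix, receive_time, fmt="%Y/%m/%d/%H/%M"):
    """Build an OSS object path from the earliest receive_time of the data.

    `receive_time` is a Unix timestamp in seconds; `fmt` is a hypothetical
    partition template, rendered in UTC for this sketch.
    """
    t = datetime.fromtimestamp(receive_time, tz=timezone.utc)
    return f"{prefix}/{t.strftime(fmt)}"
```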

File format

Data shipped to OSS can be stored in one of the following formats: CSV, JSON, Parquet, or ORC. For more information, see JSON format, CSV format, Parquet format, and ORC format.

Compression method

The following compression methods are supported: snappy, gzip, and zstd. You can also store data without compression.
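To make the format and compression options concrete, here is a small sketch that serializes logs as newline-delimited JSON and gzip-compresses them, one of the supported combinations. This only mimics the shape of the output; it is not the shipper's actual writer.

```python
import gzip
import json


def encode_json_gzip(logs):
    """Serialize logs as newline-delimited JSON and gzip-compress them."""
    lines = "\n".join(json.dumps(log, ensure_ascii=False) for log in logs)
    return gzip.compress(lines.encode("utf-8"))


def decode_json_gzip(blob):
    """Inverse of encode_json_gzip, e.g. for reading a shipped object back."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines()]
```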

OSS Bucket

  • You can ship data only to an existing OSS bucket that resides in the same region as your Simple Log Service project.

  • You can ship data to OSS buckets of the following storage classes: Standard, Infrequent Access (IA), Archive, and Cold Archive. By default, the storage class of the generated OSS objects that store the shipped data is the same as the storage class of the specified OSS bucket. For more information about storage classes, see Overview.

  • The following limits apply to an IA, Archive, or Cold Archive OSS bucket: minimum storage period and minimum billable size. We recommend that you specify a storage class for your OSS bucket based on your business requirements. For more information about storage classes, see Overview.

Parameter


Shipping Latency

The value of the Shipping Latency parameter cannot exceed the data retention period of the specified Logstore.

We recommend that you reserve a buffer period to prevent data loss. For example, if the data retention period of a Logstore is 30 days, we recommend that you set the Shipping Latency parameter to a value that is less than or equal to 25 days.
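The recommendation above can be expressed as a simple validation. The 5-day buffer below matches the example in the text; it is a suggested safety margin, not a service-enforced rule.

```python
def check_shipping_latency(latency_days, retention_days, buffer_days=5):
    """Validate the Shipping Latency against the Logstore retention period.

    The latency must not exceed the retention period; keeping a buffer
    (e.g. latency <= retention - 5 days) reduces the risk of data loss.
    """
    if latency_days > retention_days:
        return "invalid: latency exceeds the data retention period"
    if latency_days > retention_days - buffer_days:
        return "warning: latency leaves no safety buffer"
    return "ok"
```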

Data shipping job management


Pause of a data shipping job

If you pause a data shipping job, the job records the cursor of the last log that was shipped. If you resume the job, shipping continues from the recorded cursor. Simple Log Service uses the following mechanism when you pause a data shipping job:

  • If you pause a data shipping job for a period of time and the retention period of the data to be shipped has not expired, the system continues to ship data from the last cursor after you resume the job.

  • If you pause a data shipping job for a period of time and the retention period of the data to be shipped has expired, the system resumes shipping from the earliest retained data that is closest to the last cursor.
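The resume behavior above can be sketched as choosing a starting cursor. `earliest_retained_cursor` is a hypothetical value marking the oldest data still within the retention period; the real service tracks this internally.

```python
def resume_cursor(saved_cursor, earliest_retained_cursor):
    """Pick the cursor to resume from after a pause.

    Cursors are modeled as increasing integers for this sketch. If the
    saved cursor still points at retained data, resume from it; otherwise
    the data at that cursor has expired, so start from the earliest
    retained data (the data closest to the saved cursor).
    """
    if saved_cursor >= earliest_retained_cursor:
        return saved_cursor
    return earliest_retained_cursor
```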