
Simple Log Service: Stability and limits of OSS data shipping

Last Updated: Mar 13, 2024

This topic describes the stability and limits of the new version of Object Storage Service (OSS) data shipping.

Stability

Data read from Simple Log Service

Availability

High availability is provided.

If an OSS data shipping job fails to read data from Simple Log Service due to an error in Simple Log Service, the job is retried at least 10 times. If the job still fails, an error is reported, and the job is restarted.
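The retry-then-restart behavior can be pictured with a minimal sketch. This is an illustration only, not the service's actual implementation; the callback names and the backoff delay are assumptions.

```python
import time

MIN_RETRIES = 10  # the job retries at least this many times before reporting an error


def read_with_retries(read_batch, report_error, restart_job):
    """Illustrative retry loop: keep retrying reads from Simple Log Service, then
    report an error and restart the job if every attempt fails."""
    last_error = None
    for attempt in range(1, MIN_RETRIES + 1):
        try:
            return read_batch()
        except Exception as exc:               # transient Simple Log Service error
            last_error = exc
            time.sleep(min(2 ** attempt, 60))  # hypothetical backoff, not documented
    report_error(f"read failed after {MIN_RETRIES} retries: {last_error}")
    restart_job()
```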

Data write to OSS

Concurrency

Data shipping instances can be created based on shards, and the resources that are used for data shipping can be scaled out.

If shards in the source Logstore of a data shipping instance are split, the required resources can be scaled out within a few seconds to accelerate the data export process.

Data consistency

The required resources are scaled out based on specified consumer groups to ensure data consistency. An offset is submitted only after data is shipped to OSS. This helps ensure that all data is shipped to OSS.
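The commit-after-write ordering that provides this guarantee can be sketched as follows; the consumer and writer objects are hypothetical placeholders, not Simple Log Service APIs.

```python
def ship_batch(consumer, write_to_oss):
    """Illustrative at-least-once step: the checkpoint (offset) is saved only after
    the batch has been written to OSS, so a failure in between re-ships the batch
    instead of skipping it."""
    batch, next_cursor = consumer.pull_logs()  # hypothetical consumer-group read
    write_to_oss(batch)                        # 1. persist the data to OSS first
    consumer.save_checkpoint(next_cursor)      # 2. only then advance the offset
```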

Monitoring and alerting

You can monitor data shipping jobs in real time based on metrics such as the latency and traffic of data shipping jobs. You can configure custom alerts based on your business requirements to report exceptions that occur during data shipping at the earliest opportunity. For example, if the data shipping instances that are used to export data are insufficient or the network quota is exceeded, alerts are triggered. For more information, see Configure alert monitoring rules to monitor data shipping jobs of the new version for OSS.

Limits

Network

Network type

Data is transferred over the internal network of Alibaba Cloud, which is fast and stable.

Permission management

Authorization

The permissions to ship data to OSS and access data must be granted. For more information, see Authorization overview.

Server-side encryption

If server-side encryption is enabled, you must grant additional permissions to the Resource Access Management (RAM) role that is involved. For more information, see OSS configuration documentation.

Read traffic

Simple Log Service specifies upper limits on read traffic in a single project and a single shard. For more information, see Data read and write.

If a limit is exceeded, you must split shards or apply for a limit increase in your project. If an OSS data shipping job fails to read data because a limit is exceeded, the job is retried at least 10 times. If the job still fails, an error is reported, and the job is restarted.

Data write to OSS

Concurrent instances

The number of concurrent instances must be the same as the number of shards. The shards include readwrite shards and readonly shards.

Data shipping

  • The shipping size that you specify for a shard determines the amount of raw log data that is shipped to OSS and stored in a single OSS object. Valid values: 5 to 256. Unit: MB.

  • The interval between two data shipping jobs that ship the log data of the same shard ranges from 300 to 900. Unit: seconds.

  • An OSS object is generated for each data shipping job of a concurrent instance.

    Important

    When you create an OSS data shipping job, you can configure the Shipping Size and Shipping Time parameters to specify the frequency at which the data in a shard is shipped. If one of the conditions specified by the two parameters is met, data is shipped.
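A minimal sketch of this either-or flush policy is shown below; the class name and the default values are illustrative, not part of the service.

```python
import time


class ShardBuffer:
    """Illustrative flush policy: ship the buffered data of a shard when either the
    Shipping Size or the Shipping Time threshold is reached, whichever comes first."""

    def __init__(self, max_size_mb=256, max_interval_s=300):
        # assumed thresholds for illustration; valid ranges are 5-256 MB and 300-900 s
        self.max_bytes = max_size_mb * 1024 * 1024
        self.max_interval_s = max_interval_s
        self.buffer = bytearray()
        self.started_at = time.monotonic()

    def add(self, record: bytes) -> bool:
        """Append a record and report whether the shard data should be shipped now."""
        self.buffer.extend(record)
        size_reached = len(self.buffer) >= self.max_bytes
        time_reached = time.monotonic() - self.started_at >= self.max_interval_s
        return size_reached or time_reached
```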

Time partition

In a data shipping job, data is shipped to OSS by performing multiple shipping operations. Each shipping operation stores the data in a different OSS object. The path to an OSS object is determined by the earliest point in time at which Simple Log Service receives the data that is shipped to the object. This point in time is specified by receive_time.
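For illustration only, the sketch below derives a time-partitioned prefix from receive_time, assuming a strftime-style layout such as %Y/%m/%d/%H; the actual path format is determined by your job configuration.

```python
from datetime import datetime, timezone


def partition_prefix(receive_time: int, fmt: str = "%Y/%m/%d/%H") -> str:
    """Build a hypothetical time-partitioned OSS prefix from the earliest receive_time
    (a UNIX timestamp in seconds) of the data written to an object."""
    return datetime.fromtimestamp(receive_time, tz=timezone.utc).strftime(fmt)


# Example: data first received at 2024-03-13 08:30:00 UTC lands under "2024/03/13/08".
print(partition_prefix(1710318600))
```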

File format

After data is shipped to OSS, the data can be stored in one of the following formats: CSV, JSON, Parquet, and ORC. For more information, see JSON format, CSV format, Parquet format, and ORC format.

Compression method

The following compression methods are supported: snappy, gzip, and zstd. You can also choose not to compress the data.
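To read shipped data back, you can download the object with the OSS Python SDK (oss2) and decode it according to the format and compression method that you chose. The bucket name, endpoint, object key, and the gzip-compressed JSON assumption below are placeholders.

```python
import gzip
import json

import oss2  # Alibaba Cloud OSS Python SDK

# Placeholder credentials, endpoint, bucket, and object key: replace with your own.
auth = oss2.Auth("<access_key_id>", "<access_key_secret>")
bucket = oss2.Bucket(auth, "https://oss-cn-hangzhou.aliyuncs.com", "my-export-bucket")

raw = bucket.get_object("exports/2024/03/13/08/example-object").read()
for line in gzip.decompress(raw).splitlines():
    print(json.loads(line))  # assumes JSON format with gzip compression, one log per line
```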

OSS bucket

  • You can ship data only to an existing OSS bucket for which the Write Once Read Many (WORM) feature is disabled. The bucket must reside in the same region as your Simple Log Service project. For more information about the WORM feature, see Retention policies.

    Note

    To ensure exactly-once semantics, an OSS data shipping job modifies the existing objects in the related bucket in some cases, such as when network jitter occurs. If the WORM feature is enabled for the bucket, the data shipping job tries up to five times to write data to a renamed object. This ensures that data shipping can be executed without being blocked. The object is renamed {Original object name}.worm.{Number of retries}. For example, if the original object name is test.file, the object is renamed test.file.worm.1 or test.file.worm.2 based on the number of retries. A naming sketch follows this list.

  • You can ship data to OSS buckets of the following storage classes: Standard, Infrequent Access (IA), Archive, and Cold Archive. By default, the storage class of the generated OSS objects that store the shipped data is the same as the storage class of the specified OSS bucket. For more information about storage classes, see Overview.

  • The following limits apply to an IA, Archive, or Cold Archive OSS bucket: minimum storage period and minimum billable size. We recommend that you specify a storage class for your OSS bucket based on your business requirements. For more information about storage classes, see Overview.
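The renaming pattern described in the note above can be listed with a small helper; the function name is illustrative.

```python
def worm_retry_names(original: str, max_retries: int = 5):
    """List the fallback object names that a shipping job may write to when WORM is
    enabled, following the documented {Original object name}.worm.{Number of retries}
    pattern with up to five retries."""
    return [f"{original}.worm.{attempt}" for attempt in range(1, max_retries + 1)]


# Example: test.file -> ['test.file.worm.1', 'test.file.worm.2', ..., 'test.file.worm.5']
print(worm_retry_names("test.file"))
```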

Configuration items

Shipping latency

The shipping latency cannot exceed the data retention period of the Logstore from which you want to ship data.

We recommend that you reserve a buffer period to prevent data loss. For example, if the data retention period of a Logstore is 30 days, we recommend that you set the shipping latency to a value that is less than or equal to 25 days.
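The buffer recommendation can be expressed as a simple check; the 5-day buffer below mirrors the 30-day/25-day example and is only a suggestion.

```python
def max_safe_latency_days(retention_days: int, buffer_days: int = 5) -> int:
    """Suggested upper bound for the shipping latency: stay below the Logstore data
    retention period by a buffer so that delayed shipping cannot outrun data expiry."""
    return max(retention_days - buffer_days, 0)


print(max_safe_latency_days(30))  # -> 25, matching the example above
```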

Data shipping management

Pause of a data shipping job

If you pause a data shipping job, the job records the cursor of the last log that is shipped. After you resume the job, the job continues to ship logs from the recorded cursor. Simple Log Service implements the following mechanism when you pause a data shipping job:

  • If you pause a data shipping job for a period of time and the retention period of the data that you want to ship does not elapse, the system continues to ship data from the last cursor after you resume the job.

  • If you pause a data shipping job for a period of time and the retention period of the data that you want to ship elapses, the system resumes shipping from the remaining data that is closest to the last cursor after you resume the job.
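This resume rule can be sketched as follows, treating cursors as comparable positions for illustration only.

```python
def resume_position(saved_cursor, earliest_retained_cursor):
    """Illustrative resume rule: continue from the saved cursor if its data is still
    retained; otherwise fall back to the oldest data that remains in the Logstore,
    which is the data closest to the last cursor."""
    if saved_cursor >= earliest_retained_cursor:  # saved position not yet expired
        return saved_cursor
    return earliest_retained_cursor               # expired: ship from the closest remaining data
```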