This topic describes the stability and limits of the new version of the OSS data shipping feature.
## Stability

### Data reads from Simple Log Service

| Item | Description |
| --- | --- |
| Availability | High availability. If Simple Log Service returns an error and data cannot be read, the OSS data shipping job retries the operation at least 10 times. If the operation still fails, the job execution reports an error and the job restarts. |
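The retry behavior described above can be sketched as follows. This is a minimal illustration, not the service's actual implementation; the function name, the backoff policy, and the `base_delay` parameter are assumptions.

```python
import time

def read_with_retries(read_batch, max_retries=10, base_delay=1.0):
    """Call read_batch(), retrying up to max_retries times on failure.

    If every attempt fails, raise so that the job execution can report
    an error and restart, mirroring the behavior described above.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return read_batch()
        except Exception as exc:
            if attempt == max_retries:
                raise RuntimeError(
                    "read failed after %d retries" % max_retries) from exc
            # Back off briefly before the next attempt (policy is an assumption).
            time.sleep(min(base_delay * attempt, 30.0))
```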
### Data writes to OSS

| Item | Description |
| --- | --- |
| Degree of concurrency | Data is partitioned by Simple Log Service shard, and one shipping instance is created for each shard. This supports rapid scale-out: if a shard in the source Logstore is split, shipping instances can be scaled out within seconds to accelerate data export. |
| No data loss | OSS data shipping jobs are built on consumer groups to ensure consistency. The offset is committed only after the data is shipped, so the offset is never committed before data is written to OSS. This prevents data loss. |
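The commit-after-write ordering can be sketched as follows; `Batch`, `write_to_oss`, and `commit_offset` are hypothetical names used only for illustration.

```python
from collections import namedtuple

# A shipped batch: the records plus the consumer-group offset to commit.
Batch = namedtuple("Batch", ["records", "end_offset"])

def ship_batch(batch, write_to_oss, commit_offset):
    """Write the batch to OSS first, then commit the offset.

    If the process crashes between the two steps, the batch is re-shipped
    on restart (possible duplicates), but data is never lost, because the
    offset is only advanced after a successful write.
    """
    write_to_oss(batch.records)      # step 1: durable write to OSS
    commit_offset(batch.end_offset)  # step 2: advance the checkpoint
```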
### Monitoring and alerts

| Item | Description |
| --- | --- |
| Monitoring and alerts | Data shipping provides comprehensive monitoring features that allow you to track metrics such as the latency and traffic of data shipping jobs in real time. You can configure custom alerts as needed to promptly detect issues, such as an insufficient number of export instances or network quota limits. For more information, see Configure alerts for an OSS data shipping job (new version). |
## Limits

### Network

| Limit | Description |
| --- | --- |
| Network type | Data is transmitted over the Alibaba Cloud internal network. This ensures network stability and speed. |
### Permission management

| Limit | Description |
| --- | --- |
| Authorization | This involves permissions for OSS data shipping operations and data access. For more information, see Prepare permissions. |
| Server-side encryption | If you enable server-side encryption, you must grant additional permissions to the RAM role. For more information, see OSS configuration. |
### Read traffic

| Limit | Description |
| --- | --- |
| Read traffic | A traffic limit applies to each project and each shard. For more information, see Data reads and writes. If the limit is exceeded, the OSS data shipping job fails to read data and retries the operation at least 10 times; if the operation still fails, the job execution reports an error and the job restarts. To stay within the limit, split the shard or request a quota increase for the read traffic of the project. |
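One way to act on the shard-splitting advice above is to flag shards whose sustained read rate approaches the per-shard limit before throttling starts. In this sketch, the 10 MB/s limit and the 80% headroom threshold are illustrative assumptions; check your account's actual quota.

```python
def shards_to_split(read_rates_mb_s, per_shard_limit_mb_s=10.0, headroom=0.8):
    """Return the IDs of shards whose read rate exceeds `headroom` of the
    per-shard limit, i.e. shards worth splitting before throttling starts.

    read_rates_mb_s maps shard ID -> observed sustained read rate in MB/s.
    """
    threshold = per_shard_limit_mb_s * headroom
    return sorted(sid for sid, rate in read_rates_mb_s.items() if rate > threshold)
```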
### Data writes to OSS

| Limit | Description |
| --- | --- |
| Concurrent instances | The number of concurrent instances is the same as the number of shards, including read/write shards and read-only shards. |
| Shipping limits | |
| Time-based partitioning | OSS data shipping is performed in batches, and a file is written for each batch. The file path is determined by the minimum receive_time (the time when data arrives at Simple Log Service) in the batch. |
| File format | After data is shipped to OSS, it can be stored in CSV, JSON, Parquet, or ORC format. For more information, see JSON format, CSV format, Parquet format, and ORC format. |
| Compression method | The snappy, gzip, and zstd compression methods are supported. You can also choose not to compress data. |
| OSS bucket | |
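The time-based partitioning rule above can be sketched as follows: the file path for a batch is derived from the minimum receive_time in that batch. The `%Y/%m/%d/%H_%M_%S` layout and the suffix are illustrative assumptions; the actual path format depends on the job configuration.

```python
from datetime import datetime, timezone

def batch_file_path(prefix, receive_times, suffix="json.gz"):
    """Build the OSS object path for a batch from the minimum receive_time
    (Unix seconds), i.e. the earliest time any record in the batch arrived
    at Simple Log Service."""
    earliest = datetime.fromtimestamp(min(receive_times), tz=timezone.utc)
    return "%s/%s.%s" % (prefix, earliest.strftime("%Y/%m/%d/%H_%M_%S"), suffix)
```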
### Configuration items

| Limit | Description |
| --- | --- |
| Delayed shipping | The value of the Delayed Shipping parameter cannot exceed the data retention period of the current Logstore. Reserve a buffer period to prevent data loss. For example, if the data retention period of the Logstore is 30 days, set the delayed shipping period to a value that does not exceed 25 days. |
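The buffer guideline above can be expressed as a simple check. The 5-day default buffer matches the 30-day/25-day example, but the right buffer size for your workload is an assumption you should set yourself.

```python
def validate_delayed_shipping(delay_days, retention_days, buffer_days=5):
    """Raise if the Delayed Shipping value leaves less than buffer_days of
    headroom before data expires from the Logstore."""
    if delay_days > retention_days - buffer_days:
        raise ValueError(
            "a delay of %d days leaves less than a %d-day buffer within the "
            "%d-day retention period" % (delay_days, buffer_days, retention_days))
```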
### Job management

| Limit | Description |
| --- | --- |
| Pausing a data shipping job | A data shipping job records the log cursor of the last shipping operation. When the job is resumed, it continues to ship data from the recorded cursor. |
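The pause/resume behavior can be sketched as a cursor checkpoint. The class and method names below are illustrative, not the service's API.

```python
class ShippingJobSketch:
    """Toy model of cursor-based pause and resume."""

    def __init__(self, start_cursor=0):
        self.cursor = start_cursor  # cursor of the last shipped record
        self.paused = False

    def ship(self, records):
        """Ship records and advance the cursor; a paused job ships nothing."""
        if self.paused:
            return 0
        self.cursor += len(records)
        return len(records)

    def pause(self):
        self.paused = True   # the last cursor is retained while paused

    def resume(self):
        self.paused = False  # shipping continues from the retained cursor
```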