This topic describes the stability and limitations of MaxCompute shipping (New Version).
Stability
Reading from Simple Log Service
|
Item |
Description |
|
Availability |
High availability. If a Simple Log Service read error occurs, the MaxCompute shipping task automatically retries at least 10 times. If the issue persists, the task reports an error and then restarts. |
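The retry behavior described above can be sketched as a bounded retry loop. This is a minimal illustration, not the service's implementation: `ReadError`, `read_with_retries`, and the `read_batch` callable are hypothetical names, and the fixed count of 10 attempts only mirrors the "at least 10 times" wording. The real service may retry more often and apply its own backoff.

```python
class ReadError(Exception):
    """Hypothetical error standing in for a Simple Log Service read failure."""


def read_with_retries(read_batch, max_attempts=10):
    """Call read_batch until it succeeds or max_attempts is exhausted.

    max_attempts=10 is an assumption that mirrors the documented minimum.
    """
    last_error = None
    for _attempt in range(max_attempts):
        try:
            return read_batch()
        except ReadError as err:
            last_error = err  # remember the failure and try again
    # Every attempt failed: report the error so the caller can restart the task.
    raise RuntimeError(f"read failed after {max_attempts} attempts") from last_error
```

A caller would catch the final `RuntimeError`, report it, and restart the task, which matches the report-then-restart sequence in the table.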
Writing to MaxCompute
|
Item |
Description |
|
Concurrency |
The service creates a shipping instance for each Simple Log Service shard, which enables rapid scaling. If a shard in the source Simple Log Service is split, new shipping instances are created within seconds to accelerate data export. |
|
No data loss |
Data shipping tasks use a consumer group to provide consistency guarantees. The task commits the offset only after successfully shipping the data. This at-least-once delivery model ensures that no data is lost. |
|
Schema changes |
If you add a new column to the MaxCompute table during the shipping process, the task writes the new column only to new partitions, not to existing or current ones.
Note
Due to limitations in MaxCompute, you cannot perform schema modifications such as inserting, updating, or deleting columns, or changing the column order while a shipping task is active. Attempting these modifications causes the task to fail without recovery. For more information, see MaxCompute usage limits. |
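The "No data loss" row above describes a commit-after-write pattern. The following sketch shows why that ordering gives at-least-once delivery and why a failed write can later produce duplicates. The names `ship_shard`, `write_to_maxcompute`, and `checkpoint` are hypothetical, and the dictionary only stands in for the offset that the consumer group stores.

```python
def ship_shard(batches, write_to_maxcompute, checkpoint):
    """Ship batches in order, committing the offset only after a successful write.

    batches: iterable of (offset, rows) pairs read from one shard.
    write_to_maxcompute: hypothetical callable that raises on failure.
    checkpoint: dict standing in for the consumer group's committed offset.
    """
    for offset, rows in batches:
        if offset <= checkpoint.get("offset", -1):
            continue  # already committed in a previous run
        write_to_maxcompute(rows)      # may raise; the offset is then NOT committed
        checkpoint["offset"] = offset  # commit only after the write succeeded
```

If a write partially lands and then raises, the offset stays at the previous batch. After a restart, the same batch is shipped again, so no row is lost but some rows may appear twice.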
Handling dirty data
|
Error type |
Count as failure |
Description |
|
Partition error |
Yes |
Common causes include an invalid partition or a non-existent partition key column. The task does not write this data row to MaxCompute. |
|
Invalid data column |
No |
Common causes include a data type mismatch or a type conversion failure. The task does not write data in this column to MaxCompute, but it writes data in other columns of the same row normally. |
|
Data column too long |
No |
A common cause is that the data exceeds the length limit for the string or varchar data type. The task truncates the data in this column before writing it to MaxCompute. Data in other columns of the same row is written as normal. |
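The three dirty-data rules in the table above can be sketched as a single row-cleaning function. This is an illustration under stated assumptions: `clean_row` and `MAX_VARCHAR_LEN` are hypothetical, an unwritten column is represented as `None`, length is measured in characters rather than bytes, and integer conversion is the only type check shown.

```python
MAX_VARCHAR_LEN = 10  # illustrative limit, not the real MaxCompute value


def clean_row(row, partition_key, int_columns, varchar_columns):
    """Return (row_to_write, counted_as_failure) following the table above."""
    # Partition error: the whole row is dropped and counted as a failure.
    if not row.get(partition_key):
        return None, True
    cleaned = dict(row)
    for col in int_columns:
        try:
            cleaned[col] = int(row[col])
        except (KeyError, TypeError, ValueError):
            cleaned[col] = None  # invalid column: skip this column only
    for col in varchar_columns:
        value = row.get(col)
        if isinstance(value, str) and len(value) > MAX_VARCHAR_LEN:
            cleaned[col] = value[:MAX_VARCHAR_LEN]  # too long: truncate
    return cleaned, False
```

Only the partition error removes the row. The other two cases keep the row and alter a single column, which is why they do not count as failures.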
Monitoring and alerts
|
Item |
Description |
|
Monitoring and alerts |
Data shipping includes comprehensive monitoring features to track metrics such as task latency and traffic in real time. You can configure custom alerts based on your business requirements to promptly detect shipping issues, such as an insufficient number of shipping instances or network quota limits. For more information, see Configure alerts for a MaxCompute shipping task (New Version). |
Restarting a task
|
Item |
Description |
|
Too many partitions |
If a task restarts while writing to a large number of partitions, write operations that take longer than 5 minutes to complete may cause data duplication. |
|
Data write failure |
If a task restarts after a write failure to MaxCompute due to authorization or network errors, partial data duplication may occur. |
Limitations
Network
|
Limitation |
Description |
|
Network for intra-region shipping |
Intra-region shipping uses the Alibaba Cloud internal network for data transmission. This provides higher network stability and speed. |
Read traffic
|
Limitation |
Description |
|
Read traffic |
Each project and shard has a maximum read traffic limit. For more information, see Data reads and writes. If the limit is exceeded, the shipping task fails and then retries at least 10 times. If the issue persists, the task reports an error and then restarts. To resolve the issue, split the shard or submit a request to increase the read traffic limit for the project. |
Writing to MaxCompute
|
Limitation |
Description |
|
Concurrent instances |
A maximum of 64 concurrent shipping instances are supported. If the number of Simple Log Service shards exceeds 64, the service consolidates multiple shards into a single shipping instance. The service attempts to distribute the shards evenly across instances. |
|
Write threshold |
Important
Exceeding the MaxCompute write limit can cause instability and trigger throttling, which results in |
|
Table schema modification |
MaxCompute shipping (New Version) uses the MaxCompute Tunnel Service for stream writing. During this process, the MaxCompute Tunnel Service prohibits schema modifications on the target table, such as inserting, updating, or deleting columns, or changing the column order. For more information, see Overview of MaxCompute Tunnel Service. This restriction prevents you from using both the new and legacy versions of data shipping to write to the same MaxCompute table concurrently. |
|
Unsupported table types |
Data shipping to MaxCompute external tables, transactional tables, or clustered tables is not supported. |
|
Table schema changes |
To apply schema changes to the target MaxCompute table, pause the shipping task for 20 minutes and then restart it. |
|
Start time |
Note
Due to the slot and queries per second (QPS) limits of MaxCompute, shipping historical data can easily exceed the write threshold and is therefore not supported. |
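The concurrent instance limit in the table above can be illustrated with a small assignment function. The source states only that shards are consolidated and distributed as evenly as possible. The round-robin strategy and the names `assign_shards` and `MAX_INSTANCES` are assumptions made for this sketch.

```python
MAX_INSTANCES = 64  # documented upper bound on concurrent shipping instances


def assign_shards(shard_ids, max_instances=MAX_INSTANCES):
    """Group shard IDs into at most max_instances shipping instances.

    Round-robin assignment is an assumption; the service's actual
    balancing strategy is not described in this topic.
    """
    instance_count = min(len(shard_ids), max_instances)
    instances = [[] for _ in range(instance_count)]
    for index, shard_id in enumerate(shard_ids):
        instances[index % instance_count].append(shard_id)
    return instances
```

With 100 shards, this sketch produces 64 instances whose sizes differ by at most one shard. With 64 or fewer shards, each shard gets its own instance.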
Permission management
|
Limitation |
Description |
|
Write authorization |
You can grant write permissions using a RAM user or a RAM role, which requires a separate configuration in MaxCompute. |
Data types
-
Regular columns
Type
Example
Description
string
"hello"
Maximum length: 8 MB.
datetime
"2021-12-22 05:00:00"
Data in Simple Log Service must meet the format requirements of MaxCompute.
date
"2021-12-22"
Data in Simple Log Service must meet the format requirements of MaxCompute.
timestamp
1648544867
Supports millisecond or second precision.
decimal
1.2
Data in Simple Log Service must meet the format requirements of MaxCompute.
char
"hello"
Maximum length: 255 bytes.
varchar
"hello"
Maximum length: 65,535 bytes.
binary
"hello"
Maximum length: 8 MB.
bigint
123
Supports up to INT64.
boolean
1
-
The values 1, t, T, true, TRUE, and True are parsed as True.
-
The values 0, f, F, false, FALSE, and False are parsed as False.
double
1.2
Supports 64-bit floating-point numbers.
float
1.2
Supports 32-bit floating-point numbers.
integer
123
Supports up to INT32.
smallint
12
Supports up to INT16.
tinyint
12
Supports up to INT8.
-
Partition key columns
Limitation
Description
Partition key column
The value is processed as a string and must meet the format requirements for MaxCompute partition key columns.
Configuring a log field other than __partition_time__ or __receive_time__
Configuring a partition key column to use a log field other than __partition_time__ or __receive_time__ may affect shipping performance.
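The boolean and timestamp rules for regular columns can be sketched as two small parsers. The accepted boolean values come from the table. Everything else is an assumption for illustration: the function names, raising `ValueError` for unrecognized values, and the magnitude threshold used to tell seconds from milliseconds.

```python
from datetime import datetime, timezone

TRUE_VALUES = {"1", "t", "T", "true", "TRUE", "True"}
FALSE_VALUES = {"0", "f", "F", "false", "FALSE", "False"}


def parse_boolean(text):
    """Map a log field to True or False using the value lists from the table."""
    if text in TRUE_VALUES:
        return True
    if text in FALSE_VALUES:
        return False
    # Assumption: anything else is treated as an invalid data column.
    raise ValueError(f"not a recognized boolean value: {text!r}")


def parse_timestamp(value):
    """Interpret an integer as seconds or milliseconds since the epoch.

    The 10**11 threshold is an assumption used only for this sketch.
    """
    number = int(value)
    if abs(number) >= 10**11:  # too large to be seconds: treat as milliseconds
        return datetime.fromtimestamp(number / 1000, tz=timezone.utc)
    return datetime.fromtimestamp(number, tz=timezone.utc)
```

For example, `parse_timestamp(1648544867)` and `parse_timestamp(1648544867000)` resolve to the same instant, which reflects the support for both second and millisecond precision.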
Managing shipping tasks
|
Limitation |
Description |
|
Pausing a shipping task |
A shipping task records the log cursor of the last shipped item. When the task resumes, it continues from the recorded cursor. Therefore, the following behavior applies when you pause a shipping task:
|
MaxCompute IP whitelist
|
Limitation |
Description |
|
Enabling an IP whitelist in your MaxCompute project, such as a classic network IP whitelist, may cause shipping tasks to fail. |
Run the following commands in MaxCompute to resolve shipping failures caused by the IP whitelist.
For more information, see Resolve shipping failures caused by an IP whitelist. |