
MaxCompute:Overview of the streaming data tunnel

Last Updated: Mar 26, 2026

MaxCompute Streaming Tunnel lets you write data to MaxCompute in streaming mode using a dedicated set of APIs and backend services. These APIs significantly reduce the development costs of distributed services and remove the performance bottlenecks that MaxCompute Tunnel encounters in high-concurrency, high-QPS (queries per second) scenarios, such as partition locking conflicts, small-file fragmentation, and complex synchronization code.
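The streaming write flow follows an append-and-flush pattern: open a session on a table, batch rows into a pack, and flush packs repeatedly. The sketch below mimics that shape with an in-memory stand-in; every class and method name here (StreamSession, RecordPack, flush) is illustrative, not the actual SDK surface.

```python
# Illustrative in-memory stand-in for a streaming-tunnel client.
# All class and method names are hypothetical, not the real MaxCompute SDK.

class RecordPack:
    """Buffers records until they are flushed as one atomic batch."""
    def __init__(self):
        self.records = []

    def append(self, record: dict) -> None:
        self.records.append(record)


class StreamSession:
    """Accepts flushed packs; a real session would ship them to the service."""
    def __init__(self, project: str, table: str):
        self.project, self.table = project, table
        self.committed = []

    def new_record_pack(self) -> RecordPack:
        return RecordPack()

    def flush(self, pack: RecordPack) -> int:
        """Commit a pack and return the number of records written."""
        self.committed.extend(pack.records)
        return len(pack.records)


# Typical usage: open a session, batch rows into a pack, flush.
session = StreamSession("my_project", "event_log")
pack = session.new_record_pack()
for i in range(3):
    pack.append({"event_id": i, "level": "INFO"})
written = session.flush(pack)
print(written)  # 3
```

The point of the pack abstraction is that concurrent writers only coordinate at flush time, which is what lets the service avoid per-row locking.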

MaxCompute Streaming Tunnel has been in public preview since January 1, 2021, and is free of charge during the preview period. Follow Service notices to stay informed about future billing changes.

When to use Streaming Tunnel

MaxCompute Streaming Tunnel complements MaxCompute Tunnel rather than replacing it. Use this table to decide which channel fits your workload:

Dimension | MaxCompute Streaming Tunnel | MaxCompute Tunnel
Data form | Streaming rows | Batched files
Concurrency | High concurrency supported; no partition locking contention | Concurrent writes can cause partition locking conflicts
Write throughput | Optimized for high QPS; prevents small-file fragmentation | Small batch size at high QPS generates many small files
Incremental data | Asynchronously merged in the background without service interruption | No built-in async merge; data is written as-is
Partitioning | Automatic partitioning across concurrent jobs | Manual partition management required
Best for | Real-time log ingestion, stream processing results, message queue sync | Large-batch ETL, periodic bulk loads
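The small-file contrast in the table can be made concrete with back-of-the-envelope arithmetic. The rates and the one-file-per-merge-interval assumption below are illustrative, not documented service behavior:

```python
# Illustrative arithmetic only; rates and merge behavior are assumptions,
# not documented MaxCompute service limits.

def files_per_hour_batch(commits_per_second: int) -> int:
    """Batch tunnel: each small commit at high QPS lands as its own file."""
    return commits_per_second * 3600

def files_per_hour_streaming(merge_interval_min: int = 60) -> int:
    """Streaming tunnel: assume background merge compacts each interval
    of incremental data into roughly one file."""
    return 60 // merge_interval_min

print(files_per_hour_batch(100))   # 360000 small files per hour
print(files_per_hour_streaming())  # ~1 merged file per hour
```

Even at a modest 100 commits per second, unmerged batch writes would create hundreds of thousands of files per hour, which is the fragmentation the background merge is designed to prevent.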

Key capabilities

  • Streaming semantic APIs: Simplify the development of distributed data synchronization services and reduce development costs.

  • Automatic partitioning: Eliminates concurrent partition locking when multiple synchronization jobs write to the same table simultaneously.

  • Asynchronous data merging: Merges incremental data in the background without interrupting active write operations, which improves storage efficiency and prevents small-file accumulation.

    • Data aggregation (Merge): Improves storage efficiency.

    • ZORDER BY sorting: Improves storage and query efficiency. For more information about ZORDER BY, see Insert or overwrite data (INSERT INTO | INSERT OVERWRITE).

  • Complete isolation between the data link and metadata access: Resolves the lock contention delays and errors caused by metadata access in high-concurrency write scenarios.
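ZORDER BY sorting relies on Z-order (Morton) codes, which interleave the bits of the sort keys so that rows that are close in multi-column key space also end up close together on disk. A minimal two-column sketch of the idea (MaxCompute's actual encoding may differ):

```python
def morton(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y into a single Z-order (Morton) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x takes even bit positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y takes odd bit positions
    return code

# Sorting rows by their Morton code clusters nearby (x, y) pairs together,
# so range filters on either column touch fewer storage blocks.
rows = [(3, 1), (0, 0), (1, 1), (2, 2)]
ordered = sorted(rows, key=lambda r: morton(*r))
print(ordered)  # [(0, 0), (1, 1), (3, 1), (2, 2)]
```

Because one linear order preserves locality in both columns at once, queries that filter on either key skip more data than with a sort on a single column.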

Use cases

Scenario | Description
Real-time event log ingestion | Write log data directly into MaxCompute for downstream batch processing. No intermediate storage service is needed, which reduces pipeline costs.
Stream processing result storage | Persist Flink or other stream computing results into MaxCompute without concurrency or batch size limits, avoiding small-file accumulation from high-frequency writes. MaxCompute Streaming Tunnel ensures the availability of streaming services in scenarios that involve high-concurrency locking.
Message queue synchronization | Sync data from DataHub or ApsaraMQ for Kafka into MaxCompute at high concurrency and large batch volumes, replacing workarounds previously needed with the Simple Message Queue connector.

Integrate with upstream services

By default, Realtime Compute for Apache Flink, DataWorks, and ApsaraMQ for Kafka write to MaxCompute via MaxCompute Tunnel. To switch to Streaming Tunnel:

Service | How to enable Streaming Tunnel
Realtime Compute for Apache Flink | Use the built-in Streaming Tunnel plug-in provided by Realtime Compute for Apache Flink.
DataWorks | Contact the DataWorks engineer on duty to enable Streaming Tunnel in the background.
ApsaraMQ for Kafka | Contact the Kafka engineer on duty to enable Streaming Tunnel in the background.

Limitations

Table or partition locking during writes

MaxCompute Tunnel Service locks the target table or partition for the duration of a streaming write. All DML operations that modify data, such as insert into and insert overwrite, are blocked until the write completes and the lock is released.
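Because insert into and insert overwrite can be blocked while a streaming write holds the lock, batch jobs that share a table with streaming writers typically retry with backoff. A generic sketch of that pattern; the exception type, delays, and DML callable are illustrative, not part of any SDK:

```python
import time

class TableLockedError(Exception):
    """Illustrative stand-in for the error a blocked DML statement raises."""

def run_with_retry(dml, max_attempts: int = 5, base_delay: float = 0.01):
    """Retry a DML callable with exponential backoff while the table is locked."""
    for attempt in range(max_attempts):
        try:
            return dml()
        except TableLockedError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff

# Simulated DML that fails twice while the lock is held, then succeeds.
attempts = {"n": 0}
def insert_overwrite():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TableLockedError("partition locked by streaming write")
    return "ok"

result = run_with_retry(insert_overwrite)
print(result)  # ok
```

Exponential backoff keeps blocked batch jobs from hammering a partition that a long-running streaming write still holds.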

Schema modification not supported

If the schema of the target table is modified while Streaming Tunnel is active, streaming data cannot be written to the table.

Temporary storage overhead for hot data

When asynchronous data merging or ZORDER BY sorting is enabled, Streaming Tunnel retains two copies of the data written within the previous hour: the original ingested data and the asynchronously merged copy. The redundant copy is automatically cleaned up after the default retention period of one hour.

Plan storage capacity accordingly if your workload has a high ingestion rate during the merge window.
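The transient overhead is easy to size: within the one-hour retention window both the raw and the merged copies exist, so the peak hot-data footprint is roughly twice the volume ingested in that window. Illustrative arithmetic under that assumption:

```python
# Illustrative sizing only, assuming both copies coexist for the full
# retention window. The 50 GB/hour ingestion rate is a made-up example.

def peak_hot_storage_gb(ingest_rate_gb_per_hour: float,
                        retention_hours: float = 1.0) -> float:
    """Peak transient footprint: raw + merged copies of the window's intake."""
    return 2 * ingest_rate_gb_per_hour * retention_hours

print(peak_hot_storage_gb(50.0))  # 100.0 GB of transient hot-data capacity
```

So a workload ingesting 50 GB per hour should budget about 100 GB of headroom for hot data until the cleanup runs.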