E-MapReduce: Compaction Service

Last Updated: Mar 09, 2026

The Compaction Service is a new feature in EMR Serverless StarRocks 3.5, currently in Beta. It runs compaction tasks on a dedicated service, separate from your business compute groups, providing workload isolation, elastic scaling, and better performance. This topic describes the features of the Compaction Service and how to use it.

Overview

Core Value

| Capability | Description |
| --- | --- |
| Workload isolation | Compaction runs on a separate service. This prevents resource contention with business tasks such as queries and imports, which ensures business stability. |
| Elastic scaling | The Compaction Service supports setting a minimum (Min CU) and maximum (Max CU) number of Compute Units (CUs). It automatically scales based on the compaction load. This ensures timely compaction while reducing costs. |
| Out-of-the-box | A Compaction Service is automatically created in version 3.5. Enable it with one click in the console. No extra purchases or configurations are needed. |

Performance optimizations

The Compaction Service includes the following performance optimizations in addition to isolation:

  1. Peer Cache reads: When a compaction task runs, it reads data directly from the cache on nodes in the business compute group (Peer Cache). This avoids remote I/O against Object Storage Service and significantly improves compaction read performance.

  2. Cache push: After compaction is complete, the Compaction Service asynchronously pushes the merged data files to the nodes of the business compute group. Queries therefore do not hit a cache miss on the newly merged files and fall back to Object Storage Service, so query performance is not affected.
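The two optimizations above can be pictured with a small in-memory model. The following is only an illustrative sketch, not the service's implementation: the names `read_segment`, `push_merged`, and `fetch_remote` are hypothetical, the fallback to object storage on a peer miss is an assumption, and pushing to every node is a simplification.

```python
from typing import Callable, Dict, Tuple

# Hypothetical model: each business compute node has a cache keyed by file ID.
PeerCaches = Dict[str, Dict[str, bytes]]


def read_segment(
    segment_id: str,
    peer_caches: PeerCaches,
    fetch_remote: Callable[[str], bytes],
) -> Tuple[bytes, str]:
    """Read a file for compaction, preferring peer caches over remote I/O.

    Returns the data and a label describing where it came from.
    """
    for node, cache in peer_caches.items():
        data = cache.get(segment_id)
        if data is not None:
            return data, f"peer:{node}"
    # Assumption: on a peer miss the reader falls back to object storage.
    return fetch_remote(segment_id), "remote"


def push_merged(segment_id: str, data: bytes, peer_caches: PeerCaches) -> None:
    """Push a merged file into the business nodes' caches after compaction.

    Simplification: the sketch pushes to every node and does so synchronously.
    """
    for cache in peer_caches.values():
        cache[segment_id] = data
```

In this model, a query that reads the merged file after `push_merged` finds it in a node cache and never calls `fetch_remote`, which is the effect the cache push is meant to achieve.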

Prerequisites

  • EMR Serverless StarRocks version 3.5 or later.

  • A cluster in shared-data mode.

Enable the Compaction Service

Enable the Compaction Service in the EMR Serverless StarRocks console:

  1. Go to the EMR Serverless StarRocks instance list page.

    1. Log on to the E-MapReduce console.

    2. In the navigation pane on the left, choose EMR Serverless > StarRocks.

    3. In the top menu bar, select a region as needed.

  2. Click the ID of the target instance.

  3. Click the Compaction Service tab. On the Basic Information page, click Start Service.

  4. In the pane that appears on the right, set Min CU and Max CU.

  5. After you complete the configuration, click Start Service.

Disable the Compaction Service

Click Shutdown Service in the console. In the pane that appears, select the Confirm Risk checkbox, and then click Confirm Shutdown. After the service is disabled, compaction reverts to running on the business compute groups. In-progress tasks will complete normally.

Elastic scaling

CU configuration

The Compaction Service lets you set a scaling range:

| Parameter | Description | Recommendation |
| --- | --- | --- |
| Min CU | The minimum number of CUs. The service scales in to this value when idle. | Set this to the minimum value that meets your basic compaction needs. |
| Max CU | The maximum number of CUs. The service scales out to this value during peak hours. | Set this value based on your peak write throughput and Compaction Score. |
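As a rough illustration of how a Min CU and Max CU pair bounds the service's capacity, the sketch below models the range as a small value object. The class name `CuRange` and its checks are hypothetical; the console's actual limits and step sizes are not stated here, so the sketch only assumes that the values are non-negative and that Max CU is not below Min CU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CuRange:
    """Hypothetical model of a Compaction Service scaling range."""

    min_cu: int
    max_cu: int

    def __post_init__(self) -> None:
        # Assumed sanity checks; the real console limits may differ.
        if self.min_cu < 0:
            raise ValueError("min_cu must not be negative")
        if self.max_cu < self.min_cu:
            raise ValueError("max_cu must be greater than or equal to min_cu")

    def clamp(self, desired_cu: int) -> int:
        """Bound a desired capacity to the configured range."""
        return max(self.min_cu, min(self.max_cu, desired_cu))
```

For example, with `CuRange(min_cu=4, max_cu=16)`, a desired capacity of 2 CUs is raised to 4 and a desired capacity of 40 CUs is capped at 16. The numbers are placeholders, not sizing recommendations.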

Scaling policy

The Compaction Service automatically scales based on the following metrics:

  • Compaction Score: Reflects the accumulation of data versions. A higher score indicates greater compaction pressure.

  • Task load: The ratio of the current number of compaction tasks to available resources.

The system automatically scales out when the Compaction Score continues to rise or when tasks are queued. When the load decreases, the system gradually scales in to the Min CU value.
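The policy described above can be sketched as a single decision function. This is a minimal illustration under stated assumptions, not the service's actual algorithm: the function name `next_cu`, the one-CU step, the use of a short window of recent Compaction Score samples, and the exact rising and falling tests are all hypothetical.

```python
from typing import Sequence


def next_cu(
    current_cu: int,
    min_cu: int,
    max_cu: int,
    scores: Sequence[float],
    queued_tasks: int,
    step: int = 1,
) -> int:
    """Pick the next capacity from recent Compaction Score samples and queue depth.

    scores is ordered oldest to newest. step is an assumed scaling increment.
    """
    # "Continues to rise": every sample is higher than the one before it.
    rising = len(scores) >= 2 and all(b > a for a, b in zip(scores, scores[1:]))

    if rising or queued_tasks > 0:
        target = current_cu + step  # scale out under pressure
    elif len(scores) >= 2 and scores[-1] <= scores[0]:
        target = current_cu - step  # scale in gradually when load eases
    else:
        target = current_cu  # mixed signal: hold capacity

    # Never leave the configured Min CU to Max CU range.
    return max(min_cu, min(max_cu, target))
```

Calling the function repeatedly with a falling score and an empty queue walks the capacity down one step at a time until it reaches `min_cu`, which mirrors the gradual scale-in described above.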

Best practices

Use the Compaction Service in the following scenarios:

  • High write throughput scenarios: Continuous, high-frequency writes cause the Compaction Score to rise, which affects query performance.

  • Query-sensitive scenarios: Your business is sensitive to query latency and you do not want compaction to compete for query resources.

  • Cost optimization scenarios: You want to use compaction resources on demand through elastic scaling to reduce standing costs.

Notes

  • The Compaction Service is only for clusters in shared-data mode.

  • After you enable the Compaction Service, compaction tasks for all tables are scheduled to run on the Compaction Service.

  • The CU resources for the Compaction Service are billed separately based on usage. Set Min CU and Max CU to match your actual compaction load so that you do not pay for capacity you do not need.

  • If you disable the Compaction Service, compaction automatically reverts to running on each business compute group. In-progress tasks will complete normally.

  • When you enable the Compaction Service for the first time, do so during off-peak hours so that you can monitor its impact on the system.