Hologres: Multi-cluster and Auto Scaling (Beta)

Last Updated: Oct 30, 2025

Starting from V4.0, Hologres supports Multi-cluster and Auto Scaling for virtual warehouse instances. A virtual warehouse can scale out to multiple clusters and automatically adjust the number of clusters based on the workload. This supports high-concurrency requests and resource isolation within the virtual warehouse.

How it works

  • Without Multi-cluster, all reserved resources of a virtual warehouse belong to a single cluster. Requests sent to this virtual warehouse share these resources. See Architecture of virtual warehouse instances.

  • With Multi-cluster enabled, a virtual warehouse operates with multiple, physically isolated clusters. The Frontend (FE) Node then employs load balancing to automatically distribute incoming requests across these clusters.

  • If you enable Multi-cluster and Auto Scaling, the virtual warehouse provisions new clusters from elastic resources during peak demand to handle higher concurrency. When the load drops, it scales in and releases idle elastic resources, so you are billed for elastic resources only while they are in use.

image

Use cases

Multi-cluster

  • Ideal use cases: high-concurrency, small-to-medium queries. Multi-cluster excels here by leveraging cluster isolation and FE Node load balancing. It supports more concurrent requests and automatically isolates workloads for better performance.

  • Not suitable for: large, low-concurrency tasks. Scenarios such as large-scale batch data ingestion (virtual warehouse 2 in the figure under How it works) are better served by scaling up. Consider manual or scheduled scaling to allocate more resources to a single, more powerful cluster.

Auto Scaling

Ideal use cases:

  • High-concurrency, small-to-medium queries: See ideal use cases of Multi-cluster.

  • Unpredictable workload peaks: When you cannot predict the timing or size of a peak, manually adding clusters or using scheduled scaling cannot handle it effectively.

Key concepts

For definitions of Multi-cluster concepts and resources at the instance and virtual warehouse levels, see Overview of resource scalability.

The following example explains instance resource usage:

  • Instance

    • Reserved resources: 64 Compute Units (CUs), which include:

      • Allocated: 32 CUs, assigned to the virtual warehouse init_warehouse.

      • Unallocated: 32 CUs. Can be used to create a new virtual warehouse or allocated to init_warehouse.

    • Elastic instance resources: 32 CUs, provisioned via Auto Scaling for init_warehouse.

  • Virtual warehouse (init_warehouse):

    • Reserved clusters: 1.

    • Per-cluster specifications: 32 CUs.

    • Reserved resources: 32 CUs (1 × 32).

    • Current total clusters: 2 (1 reserved cluster + 1 elastic cluster).

    • Elastic resources: 32 CUs, provisioned via Auto Scaling.

    • Total resources: 64 CUs (32 CUs of reserved resources and 32 CUs of elastic resources).
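The arithmetic in this example can be checked with a short sketch. The class and field names below are illustrative only and are not part of any Hologres API. The sketch also assumes that an elastic cluster has the same per-cluster specification as a reserved cluster, which matches the numbers in the example above.

```python
from dataclasses import dataclass


@dataclass
class VirtualWarehouse:
    """Illustrative model of a virtual warehouse (hypothetical names)."""
    cus_per_cluster: int    # per-cluster specification, in CUs
    reserved_clusters: int  # clusters backed by reserved resources
    elastic_clusters: int   # clusters currently added by Auto Scaling

    @property
    def reserved_cus(self) -> int:
        return self.reserved_clusters * self.cus_per_cluster

    @property
    def elastic_cus(self) -> int:
        # Assumption: elastic clusters use the same per-cluster specification.
        return self.elastic_clusters * self.cus_per_cluster

    @property
    def total_cus(self) -> int:
        return self.reserved_cus + self.elastic_cus


instance_reserved_cus = 64  # reserved resources of the instance
wh = VirtualWarehouse(cus_per_cluster=32, reserved_clusters=1, elastic_clusters=1)

# Reserved CUs of the instance that are not assigned to any virtual warehouse.
unallocated_cus = instance_reserved_cus - wh.reserved_cus

print(wh.reserved_cus, wh.elastic_cus, wh.total_cus, unallocated_cus)
```

The unallocated 32 CUs stay unused by Auto Scaling in this model, because elastic clusters are provisioned from elastic resources rather than from the unallocated reserved pool.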

Billing

  • Reserved resources: The base resources allocated to the virtual warehouse, billed according to the instance's payment methods (subscription or pay-as-you-go).

  • Elastic resources: Additional resources provisioned via Auto Scaling for a virtual warehouse. Cost is calculated as follows: Cost = Elastic resource usage (CU count × hours) × Unit price. For specific unit prices, see Billing overview.

    Note
    • Hologres monitors elastic resource usage every minute. At the end of each hour, it calculates the total usage for that hour, issues a bill, and automatically deducts the fee from your account balance.

    • Elastic resources are independent of the unallocated reserved instance resources. Therefore, even if reserved resources are available within the instance, Auto Scaling will provision additional elastic resources instead of using them.
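As a rough illustration of the formula above, the following sketch estimates the charge for one billing hour from per-minute readings of elastic CUs. It assumes that each per-minute reading contributes 1/60 of a CU-hour, which is one plausible reading of the per-minute monitoring described in the note. The function name and the unit price are hypothetical. For real prices, see Billing overview.

```python
def elastic_cost(cu_per_minute, unit_price_per_cu_hour):
    """Estimate the elastic resource cost for one billing hour.

    cu_per_minute: elastic CUs observed in each minute of the hour.
    unit_price_per_cu_hour: price of one CU for one hour (hypothetical here).
    """
    # Assumption: each one-minute reading counts as 1/60 of an hour of usage.
    cu_hours = sum(cu_per_minute) / 60
    return cu_hours * unit_price_per_cu_hour


HYPOTHETICAL_UNIT_PRICE = 0.5  # placeholder value, not a real Hologres price

# One 32-CU elastic cluster active for 45 minutes, then released.
samples = [32] * 45 + [0] * 15
cost = elastic_cost(samples, HYPOTHETICAL_UNIT_PRICE)
print(cost)
```

In this example the elastic usage is 32 × 45 / 60 = 24 CU-hours, so the estimated charge is 24 times the unit price.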

Limitations

  • Multi-cluster and Auto Scaling are available only for Hologres V4.0+.

  • These features are supported only for virtual warehouse instances.

  • Supported regions

    • Multi-cluster: All regions

    • Auto Scaling:

      • China (Hangzhou): Supported. This feature is in public preview in this region. To apply for a trial, use your Alibaba Cloud account to fill out this form.

      • China (Shanghai), China (Beijing), China (Shenzhen), China (Hong Kong), Singapore, Germany (Frankfurt), US (Silicon Valley), US (Virginia), Japan (Tokyo), Malaysia (Kuala Lumpur), Indonesia (Jakarta), China (Shanghai) Finance Cloud, China (Beijing) Gov Cloud, China (Shenzhen) Finance Cloud: Not supported. Trial applications are not available in these regions.

Important notes

  • To use Multi-cluster and Auto Scaling, you need the following permissions:

  • Adding or removing clusters in a virtual warehouse may have certain impacts. For details, see Manage virtual warehouses.

  • You cannot use Scheduled Scaling and Auto Scaling simultaneously for the same virtual warehouse.

  • A virtual warehouse with Auto Scaling enabled still supports all standard management operations in the console, such as scale-up/down, start/stop, and deletion.

Use Multi-cluster

To use this feature, modify the number of reserved clusters for a virtual warehouse. For detailed instructions, see Manage virtual warehouses.

Use Auto Scaling

You can enable Auto Scaling for a virtual warehouse to automatically scale out its elastic clusters based on the workload (resource utilization and query queue length), complementing its reserved clusters.
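To make the idea concrete, here is a minimal sketch of a scaling decision driven by the two signals named above. This is not the Hologres algorithm: the function name, the thresholds, and the one-cluster-per-step rule are assumptions chosen for illustration. The only constraints taken from this document are that the cluster count never drops below the number of reserved clusters and never exceeds Maximum Clusters.

```python
def target_clusters(current, reserved, maximum, cpu_util, queued_queries,
                    scale_out_cpu=0.85, scale_in_cpu=0.30):
    """Return the desired cluster count for the next step (hypothetical policy).

    current:        clusters running now (reserved + elastic)
    reserved:       reserved clusters, the lower bound
    maximum:        the Maximum Clusters setting, the upper bound
    cpu_util:       virtual warehouse CPU utilization, from 0.0 to 1.0
    queued_queries: number of queries waiting in the queue
    The two thresholds are illustrative assumptions, not documented values.
    """
    overloaded = cpu_util >= scale_out_cpu or queued_queries > 0
    idle = cpu_util <= scale_in_cpu and queued_queries == 0

    if overloaded and current < maximum:
        return current + 1  # add one elastic cluster
    if idle and current > reserved:
        return current - 1  # release one elastic cluster
    return current          # within bounds, or already at a limit


print(target_clusters(1, 1, 4, cpu_util=0.90, queued_queries=0))
print(target_clusters(2, 1, 4, cpu_util=0.10, queued_queries=0))
```

The first call models sustained high load on a single reserved cluster, and the second models low load with one elastic cluster still running.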

Procedure

  1. Log on to the Hologres management console. In the top menu bar, select a region.

  2. In the left menu, choose Instances. Click your target instance ID to go to the details page.

  3. In the left submenu, click Virtual Warehouse Management. On the page that appears, click the Auto-scaling tab.

  4. Switch on Enable Auto Scaling. Set Maximum Clusters and click Save.

Example

After enabling Auto Scaling as shown above (with a single-cluster specification of 32 CUs, 1 reserved cluster, and a maximum of 4 clusters), verify that it works as follows. This example uses pgbench, the benchmarking tool that ships with PostgreSQL.

  1. Create test tables in Hologres and load data.

    CREATE TABLE tbl_1 (col1 int, col2 int, col3 text);
    CREATE TABLE tbl_2 (col1 int, col2 int, col3 text);
    INSERT INTO tbl_1 SELECT i, i+1, md5(random()::text) FROM generate_series (0, 500000) AS i;
    INSERT INTO tbl_2 SELECT i, i+1, md5(random()::text) FROM generate_series (0, 500000) AS i;
  2. On the stress testing server, create a SQL file named select.sql and add the following SQL statement:

    EXPLAIN ANALYZE SELECT * FROM tbl_1 LEFT JOIN tbl_2 ON tbl_1.col3 = tbl_2.col3 ORDER BY 1;
  3. On the stress testing server, set the password as an environment variable.

    export PGPASSWORD='<AccessKey_Secret>'
  4. Run the following stress testing command. For details, see Connect to a Hologres instance for data development.

    pgbench \
    -c 30 \
    -j 30 \
    -f select.sql \
    -d <Database> \
    -U <AccessKey_ID> \
    -h <Endpoint> \
    -p <Port> \
    -T 1800

    During the stress test, the monitoring metrics for the virtual warehouse are as follows:

    • Cluster CPU utilization: [image]

      • The load on the reserved cluster remains high, triggering Auto Scaling (Point 1) to add a new cluster.

      • After the test completes, the load on both clusters becomes low, triggering Auto Scaling (Point 2) to remove one cluster.

    • Virtual warehouse CPU utilization: [image]

      • Before Auto Scaling adds a new cluster, the CPU utilization of the virtual warehouse consistently exceeds 85%.

      • After the new cluster is added, the overall CPU utilization of the virtual warehouse drops to around 70%.

Monitoring and alerts

Metrics

View the following metrics in the Hologres management console, and configure alert rules for them as needed. For details, see Monitoring metrics in the Hologres console.

  • Cluster CPU utilization

  • Cluster memory usage

  • Number of cores provisioned by Auto Scaling for the virtual warehouse

Elastic event execution logs

  1. Go to the Virtual Warehouse Management page and click the Elastic Event Execution Logs tab.

  2. Select a time range to view the execution history of scaling events, which includes the execution time, virtual warehouse, execution status, event type, number of reserved clusters, and target number of clusters.

CloudMonitor events

CloudMonitor records all horizontal scaling events.

  1. Go to the CloudMonitor Event Center. On the System Events page, select Hologres in the Event Monitoring area to monitor auto scaling events.

    • Instance:Warehouse:AutoElastic:Start: The start event for Auto Scaling in a virtual warehouse.

    • Instance:Warehouse:AutoElastic:Finish: The completion event for Auto Scaling in a virtual warehouse.

    • Instance:Warehouse:AutoElastic:Failed: The failure event for Auto Scaling in a virtual warehouse.

  2. You can use these CloudMonitor events to configure notifications, alerts, and other actions. For details, see Using system events for alerting.

The following is an example of the event details for a failed Auto Scaling event:

{
    "status": "Failed",
    "instanceName": "<instance_id>",
    "resourceId": "<instance_resource_id>",
    "content": {
        "AutoElasticCPU": <cpu_num>,
        "ScaleType": "ScaleOut",
        "ScheduleId": "xxxxxx",
        "WarehouseId": "<warehouse_id>",
        "WarehouseName": "<warehouse_name>"
    },
    "product": "hologres",
    "time": 1722852008000,
    "level": "WARN",
    "regionId": "<region>",
    "id": "<event_id>",
    "groupId": "0",
    "name": "Instance:Warehouse:AutoElastic:Failed"
}
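If you forward these events to your own tooling, a handler might look like the following sketch. It uses only the Python standard library and the three Auto Scaling event names listed above. The function name and the sample values (warehouse name, CPU count) are hypothetical, and the sketch assumes that the time field is a Unix timestamp in milliseconds.

```python
import json
from datetime import datetime, timezone

# Event names listed in this document for Auto Scaling.
AUTO_SCALING_EVENTS = {
    "Instance:Warehouse:AutoElastic:Start",
    "Instance:Warehouse:AutoElastic:Finish",
    "Instance:Warehouse:AutoElastic:Failed",
}


def summarize_event(raw):
    """Parse an event payload; return None if it is not an Auto Scaling event."""
    event = json.loads(raw)
    if event.get("name") not in AUTO_SCALING_EVENTS:
        return None
    content = event.get("content", {})
    return {
        "failed": event["name"].endswith(":Failed"),
        "warehouse": content.get("WarehouseName"),
        "scale_type": content.get("ScaleType"),
        # Assumption: "time" is milliseconds since the Unix epoch.
        "time": datetime.fromtimestamp(event["time"] / 1000, tz=timezone.utc),
    }


# Sample payload with placeholder values filled in for illustration.
sample = json.dumps({
    "status": "Failed",
    "content": {
        "AutoElasticCPU": 32,
        "ScaleType": "ScaleOut",
        "WarehouseName": "init_warehouse",
    },
    "product": "hologres",
    "time": 1722852008000,
    "level": "WARN",
    "name": "Instance:Warehouse:AutoElastic:Failed",
})

summary = summarize_event(sample)
if summary and summary["failed"]:
    print(f"Auto Scaling {summary['scale_type']} failed for "
          f"{summary['warehouse']} at {summary['time'].isoformat()}")
```

Matching on the exact event name rather than on the status field keeps events from other scaling features out of this handler.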

ActionTrail

ActionTrail records operations performed in the Hologres management console, such as editing Auto Scaling settings, and the actual cluster scaling operations that Auto Scaling performs. For details, see Event audit logs.