Hologres lets you run ETL pipelines, BI dashboards, and online recommendation services on a single instance. When multiple teams share the same instance, workloads compete for resources—ETL peaks can slow dashboards, and large ad-hoc queries can trigger out-of-memory (OOM) errors for other users. This guide explains how to use workload isolation, scheduled scaling, Serverless Computing, and query queues together to balance stability and cost across mixed workloads.
Scenario overview
The strategies in this guide are illustrated with a real-world e-commerce scenario in which three teams share one Hologres instance.
| Team | Tasks | Characteristics |
|---|---|---|
| Data team | Real-time and near-real-time ETL using Flink and DataWorks Data Integration; near-real-time ETL using Dynamic Table; batch ETL using MaxCompute and Hologres | ETL peaks at night; batch ETL runs in early morning |
| Data analysts | BI dashboards for sales data; self-service analytics | BI traffic peaks during work hours, with occasional night surges; self-service queries can be resource-intensive |
| Recommendation team | Real-time product recommendations using primary key lookups | Traffic peaks every evening |
How resource management features interact
Before choosing a strategy, understand how Hologres resource management features interact. Some combinations are mutually exclusive.
| Feature | Works with virtual warehouses | Works with Serverless Computing | Works with query queues |
|---|---|---|---|
| Fixed Plan | Scale up only | Not supported | Not supported |
| Serverless Computing | Overflow from VW | Direct routing | Route queue to serverless |
| Adaptive Serverless Computing | Small tasks stay in VW | Large tasks routed automatically | Compatible |
| Auto-scaling | Scale out clusters | N/A | Compatible |
| Scheduled scaling | Scale up/down on schedule | N/A | Compatible |
Requests optimized by Fixed Plan cannot use Serverless Computing or query queues. Address peak loads for these requests by scaling up the virtual warehouse. Write peaks cannot be handled by auto-scaling.
Choose a resource management strategy
Use the following decision tree to select a strategy.
Extremely large workloads (massive data backfills, full table scans, joins across 10+ tables, deeply nested subqueries): use Serverless Computing to prevent these tasks from affecting other workloads. Enable adaptive serverless computing for automatic routing.
Fixed Plan-optimized requests: scale up the virtual warehouse to handle peaks. Serverless Computing and query queues are not available for these requests.
All other requests: choose based on performance requirements, peak patterns, and request type. See the challenges below.
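The selection rules above can be read as a small function. The following is a minimal sketch, not a Hologres API: the `Request` class, its field names, and the returned strings are hypothetical and exist only for illustration. It checks Fixed Plan first because that is the hard constraint (such requests cannot use Serverless Computing or query queues).

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical description of a request; field names are illustrative only."""
    uses_fixed_plan: bool = False   # e.g. real-time writes or primary key lookups
    extremely_large: bool = False   # e.g. massive backfill, full table scan, 10+ table join


def choose_strategy(req: Request) -> str:
    """Return the strategy suggested by the selection rules in this guide."""
    # Fixed Plan requests cannot use Serverless Computing or query queues,
    # so scaling up the virtual warehouse is the only option.
    if req.uses_fixed_plan:
        return "scale up the virtual warehouse"
    # Extremely large workloads are moved off the shared virtual warehouse.
    if req.extremely_large:
        return "Serverless Computing"
    # Everything else depends on the challenges described below.
    return "choose by performance requirements, peak pattern, and request type"
```

For example, `choose_strategy(Request(extremely_large=True))` returns `"Serverless Computing"`.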
Challenges and solutions
Challenge 1: Resource contention between teams
Problem: ETL pipelines and query workloads compete for the same computing resources, causing interference.
Solution: Deploy multiple virtual warehouses—one per team—to isolate workloads. See Architecture of virtual warehouses.
Example:
Primary virtual warehouse (init_warehouse): data team, for writes and ETL
Read-only virtual warehouse 1: analytics team
Read-only virtual warehouse 2: recommendation team
Challenge 2: Fixed peak hours
Problem: Resource demand follows a predictable daily pattern, making it inefficient to maintain peak capacity at all times.
Solution: Use scheduled scaling (beta) to scale virtual warehouses up before peak hours and scale them back down afterward. If resource expansion is needed for fewer than 16 hours per day, scheduled scaling costs less than maintaining dedicated resources around the clock.
Example:
| Team | Peak pattern | Scheduled scaling configuration |
|---|---|---|
| Data team (real-time ETL) | Nightly | Scale up the primary virtual warehouse in the evening; scale down in the morning. Because real-time ETL uses Fixed Plan, scaling up is the only option. |
| Data analysts | Work hours, with occasional night surges | Scale up the read-only virtual warehouse at the start of the workday; scale down in the evening. Handle night surges with auto-scaling (see Challenge 6). |
| Recommendation team | Every evening | Scale up the read-only virtual warehouse before the evening peak; scale down overnight. Point queries use Fixed Plan, so scaling up is the only option. |
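The 16-hour threshold mentioned for scheduled scaling can be reasoned about with a simple break-even calculation. The sketch below assumes linear pricing, where always-on capacity costs 24 × p per day and scheduled capacity costs hours × p × premium per day. The `premium` multiplier is a hypothetical parameter. The default of 1.5 is only the value implied by a 16-hour break-even (24 / 16) under this assumption, not a published price.

```python
def break_even_hours(premium: float) -> float:
    """Hours per day below which scheduled capacity is cheaper than always-on capacity.

    Assumes linear pricing: always-on costs 24 * p per day, scheduled capacity
    costs hours * p * premium per day. `premium` is a hypothetical multiplier.
    """
    if premium <= 0:
        raise ValueError("premium must be positive")
    return 24 / premium


def scheduled_scaling_is_cheaper(hours_needed: float, premium: float = 1.5) -> bool:
    # 1.5 is the multiplier implied by a 16-hour break-even, not a published price.
    return hours_needed < break_even_hours(premium)
```

Under these assumptions, a team that needs extra capacity for 10 hours a day is better served by scheduled scaling, while one that needs it for 20 hours a day is not. Check the current pricing before relying on the numbers.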
Challenge 3: Large tasks causing OOM errors or blocking other tasks
Problem: Large tasks consume excessive resources, triggering OOM errors or blocking smaller tasks for extended periods.
The following workload types are commonly affected.
Batch ETL
A single batch task often requires significant resources and can block the queue for a long time.
| Priority | Solution |
|---|---|
| Stability | Run all batch ETL tasks on Serverless Computing. Configure this at the SQL or user level. See Run read and write tasks with Serverless Computing resources. |
| Balance stability and cost | Run small tasks on the primary virtual warehouse; route large tasks to Serverless Computing. |
Near-real-time ETL with Dynamic Table
Dynamic Table runs an incremental refresh every minute per table. The compute cost per refresh varies with incremental data volume, making it unpredictable.
| Priority | Solution |
|---|---|
| Stability | Run all Dynamic Table refresh tasks on Serverless Computing. See Create dynamic table. |
| Balance stability and cost | Route refresh tasks for large tables or tables with significant data fluctuations to Serverless Computing; run other tasks on the primary virtual warehouse. |
BI dashboards
Many dashboards run simultaneously, and large queries can block smaller ones.
| Priority | Solution |
|---|---|
| Balance stability, cost, and setup effort | Enable Adaptive Serverless Computing at the database or user level. Large tasks are automatically routed to Serverless Computing; small tasks remain in the virtual warehouse. |
| Balance stability and cost | Route specific large queries to Serverless Computing using SQL fingerprints. |
| Cost savings | Run all queries in the virtual warehouse and enable large query auto-rerun for the query queue. Timed-out and OOM queries automatically rerun on Serverless Computing without affecting the user experience. See Large query control. |
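The auto-rerun behavior in the cost-savings option amounts to a try-then-fall-back pattern. The following sketch illustrates the idea only. The exception classes and the `run_in_warehouse` and `run_in_serverless` callables are hypothetical stand-ins, not Hologres interfaces. In the real service the rerun is configured on the query queue rather than written in client code.

```python
class QueryTimeout(Exception):
    """Hypothetical error: the query exceeded the queue's time limit."""


class OutOfMemory(Exception):
    """Hypothetical error: the query ran out of memory in the virtual warehouse."""


def run_with_rerun(query, run_in_warehouse, run_in_serverless):
    """Try the virtual warehouse first; rerun on serverless after a timeout or OOM.

    Returns a (result, where_it_ran) tuple.
    """
    try:
        return run_in_warehouse(query), "virtual warehouse"
    except (QueryTimeout, OutOfMemory):
        # The caller sees only the final result, not the failed first attempt.
        return run_in_serverless(query), "serverless"
```

The trade-off is latency: a query that fails in the virtual warehouse pays for the failed attempt before it reruns, which is why the other options in the table route known large queries to Serverless Computing up front.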
Challenge 4: Sudden large ad-hoc queries destabilize the instance
Problem: Sporadic analytical queries create unpredictable, high resource usage that affects overall instance stability.
| Priority | Solution |
|---|---|
| Stability | Route all ad-hoc analytical requests to Serverless Computing at the user level. See Configure at the user level. |
| Balance stability and cost | Enable Adaptive Serverless Computing at the user level. Large tasks are routed to Serverless Computing automatically; small tasks run in the virtual warehouse. |
| Cost savings | Run all requests in the virtual warehouse and enable large query auto-rerun. See Large query control. |
Challenge 5: Different performance requirements across dashboard users
Problem: BI dashboards are accessed by different roles—data developers, operations staff, sales teams, and senior management—with varying performance expectations.
High-performance requirements
| Priority | Solution |
|---|---|
| Stability | Create a dedicated virtual warehouse for high-performance requests. |
| Balance stability and cost | Route all requests through Serverless Computing, or enable Adaptive Serverless Computing. |
Latency-tolerant workloads
| Request pattern | Solution |
|---|---|
| Fixed (a role consistently accesses the same dashboard) | Configure manual throttling: run a stress test to determine the virtual warehouse's read capacity, then set a fixed concurrency limit for the query queue. See Create a query queue. |
| Variable (multiple dashboards accessed by different roles) | Enable automatic throttling. Hologres automatically adjusts the query queue's concurrency limit based on current workload. See Automatic throttling for query queues (beta). |
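Manual throttling with a fixed concurrency limit can be pictured with a toy model. The `QueryQueue` class below is a hypothetical illustration, not the Hologres query queue implementation: it admits queries until the limit is reached and makes the rest wait in arrival order. Automatic throttling would adjust `max_concurrency` at run time, which this sketch does not model.

```python
from collections import deque


class QueryQueue:
    """Toy model of a query queue with a fixed concurrency limit."""

    def __init__(self, max_concurrency: int):
        if max_concurrency < 1:
            raise ValueError("max_concurrency must be at least 1")
        self.max_concurrency = max_concurrency
        self.running = set()
        self.waiting = deque()

    def submit(self, query_id: str) -> str:
        """Start the query if a slot is free; otherwise make it wait."""
        if len(self.running) < self.max_concurrency:
            self.running.add(query_id)
            return "running"
        self.waiting.append(query_id)
        return "queued"

    def finish(self, query_id: str) -> None:
        """Release a slot and promote the oldest waiting query, if any."""
        self.running.remove(query_id)
        if self.waiting:
            self.running.add(self.waiting.popleft())
```

With a limit of 2, the third submitted query is queued and starts only after one of the first two finishes. The stress test mentioned in the table is what tells you which limit the virtual warehouse can sustain.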
Challenge 6: Unexpected request surges
Problem: Sudden, unpredictable traffic spikes, such as an unexpected surge of nighttime queries, cannot be covered by scheduled scaling, which relies on a known timetable. Serverless Computing alone cannot handle these surges either, because the users, tables, and SQL statements involved are unknown in advance.
Solution: Enable auto-scaling. Virtual warehouses automatically scale out during peak loads and scale in when demand subsides. See Multi-cluster and auto scaling (beta).
Advanced settings
Configure task priorities for Serverless Computing
When multiple services share Serverless Computing resources, set priorities at the user level to control which tasks run first under resource contention. See Set priorities for Serverless Computing tasks.
Example:
| User type | Priority | Behavior when resources are scarce |
|---|---|---|
| Roles with high performance requirements | 5 | Executed first |
| Batch ETL tasks | 1 | Wait in queue |
| All other tasks | 3 (default) | Standard scheduling |
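The effect of the priorities in the table can be illustrated with a priority queue. This is a sketch of the scheduling idea only, not how Hologres schedules tasks internally. The `ServerlessScheduler` class is hypothetical, and the 1 to 5 range is an assumption based on the values shown in the table.

```python
import heapq
from itertools import count

DEFAULT_PRIORITY = 3  # default value from the table above


class ServerlessScheduler:
    """Toy scheduler: higher priority runs first, ties run in submission order."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker so earlier submissions go first

    def submit(self, task: str, priority: int = DEFAULT_PRIORITY) -> None:
        # Assumed range; the table only shows the values 1, 3, and 5.
        if not 1 <= priority <= 5:
            raise ValueError("priority must be between 1 and 5")
        # heapq is a min-heap, so negate the priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._order), task))

    def next_task(self) -> str:
        """Remove and return the task that should run next."""
        return heapq.heappop(self._heap)[2]
```

If a batch ETL task (priority 1), an ad-hoc query (default), and an executive dashboard query (priority 5) are all waiting, the dashboard query runs first and the batch ETL task runs last.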
Configure daily quotas for Serverless Computing
If multiple services use Serverless Computing, costs can be unpredictable. Set daily quotas at the instance level and per user to control costs. See Daily Usage Limit.
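A two-level quota check (instance-wide and per user) can be sketched as follows. The function name, its parameters, and the unit of usage are hypothetical. The sketch only shows how the two limits combine and does not reflect how Hologres meters or enforces quotas.

```python
def can_run_on_serverless(user, requested, used_by_user, user_quota, instance_quota):
    """Return True if the request fits both the per-user and the instance-level quota.

    `used_by_user` maps each user to today's usage; `user_quota` maps users to
    their daily limit. Units are arbitrary as long as they are consistent.
    """
    # Instance-level check: total usage across all users, plus this request.
    instance_used = sum(used_by_user.values())
    if instance_used + requested > instance_quota:
        return False
    # Per-user check: users without an entry are bound only by the instance quota.
    limit = user_quota.get(user)
    if limit is not None and used_by_user.get(user, 0) + requested > limit:
        return False
    return True
```

For instance, with `used_by_user = {"etl": 60, "bi": 30}`, a per-user limit of 80 for `etl`, and an instance limit of 100, an additional request of 10 from `etl` fits, while a request of 20 from `bi` is rejected by the instance-level limit.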
Enable high availability for read-only virtual warehouses
For read-only virtual warehouses that serve latency-sensitive workloads—especially online recommendation services—configure multiple shard-level replicas. If a query node fails, queries continue without data loss. See Concurrent queries with shard replicas.