Choose ECS instance types for ACK worker and master nodes by workload, fault tolerance, and cluster scale.
Instance types not supported by ACK
The following instance types cannot be used as ACK worker or master nodes.
General restrictions
| Unsupported instance family | Example | Reason |
|---|---|---|
| t5, burstable | ecs.t5-lc2m1.nano | Unstable performance may cause cluster instability |
| t6, burstable | ecs.t6-c4m1.large | Unstable performance may cause cluster instability |
| Instance types with fewer than 4 vCPU cores | ecs.g6.large | Specs too low for stable cluster operation |
| c6t, security-enhanced compute-optimized | ecs.c6t.large | Not supported |
| g6t, security-enhanced general-purpose | ecs.g6t.large | Not supported |
| Super Computing Cluster (SCC) | ecs.sccg7.32xlarge | Not supported |
To use low-specification instance types, submit a request in Quota Center.
For GPU workloads, see GPU-accelerated instance families supported by ACK.
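The general restrictions above can be expressed as a small pre-check. The following is a minimal sketch, not an ACK API: the function names are hypothetical, the family parsing is inferred only from the example names in the table, and the vCPU count must be supplied by the caller.

```python
from typing import Optional

# Families listed in the table above. The name parsing is an assumption
# based only on the example names shown there.
UNSUPPORTED_FAMILIES = {"t5", "t6", "c6t", "g6t"}
MIN_VCPUS = 4


def instance_family(instance_type: str) -> str:
    """Extract the family, e.g. 'ecs.t5-lc2m1.nano' -> 't5'."""
    parts = instance_type.split(".")
    if len(parts) != 3 or parts[0] != "ecs":
        raise ValueError(f"unexpected instance type name: {instance_type!r}")
    # Burstable names append a suffix after a hyphen (t5-lc2m1).
    return parts[1].split("-")[0]


def general_restriction(instance_type: str, vcpus: int) -> Optional[str]:
    """Return the reason a type is rejected, or None if no rule matches."""
    family = instance_family(instance_type)
    # SCC families are matched by prefix (sccg7), the others exactly.
    if family in UNSUPPORTED_FAMILIES or family.startswith("scc"):
        return f"family {family} is not supported"
    if vcpus < MIN_VCPUS:
        return f"{vcpus} vCPUs is below the minimum of {MIN_VCPUS}"
    return None


print(general_restriction("ecs.t5-lc2m1.nano", 1))
print(general_restriction("ecs.g6.large", 2))
print(general_restriction("ecs.g6.xlarge", 4))
```

A result of `None` only means that none of the general rules matched; the Terway restrictions still apply separately.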
Terway network plugin restrictions
With Terway, pod capacity per node depends on the number of elastic network interfaces (ENIs) the instance type supports. Instance types below the minimum pod threshold are incompatible.
- Inclusive ENI mode or Shared ENI + Trunk ENI mode: Pod limit per node must exceed 11. Formula: (Number of ENIs - 1) × Number of private IPs per ENI > 11
  Example: ecs.g6.large supports 2 ENIs with 6 private IPv4 addresses each. Pod limit = (2 - 1) × 6 = 6. This type is incompatible.
- Exclusive ENI mode: Pod limit per node must exceed 6. Formula: Number of ENIs - 1 > 6
  Example: ecs.g6.xlarge supports 3 ENIs. Pod limit = 3 - 1 = 2. This type is incompatible.
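The two formulas can be checked with a short script. This is a sketch under the thresholds stated above; the function and mode names are hypothetical, and the ENI and IP counts are the ones quoted in the examples.

```python
# Minimum pod limits from the text: the limit must EXCEED these values.
SHARED_THRESHOLD = 11     # inclusive ENI or Shared ENI + Trunk ENI mode
EXCLUSIVE_THRESHOLD = 6   # exclusive ENI mode


def shared_eni_pod_limit(enis: int, ips_per_eni: int) -> int:
    """(Number of ENIs - 1) x private IPs per ENI."""
    return (enis - 1) * ips_per_eni


def exclusive_eni_pod_limit(enis: int) -> int:
    """Number of ENIs - 1."""
    return enis - 1


def terway_compatible(mode: str, enis: int, ips_per_eni: int = 0) -> bool:
    """Return True if the pod limit exceeds the threshold for the mode."""
    if mode == "shared":
        return shared_eni_pod_limit(enis, ips_per_eni) > SHARED_THRESHOLD
    if mode == "exclusive":
        return exclusive_eni_pod_limit(enis) > EXCLUSIVE_THRESHOLD
    raise ValueError(f"unknown mode: {mode!r}")


# ecs.g6.large: 2 ENIs, 6 private IPv4 addresses each
print(shared_eni_pod_limit(2, 6), terway_compatible("shared", 2, 6))
# ecs.g6.xlarge: 3 ENIs
print(exclusive_eni_pod_limit(3), terway_compatible("exclusive", 3))
```

Both comparisons are strict, so a pod limit of exactly 11 (shared) or 6 (exclusive) still counts as incompatible.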
For compatible instance types by Terway mode, see Use the Terway network plugin.
Why fewer, larger instances perform better
Every node reserves CPU, memory, and disk for cluster management. On small instances, these reservations consume a larger share of total capacity, leaving less room for workloads.
Fragmentation compounds this: after allocating resources to a container, the remainder on a small instance may be too small for another container and sit idle.
Large instances reduce both problems and bring two further advantages:
- Better network efficiency: More containers share a single instance, reducing cross-node traffic. Large instances also provide higher network bandwidth.
- Faster image pulls: A container image is pulled once per node and shared across containers. With many small instances, the same image is pulled on each, slowing scale-out.
Select worker node specifications
Minimum specifications
Use instance types with at least 4 CPU cores and 8 GB memory.
Sizing for fault tolerance
Calculate the total CPU cores required for your daily workload, then size nodes to absorb instance failures without service disruption.
Example for a cluster with 160 CPU cores in total:
| Fault tolerance target | Node count | Node size | Max operating load |
|---|---|---|---|
| 10% (one node can fail) | at least 10 | 16 CPU cores | 144 cores (160 × 90%) |
| 20% (one node can fail) | at least 5 | 32 CPU cores | 128 cores (160 × 80%) |
If one instance fails, the remaining instances continue handling the peak load.
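The arithmetic behind the table can be reproduced as follows. This is a minimal sketch; the function names are hypothetical, and it assumes equally sized nodes and that the fault tolerance target is the share of capacity lost when one node fails.

```python
import math


def min_node_count(tolerance_percent: int) -> int:
    """Fewest equal nodes so that one node is at most this share of capacity."""
    return math.ceil(100 / tolerance_percent)


def max_operating_load(total_cores: int, tolerance_percent: int) -> int:
    """Cores that can be in use while still surviving the loss of one node."""
    return total_cores * (100 - tolerance_percent) // 100


TOTAL_CORES = 160  # total from the example table

for pct in (10, 20):
    nodes = min_node_count(pct)
    print(
        f"{pct}% tolerance: at least {nodes} nodes of "
        f"{TOTAL_CORES // nodes} cores, max load "
        f"{max_operating_load(TOTAL_CORES, pct)} cores"
    )
```

Keeping the running load at or below the computed maximum leaves exactly one node's worth of capacity free in each row of the table.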
CPU-to-memory ratio
Select the ratio that matches your workload type:
- 1:2 or 1:4: General-purpose workloads
- 1:8: Memory-intensive applications such as Java services
GPU workloads
For stable scheduling, do not mix GPU and non-GPU instance types in the same node pool.
Persistent memory-optimized instances
Instance families such as re6p use a hybrid architecture combining regular and persistent memory. To enable persistent storage, see Non-volatile memory volumes. See Instance families for supported types.
Large-scale clusters: ECS Bare Metal Instances
When daily usage reaches approximately 1,000 CPU cores, use ECS Bare Metal Instances. A single instance provides at least 96 CPU cores, so a 1,000-core cluster requires only 10–11 nodes. See ECS Bare Metal Instances.
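The node count follows from a ceiling division. In this sketch the function name is hypothetical, and the 104-core size is an illustrative value chosen only to show the lower end of the range, not a specific instance type.

```python
import math


def nodes_needed(total_cores: int, cores_per_node: int) -> int:
    """Smallest number of nodes whose combined cores cover total_cores."""
    return math.ceil(total_cores / cores_per_node)


print(nodes_needed(1000, 96))   # at the 96-core minimum
print(nodes_needed(1000, 104))  # hypothetical larger instance size
```

At the 96-core minimum the result is 11 nodes; any instance with 100 or more cores brings it down to 10.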
Select master node specifications
Master nodes run etcd, kube-apiserver, and kube-controller-manager. In production ACK dedicated clusters, size master nodes to match cluster scale.
Cluster size here is measured by node count. In practice, pod count, deployment frequency, or request volume are also valid sizing metrics.
Use small instances for testing only. For production clusters, select master node specifications from the following table.
| Number of nodes | Recommended master node specifications |
|---|---|
| 1–5 | 4 CPU cores, 8 GB memory (2 cores/4 GB or lower not recommended) |
| 6–20 | 4 CPU cores, 16 GB memory |
| 21–100 | 8 CPU cores, 32 GB memory |
| 101–200 | 16 CPU cores, 64 GB memory |
| 201–500 (assess blast radius risk) | 64 CPU cores, 128 GB memory |
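The table can be turned into a simple lookup. The sketch below uses a hypothetical function name, treats each row's upper bound as inclusive (so a boundary value such as 100 nodes falls into the smaller tier, which is an assumption), and rejects node counts outside the table instead of extrapolating.

```python
from typing import Tuple

# (max node count, CPU cores, memory in GB), copied from the table above.
MASTER_TIERS = [
    (5, 4, 8),
    (20, 4, 16),
    (100, 8, 32),
    (200, 16, 64),
    (500, 64, 128),
]


def recommended_master_spec(node_count: int) -> Tuple[int, int]:
    """Return (CPU cores, memory GB) for a cluster with node_count nodes."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    for max_nodes, cores, memory_gb in MASTER_TIERS:
        if node_count <= max_nodes:
            return cores, memory_gb
    # The table stops at 500 nodes; do not guess beyond it.
    raise ValueError("node_count exceeds the range covered by the table")


for n in (3, 20, 150, 500):
    cores, memory_gb = recommended_master_spec(n)
    print(f"{n} nodes: {cores} CPU cores, {memory_gb} GB memory")
```

Node count is only one sizing metric, so the result is a starting point rather than a final specification.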
ECS Bare Metal Instances
Built on Alibaba Cloud's virtualization 2.0, ECS Bare Metal Instances combine VM elasticity with physical server performance and support nested virtualization.
ECS Bare Metal Instances suit dedicated compute, encrypted computing, and hybrid cloud deployments. See Overview of ECS Bare Metal Instances for supported instance families.
When to use ECS Bare Metal Instances:
- Large cluster scale: When daily usage reaches approximately 1,000 CPU cores, 10–11 nodes are enough because each instance provides at least 96 cores.
- Traffic spikes requiring rapid scale-out: ECS Bare Metal Instances outperform physical servers of equivalent specifications and can scale to millions of vCPUs for sudden traffic spikes, such as e-commerce promotions.