
Container Service for Kubernetes:Recommended ECS instance specifications

Last Updated: Mar 26, 2026

Select appropriate Elastic Compute Service (ECS) instance types for your cluster nodes to ensure cluster stability and reliability. This topic describes the recommended ECS instance specifications for creating an Alibaba Cloud Container Service for Kubernetes (ACK) cluster.

Instance types not supported by ACK

Before selecting instance types, check whether they are compatible with ACK. The following instance types cannot be used as worker or master nodes.

General restrictions

Unsupported instance family | Example | Reason
t5 (burstable) | ecs.t5-lc2m1.nano | Unstable performance may cause cluster instability
t6 (burstable) | ecs.t6-c4m1.large | Unstable performance may cause cluster instability
Instance types with fewer than 4 vCPU cores | ecs.g6.large | Specifications too low for stable cluster operation
c6t (security-enhanced compute-optimized) | ecs.c6t.large | Not supported
g6t (security-enhanced general-purpose) | ecs.g6t.large | Not supported
Super Computing Cluster (SCC) | ecs.sccg7.32xlarge | Not supported
To use low-specification ECS instance types for clusters and node pools, submit a request in Quota Center.

For GPU-accelerated workloads, see GPU-accelerated instance families supported by ACK.
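The general restrictions above can be expressed as a simple pre-flight check. The sketch below is illustrative only: the family list mirrors the table in this topic, and the vCPU count is assumed to come from elsewhere (for example, the ECS DescribeInstanceTypes API), not from the type name.

```python
# Illustrative pre-flight check for ACK node instance types.
# The unsupported-family list mirrors the table above and is not exhaustive.

UNSUPPORTED_FAMILIES = {"t5", "t6", "c6t", "g6t", "sccg7"}

def is_supported(instance_type: str, vcpus: int) -> bool:
    """Return True if the type passes the general restrictions above."""
    # Instance types look like "ecs.<family>.<size>" or
    # "ecs.<family>-<suffix>.<size>", e.g. "ecs.t5-lc2m1.nano".
    family = instance_type.split(".")[1].split("-")[0]
    if family in UNSUPPORTED_FAMILIES:
        return False
    return vcpus >= 4  # fewer than 4 vCPU cores is rejected

print(is_supported("ecs.t5-lc2m1.nano", 1))  # False: burstable family
print(is_supported("ecs.g6.large", 2))       # False: fewer than 4 vCPUs
print(is_supported("ecs.g6.xlarge", 4))      # True
```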

Terway network plugin restrictions

If you use the Terway network plugin, the maximum number of pods per node depends on the number of Elastic Network Interfaces (ENIs) the instance type supports. Instance types that cannot meet the minimum pod count threshold cannot be used.

  • Shared ENI mode or Shared ENI + Trunk ENI mode: the pod limit per node must exceed 11.
    Formula: (Number of ENIs - 1) × Number of private IPs per ENI > 11
    Example: ecs.g6.large supports 2 ENIs with 6 private IPv4 addresses each, so the pod limit is (2 - 1) × 6 = 6. Because 6 does not exceed 11, this instance type cannot be used.

  • Exclusive ENI mode: the pod limit per node must exceed 6.
    Formula: Number of ENIs - 1 > 6
    Example: ecs.g6.xlarge supports 3 ENIs, so the pod limit is 3 - 1 = 2. Because 2 does not exceed 6, this instance type cannot be used.
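The two formulas above can be sketched as small helpers. The thresholds (11 and 6) and the worked values are taken directly from the rules in this section; the ENI and IP counts for a given instance type would come from the ECS instance type specifications.

```python
# Terway pod-capacity check, per the formulas above.

def shared_eni_pod_limit(enis: int, ips_per_eni: int) -> int:
    # Shared ENI (or Shared ENI + Trunk ENI) mode:
    # one ENI is reserved for the node itself.
    return (enis - 1) * ips_per_eni

def exclusive_eni_pod_limit(enis: int) -> int:
    # Exclusive ENI mode: each pod gets its own ENI,
    # minus the one reserved for the node.
    return enis - 1

# ecs.g6.large: 2 ENIs x 6 private IPv4 addresses each.
print(shared_eni_pod_limit(2, 6) > 11)   # False -> cannot be used
# ecs.g6.xlarge: 3 ENIs.
print(exclusive_eni_pod_limit(3) > 6)    # False -> cannot be used
```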

For a complete list of compatible instance types by Terway mode, see Use the Terway network plugin.

Why fewer, larger instances perform better

Using many small ECS instances creates compounding problems at scale. The system reserves CPU, memory, and disk resources on every node for cluster management components. On small instances, this reservation consumes a significant share of total capacity, leaving less room for workloads.

Resource fragmentation compounds this problem. After a container is allocated CPU and memory, the remaining resources on a small instance may be too small to run another container. Those resources sit idle but cannot be reclaimed.

Fewer, larger instances reduce this overhead and fragmentation, and bring additional benefits:

  • Better network efficiency: More containers communicate within a single instance, reducing cross-node traffic. Large instances also provide higher network bandwidth for bandwidth-intensive applications.

  • Faster image pulls: On a large instance, a container image is pulled once and shared across all containers on that node. With many small instances, the same image must be pulled once per instance, slowing scale-out.

Select worker node specifications

Minimum specifications

Use instance types with at least 4 CPU cores and 8 GB of memory.

Sizing for fault tolerance

Calculate the total CPU cores required for your daily workload, then size nodes to absorb instance failures without service disruption.

Example:

Fault tolerance target | Node count | Node size | Max operating load
10% (one node can fail) | at least 10 | 16 CPU cores | 144 cores (160 × 90%)
20% (one node can fail) | at least 5 | 32 CPU cores | 128 cores (160 × 80%)

If one instance fails in either configuration, the remaining instances continue handling the peak load.
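The sizing rule above can be sketched as follows: provision one node beyond what the peak load requires, so that capacity survives a single instance failure. The `sizing` helper is an assumption of this topic's arithmetic, not an ACK tool.

```python
import math

def sizing(peak_cores: int, node_cores: int) -> tuple[int, int]:
    """Return (node_count, max_operating_load) such that losing one
    node still leaves enough capacity for the peak load."""
    # After one node fails, (n - 1) * node_cores must cover the peak,
    # so provision one node more than the peak strictly requires.
    n = math.ceil(peak_cores / node_cores) + 1
    max_load = (n - 1) * node_cores  # capacity with one node down
    return n, max_load

# 144-core peak on 16-core nodes: 10 nodes, usable 144 (160 x 90%)
print(sizing(144, 16))  # (10, 144)
# 128-core peak on 32-core nodes: 5 nodes, usable 128 (160 x 80%)
print(sizing(128, 32))  # (5, 128)
```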

CPU-to-memory ratio

Select the ratio that matches your workload type:

  • 1:2 or 1:4: General-purpose workloads

  • 1:8: Memory-intensive applications such as Java services

GPU workloads

To maintain service stability and ensure accurate resource scheduling, do not mix GPU-accelerated and non-GPU instance types in the same node pool.

Persistent memory-optimized instances

Instance families such as re6p use a hybrid memory architecture combining regular memory and persistent memory. To enable persistent storage on these nodes, see Non-volatile memory volumes. For more information, see Instance families.

Large-scale clusters: ECS Bare Metal Instances

At approximately 1,000 CPU cores of daily scale, use ECS Bare Metal Instances. A single ECS Bare Metal Instance provides at least 96 CPU cores, so a 1,000-core cluster requires only 10–11 instances. For more information, see ECS Bare Metal Instances.
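The instance count above follows from simple division. Assuming the 96-core minimum stated here (larger bare metal types lower the count toward 10):

```python
import math

# Instances needed for a given daily core scale, assuming the
# 96-core minimum per ECS Bare Metal Instance stated above.
def bare_metal_count(total_cores: int, cores_per_instance: int = 96) -> int:
    return math.ceil(total_cores / cores_per_instance)

print(bare_metal_count(1000))  # 11 at exactly 96 cores per instance
```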

Select master node specifications

Master nodes run etcd, kube-apiserver, and kube-controller-manager. For production ACK dedicated clusters, master node specifications must match cluster scale: larger clusters require higher-spec master nodes.

Cluster size in this topic is measured by the number of nodes. In practice, cluster size can also be measured by pod count, deployment frequency, or request volume.

Use small instances for personal testing and learning only. For production clusters, select master node specifications based on the following table.

Number of nodes | Recommended master node specifications
1–5 | 4 CPU cores, 8 GB memory (2 cores/4 GB or lower not recommended)
6–20 | 4 CPU cores, 16 GB memory
21–100 | 8 CPU cores, 32 GB memory
101–200 | 16 CPU cores, 64 GB memory
201–500 (assess blast radius risk) | 64 CPU cores, 128 GB memory
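The table above amounts to a banded lookup. A minimal sketch, taking each band's upper bound as inclusive (the band edges here mirror the table, not an ACK API):

```python
# Recommended master specs by node count, mirroring the table above.
MASTER_SPECS = [
    (5, "4 CPU cores, 8 GB memory"),
    (20, "4 CPU cores, 16 GB memory"),
    (100, "8 CPU cores, 32 GB memory"),
    (200, "16 CPU cores, 64 GB memory"),
    (500, "64 CPU cores, 128 GB memory"),
]

def recommended_master_spec(node_count: int) -> str:
    for max_nodes, spec in MASTER_SPECS:
        if node_count <= max_nodes:
            return spec
    raise ValueError("beyond 500 nodes, assess cluster architecture separately")

print(recommended_master_spec(50))   # 8 CPU cores, 32 GB memory
print(recommended_master_spec(150))  # 16 CPU cores, 64 GB memory
```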

ECS Bare Metal Instances

ECS Bare Metal Instances are built on Alibaba Cloud's virtualization 2.0 technology. They combine the elasticity of virtual machines with the performance and features of physical servers, and support nested virtualization.

ECS Bare Metal Instances are suited for dedicated compute resources, encrypted computing, and hybrid cloud deployments. For an overview and supported instance families, see Overview of ECS Bare Metal Instances.

When to use ECS Bare Metal Instances:

  • Large cluster scale: At approximately 1,000 CPU cores of daily scale, each ECS Bare Metal Instance contributes at least 96 CPU cores, so you can build the cluster with only 10–11 instances.

  • Traffic spikes requiring rapid scale-out: ECS Bare Metal Instances deliver better performance than physical servers with the same specifications and can provide millions of vCPUs to handle sudden traffic increases, for example during e-commerce sales promotions.