
Container Service for Kubernetes: Specify the compute specification when creating an elastic container instance-based pod

Last Updated: Mar 26, 2026

Elastic Container Instance (ECI) lets you run pods on fully managed, serverless infrastructure in ACK Serverless clusters — no node management required. This page covers the compute specifications ECI supports and how to choose the right creation method for your workload.

How it works

When you create an ECI pod, specify compute resources using the k8s.aliyun.com/eci-use-specs annotation. ECI selects a matching instance and runs your pod on it. After the pod starts, check the k8s.aliyun.com/eci-instance-spec annotation in the pod's YAML to see which specification was actually used; this determines your billing.

  • If ECI uses an ECS instance type, you are billed based on that instance type.

  • If ECI uses vCPU and memory specs, you are billed based on the vCPU count and memory size.

Limitations

Before creating ECI pods, review these constraints:

  • GPU, local disk, and Arm: If your pod requires GPU acceleration, local disks, or Arm architecture, specify only the corresponding instance types. Mixing these with non-matching specs is not supported.

  • Preemptible instances: Certain vCPU/memory specifications (available only in some regions) cannot be used to create preemptible instances.

  • Architecture: ECI defaults to the x86 architecture. For Arm-based pods, see Schedule pods to an Arm-based virtual node.

  • Annotation placement: Add k8s.aliyun.com/eci-use-specs to metadata in the pod spec. In a Deployment, add it to spec.template.metadata.
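
To make the placement rule concrete, the same annotation lands in different places for a bare pod and a Deployment. A minimal sketch (the names and image below are illustrative):

```yaml
# Bare pod: the annotation goes directly under metadata.annotations.
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod             # hypothetical name
  annotations:
    k8s.aliyun.com/eci-use-specs: "2-4Gi"
spec:
  containers:
  - name: app
    image: registry.cn-shanghai.aliyuncs.com/eci_open/nginx:1.14.2
---
# Deployment: the annotation goes under spec.template.metadata.annotations,
# not under the Deployment's own metadata, so that it is applied to each pod.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-deploy          # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
      annotations:
        k8s.aliyun.com/eci-use-specs: "2-4Gi"
    spec:
      containers:
      - name: app
        image: registry.cn-shanghai.aliyuncs.com/eci_open/nginx:1.14.2
```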

Specify compute resources using k8s.aliyun.com/eci-use-specs

Use this annotation to declare the compute resources your pod needs. ECI attempts each specification in priority order (first to last) until the pod is created successfully.

| Annotation key | Required | Value format | Notes |
| --- | --- | --- | --- |
| k8s.aliyun.com/eci-use-specs | Yes | Comma-separated list of specs | Up to 5 specs. Specify vCPU/memory (e.g., 2-4Gi), ECS instance types (e.g., ecs.c6.large), or a mix of both. |

Example 1: Specify GPU instance types

Use GPU instance types and set the nvidia.com/gpu resource limit per container. GPUs are shared across containers in the pod.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
  labels:
    app: test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      name: nginx-test
      labels:
        app: nginx
        alibabacloud.com/eci: "true"
      annotations:
        k8s.aliyun.com/eci-use-specs: "ecs.gn6i-c4g1.xlarge,ecs.gn6i-c8g1.2xlarge" # Specify a maximum of five GPU-accelerated ECS instance types at a time.
    spec:
      containers:
      - name: nginx
        image: registry.cn-shanghai.aliyuncs.com/eci_open/nginx:1.14.2
        resources:
          limits:
            nvidia.com/gpu: "1" # The number of GPUs required by the NGINX container. The GPUs are shared.
        ports:
        - containerPort: 80
      - name: busybox
        image: registry.cn-shanghai.aliyuncs.com/eci_open/busybox:1.30
        command: ["sleep"]
        args: ["999999"]
        resources:
          limits:
            nvidia.com/gpu: "1" # The number of GPUs required by the BusyBox container. The GPUs are shared.

Example 2: Mix vCPU/memory specs and ECS instance types

List specs in priority order. ECI tries them left to right until one succeeds.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
  labels:
    app: test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      name: nginx-test
      labels:
        app: nginx
        alibabacloud.com/eci: "true"
      annotations:
        k8s.aliyun.com/eci-use-specs: "2-4Gi,ecs.c5.large,ecs.c6.large"  # Specifications to try in priority order. Replace with the specifications that you want to use.
    spec:
      containers:
      - name: nginx
        image: registry.cn-shanghai.aliyuncs.com/eci_open/nginx:1.14.2
        ports:
        - containerPort: 80

Verify the spec used

After the pod starts, check the k8s.aliyun.com/eci-instance-spec annotation in the pod's YAML to confirm which specification ECI actually allocated and verify your billing basis.
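
For example, in the output of kubectl get pod <pod-name> -o yaml, the relevant fragment might look like the following. The annotation values are illustrative; the exact format of the allocated spec can vary:

```yaml
# Fragment of the pod YAML returned by `kubectl get pod <pod-name> -o yaml`
metadata:
  annotations:
    k8s.aliyun.com/eci-use-specs: "2-4Gi,ecs.c5.large,ecs.c6.large"  # what you requested, in priority order
    k8s.aliyun.com/eci-instance-spec: "2-4Gi"                        # what ECI actually allocated; billing follows this
```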

Supported compute specifications

vCPU and memory specifications

Available in all regions

| vCPU | Memory (GiB) | Bandwidth (bidirectional, Gbit/s, theoretical upper limit) |
| --- | --- | --- |
| 0.25 | 0.5, 1 | 0.08 |
| 0.5 | 1, 2 | 0.08 |
| 1 | 2, 4, 8 | 0.1 |
| 2 | 1, 2, 4, 8, 16 | 1 |
| 4 | 2, 4, 8, 16, 32 | 1.5 |
| 8 | 4, 8, 16, 32, 64 | 2 |
| 12 | 12, 24, 48, 96 | 2.5 |
| 16 | 16, 32, 64, 128 | 3 |
| 24 | 24, 48, 96, 192 | 4.5 |
| 32 | 32, 64, 128, 256 | 6 |
| 52 | 96, 192, 384 | 12.5 |
| 64 | 128, 256, 512 | 20 |

Available in some regions only

Important

These specs cannot be used for preemptible instances. Supported regions: China (Hangzhou), China (Shanghai), China (Qingdao), China (Beijing), China (Zhangjiakou), China (Hohhot), China (Ulanqab), China (Shenzhen), China (Heyuan), China (Guangzhou), China (Chengdu), and Singapore. If the spec is not available in the selected region and zone, the pod cannot be created.

| vCPU | Memory (GiB) | Bandwidth (bidirectional, Gbit/s, theoretical upper limit) |
| --- | --- | --- |
| 2 | 6, 10, 12, 14 | 1 |
| 4 | 6, 10, 12, 14, 18, 20, 22, 24, 26, 28, 30 | 1.5 |
| 6 | 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48 | 1.5 |
| 8 | 10, 12, 14, 18, 20, 22, 24, 26, 28, 30, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62 | 2.5 |
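
If a workload relies on one of these region-limited combinations, it can help to append an all-regions fallback so creation still succeeds elsewhere. A sketch (pod name and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: regional-spec-demo   # hypothetical name
  annotations:
    # 4-10Gi is available only in the regions listed above; 4-8Gi (available
    # in all regions) is appended as a fallback.
    k8s.aliyun.com/eci-use-specs: "4-10Gi,4-8Gi"
spec:
  containers:
  - name: app
    image: registry.cn-shanghai.aliyuncs.com/eci_open/nginx:1.14.2
```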

ECS instance families

x86 enterprise-level

x86 instance types use the x86 architecture, with each vCPU corresponding to one hyper-thread of a processor core. They deliver stable performance and suit enterprise applications, databases, video encoding/decoding, and data analysis.

| Category | Instance families |
| --- | --- |
| General-purpose | g8a, g8i, g7a, g7, g6e, g6a, g6, g5, sn2, sn2ne |
| Compute-optimized | c8a, c8i, c7a, c7, c6e, c6a, c6, c5, sn1, sn1ne |
| Memory-optimized | r8a, r8i, r7a, r7, r6e, r6a, r6, r5, se1ne, se1 |
| General compute | u1 |
| Compute-intensive | ic5 |
| High clock speed | hfg8i, hfg7, hfg6, hfg5; hfc8i, hfc7, hfc6, hfc5; hfr8i, hfr7 |
| Big data | d1, d1ne |
| Local SSDs | i2, i2g |

GPU-accelerated

GPU-accelerated instance types contain GPUs and suit deep learning and image processing workloads. Container images that use GPUs can run directly on GPU-accelerated elastic container instances. An NVIDIA GPU driver is pre-installed; supported driver and CUDA versions vary by GPU type.

The gn8ia and gn8is instance families are available only in specific regions outside the Chinese mainland. Contact Alibaba Cloud sales to use them.

| Category | Instance families | Driver and CUDA versions |
| --- | --- | --- |
| vGPU-accelerated | sgn7i-vws, vgn7i-vws, vgn6i-vws | NVIDIA 470.161.03, CUDA 11.4 |
| GPU compute-optimized | gn7e, gn7i, gn7s, gn7, gn6v, gn6e, gn6i, gn5i, gn5 | NVIDIA 470.82.01 + CUDA 11.4 (default); NVIDIA 535.161.08 + CUDA 12.2 |
| GPU compute-optimized | gn8ia, gn8is | NVIDIA 535.161.08, CUDA 12.2 |

Arm-based enterprise-level

Arm instance types use the Arm architecture, with each vCPU corresponding to a physical core. They provide exclusive resources with stable performance and suit containers, microservices, web servers, high-performance computing, and CPU-based machine learning.

| Category | Instance families |
| --- | --- |
| General-purpose | g8y |
| Compute-optimized | c8y |
| Memory-optimized | r8y |

x86 shared computing

Shared instance types suit small and medium-sized websites and individual users. Unlike enterprise-level instance types, they share underlying physical resources to maximize utilization, which lowers cost at the expense of consistently stable computing performance.

| Category | Instance families |
| --- | --- |
| Economy | e |

For details on instance families, pricing, and regional availability, see:

Choose a creation method

ECI supports three creation methods. Choose based on your resource and billing requirements.

| Creation method | Billing basis | Use when |
| --- | --- | --- |
| Specify vCPU and memory size | vCPU + memory specs | You want flexible resource sizing and pay for what you request. If you specify an unsupported vCPU/memory combination, the system adjusts it to a supported one and bills you for the adjusted specification. |
| Specify an ECS instance type | ECS instance type | You need specific hardware capabilities, such as GPU acceleration or local disks. |
| Specify ECS instance families or generations as filters (with vCPU/memory) | Actual ECS instance type selected by ECI | You want vCPU/memory flexibility but need to constrain which instance families ECI selects. |

For detailed steps for each method, see:

Reduce costs

Preemptible instances

Preemptible instances can reduce costs significantly for stateless and fault-tolerant workloads. See Create a preemptible elastic container instance.
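
As a sketch, preemptible capacity is requested through ECI spot annotations such as k8s.aliyun.com/eci-spot-strategy (and, for the price-capped mode, k8s.aliyun.com/eci-spot-price-limit). Confirm the exact annotation keys and values against the linked topic for your ECI version; the pod name below is illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: spot-demo            # hypothetical name
  annotations:
    k8s.aliyun.com/eci-use-specs: "2-4Gi"
    # SpotAsPriceGo bids at the current market price; alternatively,
    # SpotWithPriceLimit caps the hourly price at eci-spot-price-limit.
    k8s.aliyun.com/eci-spot-strategy: "SpotAsPriceGo"
spec:
  containers:
  - name: app
    image: registry.cn-shanghai.aliyuncs.com/eci_open/nginx:1.14.2
```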

Reserved instances and savings plans

For long-running stable workloads, offset your ECI bills with reserved instances or savings plans:

  • vCPU/memory-billed pods: Use general-purpose savings plans.

  • ECS instance type-billed pods: Use general-purpose savings plans, ECS compute savings plans, or reserved instances.

See Use reserved instances and Use savings plans.

Handle insufficient resources

ECI draws from regional cloud resource pools. When you create many pods at once, a region or zone may run low on a particular resource. To improve the pod creation success rate, specify multiple specifications and multiple vSwitches across different zones.
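
A sketch of this fallback pattern, combining multiple specifications with multiple vSwitches. The vSwitch IDs are placeholders, and the k8s.aliyun.com/eci-vswitch annotation is an assumption here; confirm that pod-level vSwitch selection is supported by your ECI/virtual-node version:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: multi-zone-demo      # hypothetical name
  annotations:
    # Multiple specifications: ECI falls back left to right when stock is low.
    k8s.aliyun.com/eci-use-specs: "ecs.c6.large,ecs.c5.large,2-4Gi"
    # Multiple vSwitches in different zones widen the pool of available
    # resources. Placeholder vSwitch IDs; verify the annotation key for
    # your ECI version.
    k8s.aliyun.com/eci-vswitch: "vsw-aaa,vsw-bbb"
spec:
  containers:
  - name: app
    image: registry.cn-shanghai.aliyuncs.com/eci_open/nginx:1.14.2
```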

What's next