Elastic Remote Direct Memory Access (eRDMA) is a high-performance RDMA network service built on the fourth-generation X-Dragon architecture and VPC networks. It delivers low latency and high throughput for containerized workloads, and is fully compatible with the RDMA ecosystem at scale.
This topic describes how to install the ACK eRDMA Controller and enable eRDMA acceleration in pods, with step-by-step guidance for two common scenarios: GPU distributed training with NCCL and transparent TCP acceleration with SMC-R.
Prerequisites
Before you begin, ensure that you have:
- An ACK cluster running Kubernetes 1.20 or later. To upgrade, see Upgrade a cluster.
- Nodes that support elastic RDMA Interface (ERI) added to a node pool. ERIs can only be bound to ECS instances in specific instance families. For supported instance families, see Instance family overview.
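The version prerequisite can be checked from a shell. The following is a minimal sketch, not an official tool: the function name k8s_at_least_1_20 is hypothetical, and it assumes a version string of the form v<major>.<minor>.<patch> with an optional suffix, as found in the serverVersion.gitVersion field of kubectl version -o json. The usage comment also assumes that jq is installed.

```shell
# Hypothetical helper: succeeds when a Kubernetes version string is 1.20 or later.
# Accepts strings such as "v1.28.3-aliyun.1" or "1.20.0".
k8s_at_least_1_20() {
    v=${1#v}                  # drop a leading "v" if present
    major=${v%%.*}            # text before the first dot
    rest=${v#*.}              # text after the first dot
    minor=${rest%%.*}         # text before the next dot
    minor=${minor%%[!0-9]*}   # keep leading digits only (e.g. "20+" -> "20")
    [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "${minor:-0}" -ge 20 ]; }
}

# Usage (assumes kubectl is configured for the cluster and jq is installed):
#   ver=$(kubectl version -o json | jq -r '.serverVersion.gitVersion')
#   k8s_at_least_1_20 "$ver" && echo "version OK: $ver" || echo "upgrade needed: $ver"
```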
How it works
The ACK eRDMA Controller manages eRDMA-capable elastic network interfaces (ENIs) attached to cluster nodes. It registers eRDMA devices with the kubelet as the aliyun/erdma extended resource, configures routes for those ENIs, and exposes the devices to pods through Kubernetes resource requests.
Once installed, pods request eRDMA access by declaring aliyun/erdma in their resource limits. To additionally accelerate existing TCP connections without code changes, enable SMC-R through a pod annotation.
| Component | Role |
|---|---|
| ACK eRDMA Controller | Registers eRDMA devices as extended resources; manages ENI routes |
| eRDMA driver (default / compat / ofed) | Kernel-level driver installed on each node |
| aliyun/erdma resource | Kubernetes extended resource that pods request to get device access |
| SMC-R | Linux protocol that transparently replaces TCP with RDMA transport |
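Because the controller registers devices as an extended resource, an eRDMA-capable node should report aliyun/erdma among its allocatable resources once the controller is installed. The sketch below filters a node listing for nodes that expose no eRDMA device. The function name nodes_without_erdma is hypothetical, and the input is assumed to be the two-column output of the kubectl command in the usage comment, where kubectl prints <none> for a missing value.

```shell
# Hypothetical helper: reads "NODE ERDMA" rows on stdin (with one header row)
# and prints the names of nodes whose aliyun/erdma count is missing or zero.
nodes_without_erdma() {
    awk 'NR > 1 && ($2 == "<none>" || $2 == "0") { print $1 }'
}

# Usage (assumes kubectl is configured for the cluster):
#   kubectl get nodes \
#     -o custom-columns='NODE:.metadata.name,ERDMA:.status.allocatable.aliyun/erdma' \
#     | nodes_without_erdma
```

An empty result suggests that every listed node exposes at least one device. Nodes from instance families without ERI support are expected to appear in the output.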
Step 1: Install the ACK eRDMA Controller
If your cluster uses the Terway network plugin, configure a whitelist for Terway before or after installing the controller. The whitelist prevents Terway from modifying eRDMA-capable ENIs. For details, see Configure a whitelist for ENIs.
- On the Clusters page, click the name of your cluster. In the left navigation pane, click Add-ons.
- On the Add-ons page, click the Network tab. Find ACK eRDMA Controller and follow the prompts to configure and install it. Configure the following settings during installation:
  | Setting | Description |
  |---|---|
  | preferDriver (driver type) | The eRDMA driver mode for cluster nodes. Choose based on your workload: default for standard acceleration, compat for RoCE-compatible environments, ofed for GPU-accelerated instance types running NCCL. For details, see Enable eRDMA. |
  | Specifies whether to assign all eRDMA devices of nodes to pods | True: allocates all eRDMA devices on the node to the pod. False: allocates one eRDMA device based on NUMA topology. When set to False, the node must have the static CPU management policy enabled to guarantee fixed NUMA allocation. See Create and manage node pools. |

  Note: When a node has multiple NICs, the ACK eRDMA Controller assigns routes for eRDMA-capable ENIs a lower priority than routes for NICs in the same CIDR block. The default route priority is 200. If you configure NIC routes manually after installation, avoid route conflicts with this priority range.
- Verify that the controller is running. In the left navigation pane, go to Workloads > Pods. Set the namespace filter to ack-erdma-controller and confirm all pods show a running status.
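The same verification can be scripted instead of using the console. The following sketch reads the output of kubectl get pods and prints any pod whose status is not Running. The function name pods_not_running is hypothetical; the namespace is the one named in the verification step above, and the column layout (NAME, READY, STATUS, and so on) is assumed to be the default kubectl table output.

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` rows on stdin,
# prints pods whose STATUS column is not "Running", and fails if any exist.
pods_not_running() {
    awk '$3 != "Running" { print $1 " (" $3 ")"; bad = 1 } END { exit bad }'
}

# Usage (assumes kubectl is configured for the cluster):
#   kubectl get pods -n ack-erdma-controller --no-headers | pods_not_running \
#     && echo "all controller pods are running"
```

This inspects only the STATUS column. A pod can be Running while its containers are not yet ready, so check the READY column as well if a pod misbehaves.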
Step 2: Enable eRDMA in pods
After the ACK eRDMA Controller is installed, use the following configurations in your pod spec to enable eRDMA acceleration.
Enable eRDMA device access
Declare the aliyun/erdma resource in the container's resource limits:
spec:
  containers:
  - name: erdma-container
    resources:
      limits:
        aliyun/erdma: 1
After the pod starts, verify that the eRDMA devices are available:
/# ls /dev/infiniband/
rdma_cm uverbs0
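The same check can run as a startup guard inside the container. The sketch below succeeds only when the rdma_cm node and at least one uverbs device node exist. The function name has_erdma_device and its optional directory argument are hypothetical; the default path and the device names come from the listing above.

```shell
# Hypothetical helper: succeeds when the directory (default /dev/infiniband)
# contains rdma_cm and at least one uverbsN entry.
has_erdma_device() {
    dir=${1:-/dev/infiniband}
    [ -e "$dir/rdma_cm" ] || return 1
    for dev in "$dir"/uverbs*; do
        [ -e "$dev" ] && return 0   # the glob matched a real entry
    done
    return 1
}

# Usage inside the pod:
#   has_erdma_device || { echo "no eRDMA device visible" >&2; exit 1; }
```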
Enable transparent TCP acceleration with SMC-R
SMC-R transparently accelerates TCP connections over RDMA, with no application code changes required. After enabling eRDMA device access, add the following annotation to the pod:
metadata:
  annotations:
    network.alibabacloud.com/erdma-smcr: "true"
- Both ends of the TCP connection must have SMC-R enabled.
- Supported only on Alibaba Cloud Linux 3 with kernel version 5.10.134-17 or later. See Alibaba Cloud Linux 3 image release notes.
- Not supported when preferDriver is set to ofed or compat.
- eRDMA and SMC-R do not support IPv6. If the application uses IPv6 addresses, SMC-R falls back to the TCP stack.
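The kernel requirement can be checked on a node before you enable the annotation. This is a minimal sketch that compares version strings with sort -V, so it assumes GNU coreutils or a compatible sort. The function name kernel_at_least is hypothetical, and the check covers only the kernel version, not whether the node runs Alibaba Cloud Linux 3.

```shell
# Hypothetical helper: succeeds when version $1 is greater than or equal to
# version $2 under sort's version ordering.
kernel_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Usage on a node (minimum version taken from the requirement above):
#   kernel_at_least "$(uname -r)" "5.10.134-17" \
#     && echo "kernel meets the SMC-R requirement" \
#     || echo "kernel is older than 5.10.134-17"
```

Version ordering is used rather than plain string comparison because a lexical sort would rank 5.10.84 above 5.10.134.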
Scenarios
Scenario 1: Accelerate NCCL communication for GPU distributed training
Use this scenario for GPU-accelerated instance types running distributed training workloads. All pods use the ofed driver and request both GPU and eRDMA resources.
- Install the ACK eRDMA Controller with preferDriver set to ofed.
- Add GPU nodes to the node pool. See Create and manage node pools.
- Install eRDMA-related packages in your container image at build time. For Debian or Ubuntu (replace {OS|ubuntu} and {Version|focal} with your OS name and version):

      wget -qO - https://mirrors.aliyun.com/erdma/GPGKEY | apt-key add - \
        && echo "deb [ arch=amd64 ] https://mirrors.aliyun.com/erdma/apt/{OS|ubuntu} {Version|focal}/erdma main" \
        | tee /etc/apt/sources.list.d/erdma.list \
        && apt update \
        && apt install -y libibverbs1 ibverbs-providers ibverbs-utils librdmacm1

  For Alibaba Cloud Linux or RHEL:

      cat > /etc/yum.repos.d/erdma.repo <<EOF
      [erdma]
      name = ERDMA Repository
      baseurl = http://mirrors.aliyun.com/erdma/yum/redhat/7/erdma/x86_64/
      gpgcheck = 0
      enabled = 1
      EOF
      yum install --disablerepo=* --enablerepo erdma -y libibverbs ibverbs-providers ibverbs-utils librdmacm

- Deploy the GPU application. The following StatefulSet template runs nccl-test across two replicas, each with 8 GPUs and one eRDMA device:

      apiVersion: apps/v1
      kind: StatefulSet
      metadata:
        name: nccltest
      spec:
        selector:
          matchLabels:
            app: nccltest
        serviceName: "nccltest"
        replicas: 2
        template:
          metadata:
            labels:
              app: nccltest
          spec:
            hostNetwork: true
            dnsPolicy: ClusterFirstWithHostNet
            containers:
            - env:
              - name: NCCL_SOCKET_IFNAME
                value: "eth0"
              - name: NCCL_DEBUG          # verbose NCCL logging, used in the verification step
                value: "INFO"
              - name: NCCL_IB_GID_INDEX
                value: "1"
              image: <nccl-test-image-with-erdma>
              imagePullPolicy: Always
              name: nccltest
              securityContext:
                privileged: true
              resources:
                limits:
                  nvidia.com/gpu: "8"
                  aliyun/erdma: "1"
                requests:
                  nvidia.com/gpu: "8"
                  aliyun/erdma: "1"

- Verify that NCCL is using eRDMA. Check the application logs: the expected output lists the eRDMA devices that NCCL uses for communication. If the log shows devices such as erdma_0 and erdma_1 as active, eRDMA is accelerating the NCCL communication.
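To script the log check, you can extract the eRDMA device names that appear in the NCCL output. The sketch below assumes only that device names follow the erdma_<N> pattern mentioned above and makes no assumption about the rest of the log format. The function name nccl_erdma_devices is hypothetical; nccltest-0 is the name Kubernetes gives the first pod of the StatefulSet above.

```shell
# Hypothetical helper: reads log text on stdin and prints each distinct
# erdma_<N> device name once; fails if no device name is found.
nccl_erdma_devices() {
    grep -oE 'erdma_[0-9]+' | sort -u | grep .
}

# Usage (assumes kubectl is configured for the cluster and that NCCL_DEBUG
# is set to INFO, as in the template above):
#   kubectl logs nccltest-0 | nccl_erdma_devices \
#     || echo "no eRDMA device found in the NCCL log"
```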
Scenario 2: Transparently accelerate application networks with SMC-R
Use this scenario for standard TCP-based workloads where you want eRDMA acceleration without modifying application code. Set preferDriver to default.
- Install the ACK eRDMA Controller with preferDriver set to default.
- Deploy the application with the SMC-R annotation. The following Deployment enables both eRDMA device access and transparent SMC-R acceleration:

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        labels:
          app: app-with-erdma
        name: app-with-erdma
      spec:
        replicas: 2
        selector:
          matchLabels:
            app: app-with-erdma
        template:
          metadata:
            labels:
              app: app-with-erdma
            annotations:
              network.alibabacloud.com/erdma-smcr: "true"
          spec:
            containers:
            - image: <application image>
              imagePullPolicy: Always
              name: app-with-erdma
              resources:
                limits:
                  aliyun/erdma: 1

- Verify the acceleration status. Install smc-tools in the container and run smcss:

      /# smcss
      State          UID   Inode   Local Address           Peer Address            Intf Mode
      ACTIVE         00000 0059964 172.17.192.73:47772     172.17.192.10:80        0000 SMCR

  SMCR in the Mode column confirms the connection is using eRDMA acceleration. If the column shows TCP, check that both the client and server pods have the SMC-R annotation set to "true".
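For pods with many connections, counting rows by mode is quicker than reading the table. The sketch below tallies the Mode value of each connection, assuming the layout shown above: one header row, with Mode as the seventh whitespace-separated field. Rows with fewer than seven fields are skipped. The function name smc_mode_summary is hypothetical.

```shell
# Hypothetical helper: reads smcss output on stdin and prints how many
# connections use each mode (for example SMCR or TCP), sorted by mode name.
smc_mode_summary() {
    awk 'NR > 1 && NF >= 7 { count[$7]++ }
         END { for (m in count) print m, count[m] }' | sort
}

# Usage inside the container (requires smc-tools):
#   smcss | smc_mode_summary
```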
What's next
- ACK eRDMA Controller: component reference and configuration details
- Enable eRDMA: driver type selection guide
- Create and manage node pools: add eRDMA-capable nodes and configure CPU policies