This topic introduces Elastic Remote Direct Memory Access (eRDMA) and describes the benefits, common scenarios, and specifications of eRDMA.
Introduction
What is eRDMA?
eRDMA is an elastic Remote Direct Memory Access (RDMA) network developed by Alibaba Cloud for the cloud. eRDMA reuses virtual private clouds (VPCs) as the underlying link and uses a congestion control (CC) algorithm that is developed by Alibaba Cloud. Compared with traditional RDMA networks, eRDMA features high throughput and low latency and supports RDMA networking on a large scale within seconds. eRDMA is compatible with traditional high-performance computing (HPC) applications and Transmission Control Protocol/Internet Protocol (TCP/IP) applications.
You can deploy HPC applications in the cloud on top of eRDMA to obtain high-performance, highly elastic application clusters at low cost. You can also use the eRDMA network in place of the VPC network to accelerate other applications.
How to implement the capabilities of eRDMA
The capabilities of eRDMA are available only on instance types that support eRDMA. To use them, create eRDMA-capable elastic network interfaces (ENIs) and bind the ENIs to instances.
Elastic RDMA Interfaces (ERIs) are virtual network interfaces that can be bound to ECS instances. An ERI depends on an ENI to enable the RDMA device and reuses the network to which the ENI belongs. This allows you to use the RDMA feature in your existing network and benefit from the low latency of RDMA without modifying service networking.
Benefits
eRDMA provides the following benefits:
High performance
RDMA transfers data from user-mode programs directly to the host channel adapter (HCA) for network transmission, bypassing the kernel stack. This greatly reduces CPU load and latency. eRDMA retains the benefits of traditional RDMA interfaces and brings RDMA technology to VPCs, so you can obtain the ultra-low latency of RDMA in cloud networks.
Inclusiveness
eRDMA is free of charge. To enable eRDMA, you only need to select eRDMA when you purchase an ECS instance.
Large-scale deployment
Traditional RDMA depends on lossless networks, which makes large-scale deployment costly and difficult. eRDMA uses the CC algorithm developed by Alibaba Cloud to tolerate changes in transmission quality in VPCs, such as delays and packet loss. Therefore, eRDMA can maintain good performance in lossy networks.
Scalability
Traditional RDMA interfaces require separate network cards. eRDMA is based on the SHENLONG architecture and provides an RDMA HCA that can be used in the cloud. When you use ECS, you can dynamically add devices, perform hot migration, and deploy eRDMA in a flexible manner.
Shared VPCs
eRDMA depends on ENIs and reuses the networks to which the ENIs belong. This allows you to activate the RDMA feature in your existing networks without modifying service networking.
Common scenarios
TCP/IP is the mainstream network communication protocol, and many applications are built on it. As data center workloads grow, they place higher demands on network performance, such as lower latency and higher throughput. TCP/IP has become a bottleneck for network performance because of its limits, such as high data copy overheads, heavy protocol stack processing, complicated CC algorithms, and frequent context switches.
RDMA addresses these pain points. Features such as zero-copy and kernel bypass eliminate the overheads of data copies and frequent context switches. Compared with TCP/IP, RDMA features low latency, high throughput, and low CPU utilization. However, RDMA has few users because of its high prices and O&M costs.
eRDMA is designed to make RDMA capabilities broadly accessible in the cloud. It meets low-latency requirements and can be used in common scenarios, which improves the user experience in the cloud. Compared with traditional RDMA, eRDMA can be used in more fields, such as cache databases, big data, HPC, and AI training, and brings considerable performance gains in these fields.
Limits
Before you use eRDMA, make sure that the following limits are met:
ECS instance: For information about limits on ECS instances, see Configuring eRDMA on enterprise-level instances.
Basic specifications
This section describes the specifications of eRDMA. When you use eRDMA, make sure that the specification limits are met. Otherwise, your application cannot work as expected.
RDMA QP
Specification | Content | Description |
Maximum QPs (max_qp_num) | Up to 131,071 queue pairs (QPs) are supported. | The maximum number of QPs varies based on the instance type. |
Maximum outstanding WRs to the send queue (max_send_wr) | 8,192 | The maximum number of outstanding work requests (WRs) that can be posted to the send queue. |
Maximum outstanding WRs to the receive queue (max_recv_wr) | 32,768 | The maximum number of outstanding WRs that can be posted to the receive queue. |
Maximum SGEs in a send WR (max_send_sge) | 6 | The maximum number of scatter-gather elements (SGEs) in a send WR. |
Maximum SGEs in a receive WR (max_recv_sge) | 1 | The maximum number of SGEs in a receive WR. |
SRQ | Not supported | None. |
QP type | Reliable connected (RC) | None. |
Connection establishment method | RDMA_CM | None. |
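The QP limits in the table above can be checked before an application creates its queue pairs. The following is a minimal sketch, not part of any eRDMA or verbs API: `QpRequest` and `check_qp_request` are hypothetical names introduced for illustration, and the constants are copied from the table. The maximum number of QPs is left out because it varies based on the instance type.

```python
from dataclasses import dataclass
from typing import List

# Limits copied from the RDMA QP table above.
MAX_SEND_WR = 8192
MAX_RECV_WR = 32768
MAX_SEND_SGE = 6
MAX_RECV_SGE = 1
SUPPORTED_QP_TYPES = {"RC"}


@dataclass
class QpRequest:
    """Hypothetical container for the capacities an application plans to request."""
    qp_type: str = "RC"
    max_send_wr: int = 128
    max_recv_wr: int = 128
    max_send_sge: int = 1
    max_recv_sge: int = 1
    use_srq: bool = False


def check_qp_request(req: QpRequest) -> List[str]:
    """Return a list of violations; an empty list means the request fits the table."""
    problems = []
    if req.qp_type not in SUPPORTED_QP_TYPES:
        problems.append(f"QP type {req.qp_type} is not supported (only RC)")
    if req.use_srq:
        problems.append("SRQ is not supported")
    if req.max_send_wr > MAX_SEND_WR:
        problems.append(f"max_send_wr {req.max_send_wr} exceeds {MAX_SEND_WR}")
    if req.max_recv_wr > MAX_RECV_WR:
        problems.append(f"max_recv_wr {req.max_recv_wr} exceeds {MAX_RECV_WR}")
    if req.max_send_sge > MAX_SEND_SGE:
        problems.append(f"max_send_sge {req.max_send_sge} exceeds {MAX_SEND_SGE}")
    if req.max_recv_sge > MAX_RECV_SGE:
        problems.append(f"max_recv_sge {req.max_recv_sge} exceeds {MAX_RECV_SGE}")
    return problems


if __name__ == "__main__":
    # A request that asks for two receive SGEs and an SRQ is flagged twice.
    for problem in check_qp_request(QpRequest(max_recv_sge=2, use_srq=True)):
        print(problem)
```

A check like this only catches mismatches with the documented values. On a running instance, the attributes reported by the RDMA device remain the authority, because some limits depend on the instance type.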
RDMA CQ
Specification | Content | Description |
CQs | The number of completion queues (CQs) varies based on the instance type. The maximum number of CQs is twice the number of QPs. | None. |
Vectors in a CQ (vector_num) | The number of vectors in a CQ varies based on the instance type. The maximum number of vectors in a CQ is 31. The number of CPUs is related to the number of QPs. | None. |
Maximum CEQ depth | 256 | None. |
Maximum CQ depth | 1,048,576 | None. |
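The CQ limits in the table above can be validated in the same way before completion queues are created. The sketch below is illustrative only: `check_cq_plan` and its parameters are hypothetical names, `qp_limit` stands for the QP limit of your instance type, and the sketch assumes that completion vectors are indexed from 0 to `vector_num - 1`, as is usual for verbs completion vectors.

```python
from typing import List

# Limits copied from the RDMA CQ table above.
MAX_CQ_DEPTH = 1_048_576
MAX_CQ_VECTORS = 31


def check_cq_plan(cq_count: int, cq_depth: int, comp_vector: int,
                  qp_limit: int, vector_num: int = MAX_CQ_VECTORS) -> List[str]:
    """Return a list of violations; an empty list means the plan fits the table.

    qp_limit is the QP limit of the instance type (an input, not a constant,
    because it varies based on the instance type).
    """
    problems = []
    if cq_count > 2 * qp_limit:
        problems.append(f"{cq_count} CQs exceed twice the QP limit ({2 * qp_limit})")
    if not 1 <= cq_depth <= MAX_CQ_DEPTH:
        problems.append(f"CQ depth {cq_depth} is outside 1..{MAX_CQ_DEPTH}")
    if vector_num > MAX_CQ_VECTORS:
        problems.append(f"vector_num {vector_num} exceeds {MAX_CQ_VECTORS}")
    # Assumption: vectors are indexed 0 .. vector_num - 1.
    if not 0 <= comp_vector < vector_num:
        problems.append(f"completion vector {comp_vector} is outside 0..{vector_num - 1}")
    return problems


if __name__ == "__main__":
    print(check_cq_plan(cq_count=8, cq_depth=4096, comp_vector=0, qp_limit=4))
```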
RDMA MR
Specification | Content |
MRs | The number of memory regions (MRs) varies based on the instance type. The maximum number of MRs is twice the number of QPs. |
MWs | Not supported |
Max MR size | The maximum MR size varies based on the underlying hardware and ranges from 2 GB to 64 GB. |
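A buffer that is larger than the maximum MR size cannot be registered as a single memory region and has to be split across several MRs. The following sketch shows one way to plan that split. It is an illustration under stated assumptions: `plan_mr_chunks` is a hypothetical helper, the maximum MR size is passed in because it depends on the underlying hardware, and GB is treated as 2^30 bytes.

```python
from typing import List, Tuple

GIB = 1024 ** 3  # Assumption: the "GB" values in the table are binary gigabytes.


def plan_mr_chunks(buffer_len: int, max_mr_size: int) -> List[Tuple[int, int]]:
    """Split a buffer into (offset, length) chunks that each fit in one MR."""
    if buffer_len <= 0 or max_mr_size <= 0:
        raise ValueError("buffer_len and max_mr_size must be positive")
    chunks = []
    offset = 0
    while offset < buffer_len:
        length = min(max_mr_size, buffer_len - offset)
        chunks.append((offset, length))
        offset += length
    return chunks


if __name__ == "__main__":
    # A 5 GiB buffer on hardware with a 2 GiB MR size limit needs three MRs.
    for offset, length in plan_mr_chunks(5 * GIB, 2 * GIB):
        print(offset, length)
```

Each chunk consumes one MR, so the number of chunks also counts against the MR limit of the instance type.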
Supported RDMA Verbs opcodes
Opcode | Supported |
RDMA Write | Supported |
RDMA Write with Immediate | Supported |
RDMA Read | Supported |
Send | Supported |
Send with Invalidate | Supported |
Send with Immediate | Supported |
Send with Solicited Event | Supported |
Local Invalidate | Supported only in kernel-mode Verbs. |
Atomic Operation | Not supported |
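An application that is ported from another RDMA environment can consult the opcode table above before it chooses a code path, for example to fall back when atomic operations are unavailable. The following is a minimal sketch: `Support`, `OPCODE_SUPPORT`, and `is_usable` are hypothetical names, the entries mirror the table, and treating an unlisted opcode as unsupported is a conservative assumption.

```python
from enum import Enum


class Support(Enum):
    SUPPORTED = "supported"
    KERNEL_ONLY = "kernel-mode only"
    NOT_SUPPORTED = "not supported"


# Entries mirror the opcode table above.
OPCODE_SUPPORT = {
    "RDMA Write": Support.SUPPORTED,
    "RDMA Write with Immediate": Support.SUPPORTED,
    "RDMA Read": Support.SUPPORTED,
    "Send": Support.SUPPORTED,
    "Send with Invalidate": Support.SUPPORTED,
    "Send with Immediate": Support.SUPPORTED,
    "Send with Solicited Event": Support.SUPPORTED,
    "Local Invalidate": Support.KERNEL_ONLY,
    "Atomic Operation": Support.NOT_SUPPORTED,
}


def is_usable(opcode: str, kernel_mode: bool = False) -> bool:
    """Report whether an opcode can be used from user mode or kernel mode."""
    # Assumption: opcodes that the table does not list are treated as unsupported.
    support = OPCODE_SUPPORT.get(opcode, Support.NOT_SUPPORTED)
    if support is Support.SUPPORTED:
        return True
    if support is Support.KERNEL_ONLY:
        return kernel_mode
    return False


if __name__ == "__main__":
    for name in OPCODE_SUPPORT:
        print(name, is_usable(name))
```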