Elastic Compute Service: Configure eRDMA on an enterprise-level instance

Last Updated: Mar 25, 2024

You can configure elastic Remote Direct Memory Access (eRDMA) on eRDMA-capable, enterprise-level Elastic Compute Service (ECS) instances. eRDMA provides a low-latency, high-throughput, high-performance, and highly scalable RDMA network service that improves network performance without changes to your existing network architecture. This topic describes how to configure eRDMA on an enterprise-level ECS instance.

Familiarize yourself with the following terms: eRDMA interface (ERI), elastic network interface (ENI), and eRDMA.

  • An ERI is an ENI for which the ERI feature is enabled. ERIs are virtual network interfaces that can be bound to ECS instances. For more information, see the Introduction section in the "Overview" topic.

  • ENIs are virtual network interfaces in virtual private clouds (VPCs) that are used to connect ECS instances to VPCs. For more information, see Overview of ENIs.

  • eRDMA is a low-latency, high-throughput, high-performance, and highly scalable RDMA network service provided by Alibaba Cloud. For more information, see Overview of eRDMA.

Limits

The following table describes the limits on eRDMA based on regions, instance families, images, and hot swapping.

Item

Description

Region

eRDMA is supported in the following regions: China (Beijing), China (Shanghai), China (Hangzhou), China (Shenzhen), China (Guangzhou), China (Ulanqab), and China (Heyuan).

Instance family

The following instance families support eRDMA:

Image

  • Alibaba Cloud Linux 3 (recommended)

  • Alibaba Cloud Linux 2 for x86

  • CentOS 7.9 for x86

  • Ubuntu 18.04, 20.04, and 22.04

  • Anolis OS 8.4 ANCK for Arm and Anolis OS 8.6 ANCK for Arm

Note

The available images vary based on the instance type. When you select an instance type that supports eRDMA, the instance buy page displays the images that you can select.

Number of eRDMA devices

Each ECS instance supports only one ERI.

Hot swapping

ERIs can be hot-swapped in but cannot be hot-swapped out.

Network

  • ERIs do not support IPv6 addresses.

  • When two ECS instances communicate over eRDMA, the communication path cannot span across network elements, such as Server Load Balancer (SLB) instances.

Procedure

You can enable eRDMA on an eRDMA-capable instance only if the instance meets the following conditions: The eRDMA driver is installed on the instance, and an ERI is bound to the instance.

Configure eRDMA when you create an instance

Important

When you create an eRDMA-capable instance that runs Alibaba Cloud Linux, Ubuntu, or Anolis OS, you can enable eRDMA by selecting the Auto-install eRDMA Driver option to automatically install the eRDMA driver and enabling the ERI feature for the primary ENI. If you cannot select the Auto-install eRDMA Driver option for the operating system version that you select or the eRDMA driver fails to be automatically installed, you can install the driver manually or by using a script after the instance is created. For more information, see the Configure eRDMA on an existing instance section in this topic.

  1. Go to the ECS instance buy page.

  2. Create an ECS instance that supports ERIs. When you create the ECS instance, take note of the following parameters or options. For information about other parameters on the ECS instance buy page, see Create an instance on the Custom Launch tab.

    • Instance and Image: Select an instance type that supports eRDMA and an image. For more information, see the Limits section in this topic.

    • Auto-install eRDMA Driver: Select this option. Then, the eRDMA driver is automatically installed during ECS instance creation.

    • ENI: Select eRDMA Interface to the right of the primary ENI.

      Important
      • ERIs do not support IPv6 addresses.

      • When you create an ECS instance, you can enable the ERI feature only for the primary ENI. You can bind only one ERI to each ECS instance. If you want to use a secondary ENI to configure eRDMA on an ECS instance, create a secondary ENI for which the ERI feature is enabled and bind the ENI to the ECS instance after the ECS instance is created. For more information, see Create an ENI and Bind an ENI.

Configure eRDMA on an existing instance

  1. Log on to the ECS console. Find the ECS instance that you want to manage and click the instance ID to go to the Instance Details page. Click the Elastic Network Interface tab and check whether an ERI is bound to the instance.

    • If an ERI is bound to the instance, skip this step.

    • If no ERI is bound to the instance, create a secondary ENI for which the ERI feature is enabled and bind the ENI to the instance.

      Note

      You can enable the ERI feature for a secondary ENI only when you create the ENI separately. You cannot enable the ERI feature for a secondary ENI when you create an ECS instance or after the ENI is created.

      1. Create a secondary ENI. For more information, see Create an ENI.

        • VPC and vSwitch: Select the VPC in which the ECS instance is deployed and the vSwitch to which the ECS instance is connected.

        • Elastic RDMA Interface: Turn on this switch.

      2. Bind the secondary ENI to the ECS instance. For more information, see Bind an ENI.

        Note
        • Before you bind the secondary ENI to the ECS instance, make sure that the primary ENI of the instance and the secondary ENI are not connected to the same vSwitch. Otherwise, the RDMA feature of the secondary ENI may be unavailable in some cases due to the default route.

        • If you want to unbind a secondary ENI for which the ERI feature is enabled from an ECS instance, stop the instance before you unbind the ENI. For more information, see Stop an instance.

      3. Run the ifconfig command to view the secondary ENI. If information about the secondary ENI is not displayed in the command output, configure the secondary ENI. For more information, see Configure a secondary ENI. If information about the secondary ENI is displayed in the command output, skip this step.

        Note

        After secondary ENIs are bound to ECS instances, some images used by the ECS instances cannot recognize the new secondary ENIs.
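
The check in the preceding step can also be done without ifconfig, which is not installed on every image. The following is a minimal sketch that tests whether an interface is visible to the kernel by looking it up in /sys/class/net. The function name iface_present and the interface name eth1 are assumptions for illustration. The actual name of the secondary ENI depends on the image.

```shell
#!/bin/sh
# Report whether a network interface is visible to the kernel.
# $1: interface name (for example eth1; the actual name is an assumption)
# $2: directory to inspect (defaults to /sys/class/net on Linux)
iface_present() {
    name="$1"
    net_dir="${2:-/sys/class/net}"
    if [ -e "$net_dir/$name" ]; then
        echo "$name: present"
        return 0
    fi
    echo "$name: not found"
    return 1
}

# Example:
#   iface_present eth1 || echo "Configure the secondary ENI first."
```

If the function reports that the interface is not found, configure the secondary ENI as described in the preceding step.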

  2. Install the eRDMA driver.

    Install the eRDMA driver manually or by using a script based on the actual scenario.

    • If you use a script to install the eRDMA driver, the installation package for the latest stable eRDMA driver version is automatically downloaded.

    • If you want to manually install the eRDMA driver, you can download the package for a specific eRDMA driver version.

    Use a script to install the eRDMA driver

    Run the following commands to execute a script to install the eRDMA driver:

    curl -O http://mirrors.cloud.aliyuncs.com/erdma/env_setup.sh
    sudo sh -c '/bin/bash env_setup.sh > /var/log/erdma_install.log 2>&1'

    The script automatically installs the dependency packages that are required by the eRDMA driver, downloads the eRDMA driver package, and installs the eRDMA driver. Wait for the script execution to complete.

    Note

    If the eRDMA driver fails to be installed by using the script, check the installation logs in the following path: /var/log/erdma_install.log.
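
To make a failed installation easier to notice, the script can be run through a small wrapper that writes all output to a log file and prints the end of the log when the exit status is non-zero. The following is a sketch only. The function name run_with_log is hypothetical, and the sketch assumes that the installation script returns a non-zero exit status on failure, which this topic does not state.

```shell
#!/bin/sh
# Run a command, write its stdout and stderr to a log file, and print the
# last lines of the log if the command exits with a non-zero status.
# $1: log file path; remaining arguments: the command to run
run_with_log() {
    log_file="$1"
    shift
    status=0
    "$@" > "$log_file" 2>&1 || status=$?
    if [ "$status" -ne 0 ]; then
        echo "FAILED (exit $status); last lines of $log_file:" >&2
        tail -n 20 "$log_file" >&2
    fi
    return "$status"
}

# Example (run as root so that /var/log is writable):
#   run_with_log /var/log/erdma_install.log /bin/bash env_setup.sh
```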

    Manually install the eRDMA driver

    1. Update the installed packages.

      • For Alibaba Cloud Linux 3, CentOS, and Anolis OS, run the following command:

        sudo yum update -y
      • For Ubuntu, skip this step.

    2. Run the following commands in sequence to query the latest kernel package version and operating system kernel version:

      rpm -qa | grep kernel  # List the installed kernel packages and their versions.
      uname -r  # Query the version of the running operating system kernel.

      If the latest kernel package version is the same as the operating system kernel version, as in the following figure, you do not need to perform additional operations. If the versions are different, restart the ECS instance so that it boots into the latest installed kernel.

      [Figure: command outputs in which the kernel package version matches the operating system kernel version]
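
The comparison in this step can be sketched as a script. The helper names newest_kernel and reboot_needed are hypothetical. The sketch assumes that sort -V (GNU coreutils version sort) is available and, for the RPM example in the comments, that the kernel package is named kernel.

```shell
#!/bin/sh
# Print the newest version from a list of kernel versions read from stdin
# (one version per line). Relies on `sort -V` from GNU coreutils.
newest_kernel() {
    sort -V | tail -n 1
}

# Compare the running kernel with the newest installed kernel.
# $1: running kernel version (output of `uname -r`)
# $2: newest installed kernel version
# Returns 0 if a restart is needed, 1 if the versions already match.
reboot_needed() {
    if [ "$1" = "$2" ]; then
        echo "kernel up to date: $1"
        return 1
    fi
    echo "reboot needed: running $1, newest installed $2"
    return 0
}

# Example on an RPM-based system (package name "kernel" is an assumption):
#   newest=$(rpm -q kernel --qf '%{VERSION}-%{RELEASE}.%{ARCH}\n' | newest_kernel)
#   reboot_needed "$(uname -r)" "$newest"
```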

    3. Install dependency packages.

      • If the ECS instance is an x86 instance, run one of the following commands based on the instance operating system.

        • For Alibaba Cloud Linux 3, CentOS, and Anolis OS, run the following command:

          sudo yum install gcc-c++ dkms cmake kernel-devel kernel-headers libnl3 libnl3-devel
        • For Ubuntu, run the following command:

          sudo apt-get install dkms cmake libnl-3-dev libnl-route-3-dev linux-headers-$(uname -r)
      • If the ECS instance is an Arm instance, the driver is built from source code. The build requires a large number of dependencies, and the list is subject to change. You can skip this step and run the installation script. If the installation script cannot install a dependency package, it prompts you to install the package. Install the dependency packages as prompted and then re-install the eRDMA driver.
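
When the same steps run on several operating systems, a script can pick the package manager from the ID field of /etc/os-release. The following is a minimal sketch. The function names and the ID strings alinux, centos, anolis, and ubuntu are assumptions, so check /etc/os-release on the instance before you rely on them.

```shell
#!/bin/sh
# Map an os-release ID to the package manager family used in this topic.
# The ID strings below are assumptions; verify them on the instance.
pkg_manager_for() {
    case "$1" in
        alinux|centos|anolis) echo "yum" ;;
        ubuntu)               echo "apt-get" ;;
        *)                    echo "unknown"; return 1 ;;
    esac
}

# Read the ID field from an os-release file (defaults to /etc/os-release).
os_id() {
    file="${1:-/etc/os-release}"
    # Source the file in a subshell so its variables do not leak to the caller.
    ( . "$file" && echo "$ID" )
}

# Example:
#   pkg_manager_for "$(os_id)"
```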

    4. Download the eRDMA driver installation package.

      • Run the following command to download the eRDMA driver installation package from an internal URL:

        wget http://mirrors.cloud.aliyuncs.com/erdma/erdma_installer-latest.tar.gz
      • Run the following command to download the eRDMA driver installation package from a public URL:

        wget https://mirrors.aliyun.com/erdma/erdma_installer-latest.tar.gz

      In this example, the installation package for the latest eRDMA driver version is downloaded. You can download the installation package for a specific eRDMA driver version based on your business scenarios. The following table describes the release notes for eRDMA driver versions.

      Release notes for eRDMA driver versions (ordered by release date, from the latest to the earliest)

      Version 1.3.3 (release date: 2023-10-09)
      • Download URL: http://mirrors.cloud.aliyuncs.com/erdma/erdma_installer-1.3.3.tar.gz
      • MD5: 51ffb06266255139554275bc86fa4caa
      • SHA256: 5aad6d006662bd902ef5e913fb97d2a6623aadeeacd06f1c3f1c74cbd1f57ded
      • Description: This version is updated to include the latest patches.

      Version 1.3.2 (release date: 2023-09-08)
      • Download URL: http://mirrors.cloud.aliyuncs.com/erdma/erdma_installer-1.3.2.tar.gz
      • MD5: 8492016fc96eece6a60687b0e4ea66dd
      • SHA256: 89ab265dc9fa8d56f1b2d8b13d7f50032390a265eddb2e04eeee3aa86fd169ce
      • Description: This version is updated to include the latest patches.

      Version 1.3.1 (release date: 2023-08-18)
      • Download URL: http://mirrors.cloud.aliyuncs.com/erdma/erdma_installer-1.3.1.tar.gz
      • MD5: b9b90212e6ba49d57b81d3c5d4210deb
      • SHA256: 4ebe31760443613f8f61fcdbef7a85b277dabc59039d048898536ea4fe5d8d4a
      • Description: The underlying transmission mode on the driver side can be set to strong ordering. In strong ordering mode, data packets are transmitted to memory only in sequence.

      Version 1.3.0 (release date: 2023-06-26)
      • Download URL: http://mirrors.cloud.aliyuncs.com/erdma/erdma_installer-1.3.0.tar.gz
      • MD5: 2da0c65643b5e2ffb61d75e1b5e5a7ab
      • SHA256: cce03aac0e07d0890884c35ad4f10e9d15f587535d788c8fc97ea268312ad4a9
      • Description:
        • Multi-level page tables are supported during memory region (MR) registration.
        • The IPv6 feature is supported, and IPv6 support from underlying hardware is required.
        • Ubuntu 22.04 is supported.
        • This version is updated to include the latest patches.

      Version 1.2.3 (release date: 2023-05-30)
      • Download URL: http://mirrors.cloud.aliyuncs.com/erdma/erdma_installer-1.2.3.tar.gz
      • MD5: 7496a6324f3872469d7194c2e234b19f
      • SHA256: 16c2de0d90da6906db91c2e2469aaad9e24131c44ce52b9464036f1c3747f8a2
      • Description: This version is updated to include the latest patches.

      Version 1.2.2 (release date: 2023-05-04)
      • Download URL: http://mirrors.cloud.aliyuncs.com/erdma/erdma_installer-1.2.2.tar.gz
      • MD5: f449d3961a41ff6a97a53cfa29e20d6c
      • SHA256: 11fdb4b3c778762ad0bdf2d0327008aa2ecb22dc508c9f9fae3568b41ae5462b
      • Description: Ubuntu 22.04 is supported.

      Version 1.2.1 (release date: 2023-04-04)
      • Download URL: http://mirrors.cloud.aliyuncs.com/erdma/erdma_installer-1.2.1.tar.gz
      • MD5: e080103934da76ce83924da789aecece
      • SHA256: be3a89e57143d7544cf968052250df92f911aebb035f07b06ebeb8c5f13bf976
      • Description: This version is updated to include the latest patches.

      Version 1.2.0 (release date: 2023-03-09)
      • Download URL: http://mirrors.cloud.aliyuncs.com/erdma/erdma_installer-1.2.0.tar.gz
      • MD5: c8d440a6e35ec6d2aaf1a568affea876
      • SHA256: d484997e28e29f862dc580c112b55b389a00faf88dc6aa89eea588ee1369a8ca
      • Description:
        • The compatible mode is supported.
        • This version is updated to include the latest patches.

      Version 1.1.0 (release date: 2023-01-16)
      • Download URL: http://mirrors.cloud.aliyuncs.com/erdma/erdma_installer-1.1.0.tar.gz
      • MD5: 1fea69d819919a77384f902213eb681e
      • SHA256: 176c3bb35d5584e8c8e43eba9b1824b8cb2b43a19d802c4e469363ed8e33fea6
      • Description: This version is updated to include the latest patches.
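
The published checksums can be used to verify a downloaded package before installation. The following is a minimal sketch that assumes sha256sum (GNU coreutils) is installed. The function name verify_sha256 is hypothetical. The checksum in the example comment is the SHA256 value listed for version 1.3.3.

```shell
#!/bin/sh
# Verify a file against an expected SHA256 checksum.
# $1: file path; $2: expected checksum (lowercase hexadecimal)
verify_sha256() {
    if [ ! -f "$1" ]; then
        echo "file not found: $1" >&2
        return 2
    fi
    # sha256sum prints "<checksum>  <file>"; keep only the checksum.
    actual=$(sha256sum "$1" | awk '{print $1}')
    if [ "$actual" = "$2" ]; then
        echo "checksum OK: $1"
        return 0
    fi
    echo "checksum MISMATCH: $1" >&2
    return 1
}

# Example:
#   verify_sha256 erdma_installer-1.3.3.tar.gz \
#       5aad6d006662bd902ef5e913fb97d2a6623aadeeacd06f1c3f1c74cbd1f57ded
```

If the function reports a mismatch, download the package again before you continue.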

    5. Run the following command to decompress the installation package and then go to the directory to which the installation package is decompressed:

      tar -xvf erdma_installer-latest.tar.gz && cd erdma_installer
    6. Install the eRDMA driver.

      • Method 1: Run the following command to install the eRDMA driver interactively. During the installation, confirm the uninstallation steps and the automatic installation steps when prompted.

        sudo sh install.sh
      • Method 2: Run the following command to automatically install the eRDMA driver:

        sudo sh install.sh --batch

      View the command output to check whether the driver is installed.

      The following command output indicates that the driver is installed.

      [Figure: command output of a successful installation]

      The following command output indicates that the driver failed to be installed. Perform operations as prompted and then re-install the driver.

      [Figure: command output of a failed installation]

      Note

      If the ECS instance runs CentOS 7 and an error reports missing packages when you re-install the driver, the default yum repositories may not provide those packages. In this case, you may need to run the yum install -y epel-release command to install the Extra Packages for Enterprise Linux (EPEL) repository before you can obtain the packages.
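
After the installation, you can additionally check whether the driver module is loaded. The following is a sketch under one stated assumption: the kernel module is named erdma, as in the upstream Linux kernel driver. The module name that the installer uses is not stated in this topic, and the function name module_loaded is hypothetical.

```shell
#!/bin/sh
# Check whether a kernel module appears in a /proc/modules-style listing.
# $1: module name (erdma is an assumption; the installer may use another name)
# $2: listing to read (defaults to /proc/modules)
# Returns 0 if the module is listed, 1 otherwise.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Example:
#   module_loaded erdma && echo "erdma module is loaded"
```

If the rdma-core utilities are installed, the ibv_devinfo command can also be used to list the RDMA devices that are visible to user space.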

Test the eRDMA performance

Note

The Perftest tool is a benchmark tool that you can use to test the basic performance of eRDMA. For more information, see Perftest documentation.

  1. Install the Perftest tool on a server and a client. Use one of the following methods to install the Perftest tool:

    • Method 1: Download the Perftest tool from the official perftest repository and install the tool. When you use this method to install the tool on an ECS instance, make sure that the instance can access the Internet.

    • Method 2: Use the YUM or APT repository of the operating system to install the Perftest tool. Run one of the following commands based on the instance operating system.

      • Alibaba Cloud Linux 3, CentOS, and Anolis OS:

        sudo yum install perftest -y
      • Ubuntu:

        sudo apt install perftest -y
      Note

      Different versions of the Perftest tool are included in the repositories of different Linux distributions. Incompatibility may occur. When you use this method to install the Perftest tool, we recommend that you use the same Linux distribution on the communicating ECS instances. Otherwise, use Method 1.

  2. Test the eRDMA latency.

    1. Run the following command on the server:

      ib_write_lat -R -a -F
    2. Run the following command on the client:

      ib_write_lat -R -a -F <server_ip>

      <server_ip> specifies the private IP address of the ERI on the ECS instance that is used as the server. For information about how to obtain an IP address, see View IP addresses.

    The following command output is returned. The output includes performance metrics, such as the average latency, maximum latency, and minimum latency, and indicates that eRDMA works as expected.

    [Figure: test results]
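
The two commands differ only in the trailing server address, so a test script can build both from one helper. The following is a minimal sketch. The function name lat_cmd and the IP address in the example are hypothetical. The option descriptions in the comments follow the Perftest help text.

```shell
#!/bin/sh
# Print the ib_write_lat command line for the given role.
# $1: "server" or "client"; $2: private IP address of the server ERI (client only)
# Options used in this topic:
#   -R  connect queue pairs by using rdma_cm
#   -a  run all message sizes
#   -F  do not warn about CPU frequency scaling
lat_cmd() {
    case "${1:-}" in
        server)
            echo "ib_write_lat -R -a -F"
            ;;
        client)
            if [ -z "${2:-}" ]; then
                echo "client role needs a server IP address" >&2
                return 1
            fi
            echo "ib_write_lat -R -a -F $2"
            ;;
        *)
            echo "usage: lat_cmd server|client [server_ip]" >&2
            return 1
            ;;
    esac
}

# Example (192.168.0.10 is a placeholder address):
#   lat_cmd server
#   lat_cmd client 192.168.0.10
```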

References

  • In scenarios in which large-scale data transfers and high-performance network communications are required in containers, you can use eRDMA in Docker environments to allow container applications to bypass the kernel and directly access physical eRDMA devices on hosts. This helps improve data transfer speeds and communication efficiency. For more information, see Configure eRDMA in Docker.

  • You can monitor and check the real-time working status of eRDMA. For more information, see Monitor and check eRDMA.