Achieving 57% Performance Increase for TCP with SMC-R

How does SMC-R help achieve a 57% performance increase for TCP applications? Read this blog to find out!

By SIG for High Performance Network

Transmission Control Protocol (TCP) is the most widely used network protocol and is common in mobile communication, data centers, and other scenarios. In data centers, Shared Memory Communications over RDMA (SMC-R), a high-performance network protocol, runs on elastic RDMA and transparently replaces TCP underneath applications, accelerating their network performance without application changes.

In this blog, we'll discuss the origin of SMC-R and its significance in enhancing the performance of TCP applications. If you'd like to learn more about SMC-R, you can check out the following blogs: "Transparently Improve TCP Application Network Performance on the Cloud" and "SMC-R: A Hybrid Solution of TCP and RDMA".

Why Do We Need a New Kernel Network Protocol Stack?

There is no silver bullet in the Linux kernel network protocol stack. The current implementation is a tradeoff among performance (throughput and CPU usage), latency, and generality.

In real-world scenarios, we may need either a high-performance but less general user-mode protocol stack, or a general solution with higher performance and lower latency. However, solutions based on traditional Ethernet network interface controllers are hard to improve significantly in software; most of their gains come from hardware, such as 100G/400G networks. We therefore asked whether TCP-compatible behavior and socket interfaces could be provided on top of other high-performance networks to deliver better performance.

Network Communication Based on Shared Memory

Before talking about inter-host communication, let's look at a single host. How can Inter-Process Communication (IPC) be implemented within one host? There are several common IPC methods, such as pipes, sockets, and shared memory.

Shared memory is the fastest of these IPC methods, but it lacks a unified OS-level implementation and interface. In most cases, it is provided by a library of a specific programming language.

Here, we break down the shared memory IPC process within a single host:

  1. The sender writes data to a pre-allocated memory area.
  2. The sender notifies the receiver and updates the offset of the newly written memory.
  3. The receiver reads the data according to the newly updated offset.
  4. The receiver updates the offset of the memory it has read.
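
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain `bytearray` stands in for the pre-allocated shared memory area, and the class name `SharedRing` and its fields are hypothetical names chosen for this example, not part of any real IPC library.

```python
class SharedRing:
    """Toy ring buffer: one sender, one receiver, two offsets."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.buf = bytearray(size)  # stands in for the pre-allocated shared area
        self.write_off = 0          # total bytes written, published by the sender
        self.read_off = 0           # total bytes read, published by the receiver

    def send(self, data: bytes) -> int:
        """Steps 1 and 2: write the data, then publish the new write offset."""
        free = self.size - (self.write_off - self.read_off)
        n = min(len(data), free)    # never overwrite data that is still unread
        for i in range(n):
            self.buf[(self.write_off + i) % self.size] = data[i]
        self.write_off += n         # this update is what "notifies" the receiver
        return n

    def recv(self) -> bytes:
        """Steps 3 and 4: read up to the write offset, then publish the read offset."""
        n = self.write_off - self.read_off
        out = bytes(self.buf[(self.read_off + i) % self.size] for i in range(n))
        self.read_off += n          # frees the space for the sender to reuse
        return out


ring = SharedRing(8)
ring.send(b"hello")
print(ring.recv())
```

The sender only ever advances `write_off` and the receiver only ever advances `read_off`, which is why the two sides can cooperate through nothing more than the shared area and the two offsets.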

If a technology can "move" memory between two machines, we can extend this high-performance IPC solution from a single host to different hosts. Remote Direct Memory Access (RDMA) does exactly that: it moves memory between hosts efficiently.

Compared with shared memory communication on a single machine, the RDMA-based process is as follows:

  1. The sender writes data to a pre-allocated memory area on its own machine.
  2. The sender writes that memory, through RDMA, to the same location in the memory area maintained by the receiver.
  3. The sender notifies the receiver through RDMA and updates the offset of the newly written memory.
  4. The receiver reads the data according to the newly updated offset.
  5. The receiver updates, through RDMA, the offset of the memory it has read.
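
The RDMA-based flow can be simulated in the same spirit. The sketch below does not use any real RDMA library: `rdma_write` is a hypothetical stand-in that simply copies bytes into the peer's buffer, and `Sender`, `Receiver`, and their fields are names assumed for this example. Cursor updates, which a real implementation would also carry over RDMA, are modeled as plain assignments.

```python
class Sender:
    def __init__(self, size: int) -> None:
        self.buf = bytearray(size)   # local pre-allocated area (step 1)
        self.written = 0             # total bytes produced so far
        self.consumed_by_peer = 0    # updated by the receiver (step 5)


class Receiver:
    def __init__(self, size: int) -> None:
        self.buf = bytearray(size)   # area the sender targets through RDMA
        self.written_by_peer = 0     # updated by the sender (step 3)
        self.consumed = 0            # total bytes read so far


def rdma_write(remote: bytearray, offset: int, data: bytes) -> None:
    """Simulated one-sided write: copy into remote memory at the same offset."""
    remote[offset:offset + len(data)] = data


def pieces(start: int, length: int, size: int):
    """Split a range that may wrap around the buffer into contiguous parts."""
    pos = start % size
    first = min(length, size - pos)
    parts = [(pos, first)]
    if length > first:
        parts.append((0, length - first))
    return parts


def send(tx: Sender, rx: Receiver, data: bytes) -> None:
    size = len(tx.buf)
    if len(data) > size - (tx.written - tx.consumed_by_peer):
        raise BufferError("not enough free space in the receiver's area")
    done = 0
    for off, n in pieces(tx.written, len(data), size):
        tx.buf[off:off + n] = data[done:done + n]      # step 1: local write
        rdma_write(rx.buf, off, tx.buf[off:off + n])   # step 2: same location, remote
        done += n
    tx.written += len(data)
    rx.written_by_peer = tx.written                    # step 3: notify with new offset


def recv(tx: Sender, rx: Receiver) -> bytes:
    size = len(rx.buf)
    n = rx.written_by_peer - rx.consumed
    out = b"".join(bytes(rx.buf[off:off + k])
                   for off, k in pieces(rx.consumed, n, size))  # step 4
    rx.consumed += n
    tx.consumed_by_peer = rx.consumed                  # step 5: report read offset
    return out


tx, rx = Sender(8), Receiver(8)
send(tx, rx, b"abcdef")
print(recv(tx, rx))
```

Compared with the single-host case, the only structural change is that each side keeps its own copy of the buffer and of the peer's cursor, and every update that crosses the host boundary goes through the (here simulated) RDMA write.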

SMC-R, short for Shared Memory Communications over RDMA, grew out of this RDMA-based shared memory model.

Let's take a look at how SMC-R accelerates TCP applications.

SMC-R is a hybrid protocol. It uses TCP to exchange information during connection establishment and uses the RDMA network for high-performance transmission on the data path. If the RDMA connection cannot be established, SMC-R falls back to TCP, so basic communication over TCP is always available. In addition, with multiple RNICs, SMC-R can fail over between links at runtime to keep connections reliable.
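
The fallback behavior can be pictured as a small decision function. This is a deliberately simplified model, not the kernel's logic: `establish`, `DataPath`, and the `rdma_setup` callable are hypothetical names for this example, and the real negotiation carried over the TCP connection is reduced to a single boolean.

```python
from enum import Enum
from typing import Callable


class DataPath(Enum):
    SMC_R = "smc-r"
    TCP = "tcp"


def establish(peer_supports_smc: bool, rdma_setup: Callable[[], None]) -> DataPath:
    """Pick the data path after the TCP connection has been set up.

    rdma_setup is a hypothetical callable that raises OSError when the
    RDMA resources for this connection cannot be created.
    """
    if not peer_supports_smc:
        return DataPath.TCP      # peer never offered SMC: stay on TCP
    try:
        rdma_setup()
    except OSError:
        return DataPath.TCP      # RDMA failed: fall back, connection survives
    return DataPath.SMC_R        # data now flows over RDMA


def broken_rnic() -> None:
    raise OSError("no usable RDMA device")


print(establish(True, lambda: None))
print(establish(True, broken_rnic))
```

The point of the model is that every branch returns a usable data path, so the application always ends up with a working connection regardless of whether RDMA is available.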

RDMA exposes the verbs interface to applications, which is very different from the socket API. Building on the shared memory model, SMC-R instead offers a set of kernel interfaces that are fully compatible with TCP sockets. TCP sockets can then be transparently replaced with SMC sockets, for example through LD_PRELOAD or replacement rules based on eBPF, so that applications are accelerated without modification.

Based on SMC-R transparent replacement, we tested several application scenarios. Among them, Redis performance improved by up to 57%, and Redis needed no modification to benefit from SMC-R acceleration.

Use SMC-R to Accelerate Applications

To realize transparent replacement and acceleration of TCP applications, you can use the following three solutions:

  1. LD_PRELOAD: in dynamically linked binaries, socket() calls that create TCP (SOCK_STREAM) sockets are intercepted and redirected to the AF_SMC protocol family, so that TCP is transparently replaced with the SMC protocol.
  2. A net namespace-level sysctl: all TCP connections are replaced at the granularity of a network namespace, such as a container.
  3. eBPF rules: conditions such as 5-tuples and process IDs are used to dynamically match the connections to be replaced.
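
To make the first solution more concrete, the sketch below shows the kind of argument rewriting a preload shim might apply to `socket()` calls, plus an application-level helper that opens an SMC socket explicitly. It rests on assumptions that should be checked against your kernel headers: `AF_SMC` is taken to be 43 and `SMCPROTO_SMC` / `SMCPROTO_SMC6` to be 0 and 1, and the function names `to_smc` and `open_stream_socket` are hypothetical.

```python
import socket

# Assumed values from the Linux headers; verify them on your system.
AF_SMC = getattr(socket, "AF_SMC", 43)
SMCPROTO_SMC = 0    # SMC on top of IPv4
SMCPROTO_SMC6 = 1   # SMC on top of IPv6


def to_smc(family: int, sock_type: int, proto: int = 0):
    """Return the (family, type, proto) to use in place of a TCP socket request.

    Only plain TCP stream sockets are rewritten; everything else is
    passed through unchanged.
    """
    is_tcp = sock_type == socket.SOCK_STREAM and proto in (0, socket.IPPROTO_TCP)
    if is_tcp and family == socket.AF_INET:
        return AF_SMC, sock_type, SMCPROTO_SMC
    if is_tcp and family == socket.AF_INET6:
        return AF_SMC, sock_type, SMCPROTO_SMC6
    return family, sock_type, proto


def open_stream_socket(family: int = socket.AF_INET) -> socket.socket:
    """Try an SMC socket first; use plain TCP if the kernel rejects AF_SMC."""
    try:
        return socket.socket(*to_smc(family, socket.SOCK_STREAM))
    except OSError:
        return socket.socket(family, socket.SOCK_STREAM)


print(to_smc(socket.AF_INET, socket.SOCK_STREAM))
```

Because an SMC socket is used through the same calls as a TCP socket (`connect`, `send`, `recv`, and so on), the rest of the application does not need to know which family was chosen, which is what makes the replacement transparent.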

SMC-R in OpenAnolis

In OpenAnolis, we are continuously enhancing and optimizing SMC in terms of performance, usage scenarios, stability, and transparent replacement. The SMC-R work has been part of OpenAnolis for over six months and has contributed over 60 patches to the upstream Linux community.

You are welcome to join the discussion and exchange ideas in OpenAnolis. Relevant information is available through the links below.

Related Links

  1. Code repository: hpn-cloud-kernel
  2. High-performance network SIG: https://openanolis.cn/sig/high-perf-network

Previous technical articles:

  1. Alibaba Cloud Releases Fourth Generation X-Dragon Architecture; SMC-R Improves Network Performance by 20%
  2. SMC-R Interpretation Series – Part 1: Transparently Improve TCP Application Network Performance on the Cloud
  3. SMC-R Interpretation Series – Part 2: SMC-R: A Hybrid Solution of TCP and RDMA