Container Service for Kubernetes: Use DSA to accelerate data streaming

Last Updated: Apr 01, 2024

Intel® Data Streaming Accelerator (DSA) is a high-performance data replication and transformation accelerator that is integrated into the Intel Sapphire Rapids (SPR) processors of Elastic Compute Service (ECS) instances that use the eighth-generation SHENLONG architecture. After you install ack-koordinator on nodes that are integrated with DSA, DSA acceleration is automatically enabled to accelerate data replication and transformation in dynamic random-access memory (DRAM), persistent memory, and data processing applications. This topic describes how to use DSA to accelerate data streaming.

Prerequisites

  • A Container Service for Kubernetes (ACK) Pro cluster that runs Kubernetes 1.18 or later is created. For more information, see Create an ACK managed cluster.

  • A kubectl client is connected to the cluster. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.

  • ack-koordinator (formerly known as ack-slo-manager) 1.2.0-ack1.2 or later is installed. For more information about how to install ack-koordinator, see ack-koordinator (FKA ack-slo-manager).

    Note

    ack-koordinator supports all features provided by resource-controller. If you use resource-controller, you must uninstall it before you install ack-koordinator. For more information about how to uninstall resource-controller, see Uninstall resource-controller.

  • Multi-NUMA instances are deployed on fifth-generation, sixth-generation, seventh-generation, or eighth-generation ECS instances of the ecs.ebmc, ecs.ebmg, ecs.ebmgn, ecs.ebmr, ecs.ebmhfc, or ecs.scc instance families.

    Note

    The nearby memory access acceleration feature functions better on the eighth-generation ECS instances of the ecs.ebmc8i.48xlarge, ecs.c8i.32xlarge, or ecs.g8i.48xlarge instance types. For more information about ECS instance families, see Overview of instance families.

Benefits

DSA is integrated into the processors of ECS instances that use the eighth-generation SHENLONG architecture. Alibaba Cloud provides relevant drivers based on Alinux 3. After you install ack-koordinator on ECS instances that are integrated with DSA, DSA acceleration is automatically enabled to transfer memory operations to DSA. This accelerates data replication and transformation, and mitigates CPU jitters during the acceleration process. DSA provides the following benefits:

  • DSA improves the data processing performance of data-intensive workloads on nodes, optimizes memory operations in the OS kernel such as memory balancing and compaction, and improves the overall memory performance of nodes.

  • DSA significantly improves the performance of the nearby memory access acceleration feature of ack-koordinator in handling individual data requests, which reduces the vCore hours consumed by workloads. The benefit grows as the usage of remote memory increases: the speed of accessing 100,000 to 1,000,000 memory pages can be improved by 30% to 200%, and CPU utilization is reduced. For example, when approximately 1.7 GB of application memory is migrated to the local NUMA node, the migration time is reduced to 31.25% and the bandwidth is increased to 320.00% of the values measured on processors that are not integrated with DSA.

    Important

    The test statistics provided in this topic are only theoretical values. The actual values may vary based on your environment.
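The migration time and bandwidth figures above describe the same measurement from two sides: for a fixed amount of migrated memory, bandwidth is inversely proportional to migration time. The following sketch shows the arithmetic. The function name bandwidth_percent is hypothetical and used only for illustration.

```shell
#!/bin/sh
# bandwidth_percent TIME_PERCENT
# For a fixed amount of data, print the relative bandwidth (in percent of
# the baseline) that corresponds to a migration time of TIME_PERCENT
# percent of the baseline.
bandwidth_percent() {
    awk -v t="$1" 'BEGIN { printf "%.2f%%\n", 100 * 100 / t }'
}

bandwidth_percent 31.25   # prints 320.00%
```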

For more information about DSA, see Intel official documentation.

Use DSA acceleration

After you install ack-koordinator on ECS instances that are integrated with DSA, DSA acceleration is automatically enabled. No additional configuration is required. For more information about the nearby memory access acceleration feature of ack-koordinator, see Use the nearby memory access acceleration feature on multi-NUMA instances.

Verify DSA acceleration

The nearby memory access acceleration feature securely migrates the memory of a core-bound application from remote non-uniform memory access (NUMA) nodes to the local NUMA node. This improves the hit ratio of local memory access and optimizes memory access for memory-intensive workloads.

Test environment

To test DSA acceleration, you must use multi-NUMA instance types, such as ecs.ebmc8i.48xlarge, ecs.c8i.32xlarge, and ecs.g8i.48xlarge. In this example, ecs.ebmc8i.48xlarge is used.
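Before you run the test, you may want to confirm that the node actually exposes more than one NUMA node. The following is a minimal sketch, assuming a Linux node that publishes its NUMA topology under /sys/devices/system/node. The function name count_numa_nodes is hypothetical.

```shell
#!/bin/sh
# count_numa_nodes [NODE_DIR]
# Print the number of nodeN directories under NODE_DIR
# (default: /sys/devices/system/node).
count_numa_nodes() {
    node_dir="${1:-/sys/devices/system/node}"
    count=0
    for d in "$node_dir"/node[0-9]*; do
        # If the glob matches nothing, the literal pattern is not a directory.
        if [ -d "$d" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

nodes=$(count_numa_nodes)
if [ "$nodes" -gt 1 ]; then
    echo "Multi-NUMA node: $nodes NUMA nodes found"
else
    echo "Not a multi-NUMA node: $nodes NUMA node(s) found"
fi
```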

Procedure

  1. Log on to the node and run the following command to check whether the processor of the node is integrated with DSA:

    ls /sys/bus/dsa

    If no error message appears and the returned directory is not empty, the processor is integrated with DSA.

  2. Deploy an application that has the nearby memory access acceleration feature enabled.

    We recommend that you deploy a memory-intensive application such as Redis.
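The check in step 1 can also be scripted, for example to run it on several nodes. The following is a minimal sketch that applies the same criterion as step 1: the /sys/bus/dsa directory must exist and must not be empty. The function name has_dsa is hypothetical.

```shell
#!/bin/sh
# has_dsa [BUS_DIR]
# Succeed only if BUS_DIR (default: /sys/bus/dsa) exists and contains
# at least one entry.
has_dsa() {
    bus_dir="${1:-/sys/bus/dsa}"
    [ -d "$bus_dir" ] && [ -n "$(ls -A "$bus_dir" 2>/dev/null)" ]
}

if has_dsa; then
    echo "The processor is integrated with DSA"
else
    echo "DSA was not detected on this node"
fi
```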

Conclusions

The following table compares the processor that has DSA acceleration enabled and the processor that has DSA disabled in terms of the migration time and CPU utilization (based on 1,000,000 memory pages) when 26.12 GB of Redis remote memory is accelerated by using the nearby memory access acceleration feature.

Scenario                     Migration time (seconds)   CPU utilization   vCore hour (seconds)
DSA acceleration disabled    9.649                      1.000             9.649
DSA acceleration enabled     4.928                      0.668             3.292

When DSA acceleration is enabled, the migration time, average CPU utilization, and vCore hours are reduced to 51.1%, 66.8%, and 34.1% of the values measured when DSA acceleration is disabled. This indicates that DSA can accelerate memory migration and reduce CPU utilization.
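The percentages can be recomputed from the table values. In the following sketch, the function names ratio_percent and vcore_seconds are hypothetical. Treating the vCore hour column as the product of migration time and CPU utilization is an assumption that happens to match the table. It is not a definition stated in this topic.

```shell
#!/bin/sh
# ratio_percent ENABLED DISABLED
# Print ENABLED as a percentage of DISABLED, rounded to one decimal place.
ratio_percent() {
    awk -v num="$1" -v den="$2" 'BEGIN { printf "%.1f%%\n", 100 * num / den }'
}

# vcore_seconds MIGRATION_TIME CPU_UTILIZATION
# Assumption: vCore time = migration time x CPU utilization.
vcore_seconds() {
    awk -v t="$1" -v u="$2" 'BEGIN { printf "%.3f\n", t * u }'
}

ratio_percent 4.928 9.649   # migration time:  51.1%
ratio_percent 0.668 1.000   # CPU utilization: 66.8%
ratio_percent 3.292 9.649   # vCore hours:     34.1%
vcore_seconds 4.928 0.668   # 3.292
```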