Remote Shuffle Service (RSS) is an E-MapReduce (EMR) extension that improves the stability and performance of Spark Shuffle and enables dynamic resource allocation in Container Service for Kubernetes (ACK) clusters. This topic describes how to associate a Spark cluster with a Shuffle Service cluster on the EMR on ACK page.
Why RSS
Spark Shuffle in ACK clusters has the following limitations:
- Storage dependency: Spark Shuffle requires local storage. On nodes that separate compute from storage, or on elastic container instances without local disks, you must purchase and attach disks, which increases cost and reduces efficiency.
- Dynamic allocation gaps: In ACK clusters, Spark 2 does not support dynamic resource allocation. Spark 3 supports it through shuffle tracking, but executors are reclaimed inefficiently.
- Write amplification: Data spills in shuffle write tasks cause write amplification.
- Connection reset: A large number of small network packets in shuffle read tasks causes connection resets.
- High disk and CPU load: Shuffle read tasks generate many small I/O requests and random reads.
- Connection scaling: With thousands of mappers (M) and reducers (N), a job needs on the order of M x N connections, which makes it difficult to run.
RSS eliminates these limitations and provides native dynamic allocation support in ACK clusters.
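To give a sense of scale for the M x N connection count mentioned in the list above, here is a minimal arithmetic sketch. It assumes the worst case in which every reducer fetches from every mapper; the function name and the sample sizes are illustrative and do not come from EMR or Spark.

```python
def shuffle_connections(mappers: int, reducers: int) -> int:
    """Worst-case number of mapper-to-reducer fetches (M x N).

    Assumes every reducer reads one block from every mapper.
    """
    if mappers < 0 or reducers < 0:
        raise ValueError("mappers and reducers must be non-negative")
    return mappers * reducers


if __name__ == "__main__":
    # Illustrative job sizes only; not measurements from a real cluster.
    for m, n in [(100, 100), (1_000, 1_000), (5_000, 5_000)]:
        print(f"M={m:>5} N={n:>5} -> {shuffle_connections(m, n):,} connections")
```

The count grows quadratically when M and N grow together, so a job with 10 times as many mappers and reducers needs roughly 100 times as many fetches.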
Prerequisites
Before you begin, make sure that you have:
- A Spark cluster created on the EMR on ACK page. For more information, see Step 2: Create a cluster.
- A Shuffle Service cluster created on the EMR on ACK page. For more information, see Step 2: Create a cluster.
- Both clusters in the same ACK cluster. Cross-ACK-cluster associations are not supported.
- Matching major EMR versions on both clusters. A version mismatch causes compatibility issues that can prevent jobs from running. You can check the version on the Cluster Details tab for each cluster.
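The last two prerequisites can be expressed as a small pre-check. The following is a hypothetical sketch, not an EMR API: the `EmrCluster` type, its fields, and the assumption that a version string looks like `EMR-5.6.0-ack` are all illustrative. Read the real values from the Cluster Details tab of each cluster.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class EmrCluster:
    """Hypothetical record of values copied from the Cluster Details tab."""
    name: str
    ack_cluster_id: str
    emr_version: str  # assumed format, for example "EMR-5.6.0-ack"


def major_version(emr_version: str) -> int:
    """Return the first numeric component of a version string."""
    match = re.search(r"(\d+)\.\d+", emr_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {emr_version!r}")
    return int(match.group(1))


def association_problems(spark: EmrCluster, rss: EmrCluster) -> list[str]:
    """List the prerequisites that the two clusters fail to meet."""
    problems = []
    if spark.ack_cluster_id != rss.ack_cluster_id:
        problems.append("clusters are in different ACK clusters")
    if major_version(spark.emr_version) != major_version(rss.emr_version):
        problems.append("major EMR versions differ")
    return problems
```

An empty list means both checks pass. For example, two clusters in the same ACK cluster with versions `EMR-5.6.0-ack` and `EMR-5.8.0-ack` share major version 5 and produce no problems under these assumptions.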
Associate a Spark cluster with a Shuffle Service cluster
- Log on to the EMR console. In the left-side navigation pane, click EMR on ACK.
- On the EMR on ACK page, find the Spark cluster and click its name in the Cluster ID/Name column.
- On the Cluster Details tab, go to the Basic Information section and click Associate Now to the right of Associate RSS Cluster.
- On the Service Details tab, go to the Associated Cluster section and click Add.
- In the Associated Cluster dialog box, select your Shuffle Service cluster from the Cluster drop-down list and click Associate.
What's next
- (Optional) Configure RSS parameters. For more information, see Celeborn parameters.