E-MapReduce: Create a Data Science cluster

Last Updated: Mar 26, 2026

Create an E-MapReduce (EMR) Data Science cluster on an existing Container Service for Kubernetes (ACK) cluster to run big data ETL workloads and GPU-accelerated AI training jobs.

Prerequisites

Before you begin, ensure that you have:

Important

Each ACK cluster can be associated with only one Data Science cluster.

Warning

Creating a Data Science cluster overwrites the following namespaces in the associated ACK cluster: anonymous, cert-manager, fluid-system, ingress-nginx, istio-system, knative-serving, kubeflow, kubernetes-dashboard, and monitoring.
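
Because the overwrite is destructive, it can help to check the ACK cluster for these namespaces before you create the Data Science cluster. The following is a minimal sketch, not part of the EMR tooling: it assumes that `kubectl` is installed and that its current context points at the target ACK cluster. The function names are hypothetical; only the namespace list comes from the warning above.

```python
import json
import subprocess

# Namespaces that the warning above says are overwritten.
RESERVED_NAMESPACES = frozenset({
    "anonymous", "cert-manager", "fluid-system", "ingress-nginx",
    "istio-system", "knative-serving", "kubeflow",
    "kubernetes-dashboard", "monitoring",
})


def parse_namespace_names(kubectl_json: str) -> list[str]:
    """Extract namespace names from `kubectl get namespaces -o json` output."""
    doc = json.loads(kubectl_json)
    return [item["metadata"]["name"] for item in doc.get("items", [])]


def find_conflicts(existing: list[str]) -> list[str]:
    """Return the existing namespaces that would be overwritten, sorted."""
    return sorted(RESERVED_NAMESPACES.intersection(existing))


def list_cluster_namespaces() -> list[str]:
    """Query the cluster in the current kubectl context (assumption: kubectl is on PATH)."""
    result = subprocess.run(
        ["kubectl", "get", "namespaces", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return parse_namespace_names(result.stdout)
```

Calling `find_conflicts(list_cluster_namespaces())` returns the namespaces that already exist and would be overwritten. An empty list means none of the listed namespaces is present.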

Create a Data Science cluster

  1. Log on to the EMR console. In the left-side navigation pane, click EMR on ACK.

  2. On the EMR on ACK page, click Create Cluster.

  3. On the E-MapReduce on ACK page, configure the cluster parameters. See Parameter reference for details on each field.

  4. Click Create.

The cluster is ready when its status changes to Running.
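
If you automate cluster creation, you can poll for the Running status instead of watching the console. The following is a rough sketch of such a loop. It deliberately does not call any EMR API: `get_status` is a hypothetical callable that you supply (for example, a wrapper around whatever SDK or CLI call you use to read the cluster status), and the timeout and interval defaults are arbitrary assumptions.

```python
import time
from typing import Callable


def wait_until_running(
    get_status: Callable[[], str],
    timeout_s: float = 1800.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Block until get_status() returns "Running" or the timeout expires.

    get_status is a caller-supplied (hypothetical) function that returns the
    current cluster status as a string. Raises TimeoutError on expiry.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == "Running":
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"cluster not Running after {timeout_s}s (last status: {status})")
        # Wait before the next check to avoid hammering the status source.
        sleep(interval_s)
```

The `sleep` parameter exists only so that the loop can be exercised in tests without real delays.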

Parameter reference

Parameter | Description
--- | ---
Region | The region where the cluster is created. The region cannot be changed after the cluster is created.
Cluster type | Select Data Science. Data Science clusters support offline ETL with Hive and Spark, and TensorFlow model training using a heterogeneous CPU+GPU computing framework with NVIDIA GPU-accelerated deep learning algorithms. This cluster type is suited for big data and AI workloads.
Product version | The EMR version to deploy. The latest version is selected by default.
Component version | Read-only. Displays the components, and their versions, that are included in the selected cluster type.
ACK Cluster | Select an existing ACK cluster, or go to the ACK console to create one.
Configure Dedicated Nodes | (Optional) Add taints and labels to a node or node pool to reserve it exclusively for EMR workloads.
Cluster name | A name for the cluster. The name must be 1–64 characters long and can contain letters, digits, hyphens (-), and underscores (_).
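
The Cluster name rule can be checked before you submit the form. The following is a small sketch of such a check. It assumes that "letters" means ASCII letters only; the console may accept a wider character set. The function name is hypothetical.

```python
import re

# 1 to 64 characters: ASCII letters, digits, hyphens, underscores.
# Assumption: "letters" means A-Z and a-z only.
_CLUSTER_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_valid_cluster_name(name: str) -> bool:
    """Return True if name satisfies the documented length and character rule."""
    return _CLUSTER_NAME_RE.fullmatch(name) is not None
```

For example, `is_valid_cluster_name("emr-ds_01")` returns `True`, while an empty string or a name that contains a space returns `False`.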