This topic describes how to use an Alibaba Cloud account to log on to the E-MapReduce (EMR) console and create a cluster on the EMR on ACK page.
Prerequisites
The AliyunOSSFullAccess and AliyunDLFFullAccess policies are attached to the Alibaba Cloud account. For more information, see Attach policies to a RAM role.
A Container Service for Kubernetes (ACK) cluster is created. For more information, see Create an ACK dedicated cluster or Create an ACK managed cluster.
Before you create an ACK cluster, take note of the following limits:
Kubernetes version: Only Kubernetes versions 1.22 to 1.24 are supported.
vCPU: The number of vCPUs must be greater than or equal to 16.
Memory: The memory size must be greater than or equal to 64 GiB.
Instance type:
Only general-purpose, compute-optimized, and memory-optimized instances are available.
Only the ecs.g5, ecs.g6, and ecs.g7 instance types and instance types with higher specifications are supported.
A node pool is created. For more information, see Create and manage a node pool.
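The version, vCPU, and memory limits listed in the prerequisites can be checked before you open the console. The following is a minimal sketch, not an official tool: the AckClusterSpec class and the check_prerequisites function are hypothetical names introduced here, the values are assumed to be looked up manually in the ACK console, and the instance type limits are not checked.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Limits copied from the prerequisites in this topic.
MIN_K8S = (1, 22)
MAX_K8S = (1, 24)
MIN_VCPUS = 16
MIN_MEMORY_GIB = 64


@dataclass
class AckClusterSpec:
    """Hypothetical container for values that you look up yourself."""
    kubernetes_version: str  # assumed format, for example "1.24.6-aliyun.1"
    vcpus: int
    memory_gib: int


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as 'v1.22.3'."""
    core = version.lstrip("v").split("-")[0]
    major, minor = core.split(".")[:2]
    return int(major), int(minor)


def check_prerequisites(spec: AckClusterSpec) -> List[str]:
    """Return a list of human-readable problems. An empty list means OK."""
    problems = []
    if not (MIN_K8S <= parse_major_minor(spec.kubernetes_version) <= MAX_K8S):
        problems.append("Kubernetes version must be from 1.22 to 1.24")
    if spec.vcpus < MIN_VCPUS:
        problems.append("at least 16 vCPUs are required")
    if spec.memory_gib < MIN_MEMORY_GIB:
        problems.append("at least 64 GiB of memory is required")
    return problems


if __name__ == "__main__":
    spec = AckClusterSpec("1.24.6-aliyun.1", vcpus=16, memory_gib=64)
    print(check_prerequisites(spec) or "all checked limits are satisfied")
```

Returning a list of problems instead of raising on the first failure lets you see every limit that is not met in one pass.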
Precautions
You can associate an ACK cluster with only one Data Science cluster.
Procedure
Log on to the EMR console. In the left-side navigation pane, click EMR on ACK.
On the EMR on ACK page, click Create Cluster.
On the E-MapReduce on ACK page, configure the parameters. The following table describes the parameters.
Parameter: Description
Region: The region in which you want to create a cluster. You cannot change the region after the cluster is created.
Cluster Type: Data Science. Data Science clusters are commonly used in big data and AI scenarios. Data Science clusters support the offline extract, transform, and load (ETL) of big data based on Hive and Spark, and TensorFlow model training. You can use the CPU+GPU heterogeneous computing framework and deep learning algorithms supported by NVIDIA GPUs to run computing jobs more efficiently.
Product Version: The version of EMR. By default, the latest version is selected.
Component Version: Displays the type and version of the components that are deployed in a cluster of the specified type.
ACK Cluster: Select an existing ACK cluster or create an ACK cluster in the ACK console.
Note: The following namespaces are used by a Data Science cluster: anonymous, cert-manager, fluid-system, ingress-nginx, istio-system, knative-serving, kubeflow, kubernetes-dashboard, and monitoring. If your ACK cluster already contains any of these namespaces, they are overwritten when the Data Science cluster is created and associated with the ACK cluster.
Configure Dedicated Nodes: Click Configure Dedicated Nodes to dedicate a node or node pool to EMR. You dedicate a node or node pool by adding taints and labels to it. This way, the node or node pool can be used only for EMR.
Cluster Name: The name of the cluster. The name must be 1 to 64 characters in length and can contain only letters, digits, hyphens (-), and underscores (_).
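Two of the parameters above lend themselves to a quick check before you click Create: the cluster name rule and the namespace note for the ACK cluster. The following is a minimal sketch under stated assumptions: the function names are hypothetical, "letters" is assumed to mean ASCII letters only, and the list of existing namespaces is assumed to be collected separately, for example from the output of kubectl get namespaces.

```python
import re
from typing import Iterable, List

# Namespaces that, according to the note in this topic, are overwritten
# when a Data Science cluster is associated with the ACK cluster.
RESERVED_NAMESPACES = frozenset({
    "anonymous", "cert-manager", "fluid-system", "ingress-nginx",
    "istio-system", "knative-serving", "kubeflow",
    "kubernetes-dashboard", "monitoring",
})

# 1 to 64 characters: letters, digits, hyphens, and underscores.
# Assumption: only ASCII letters are accepted.
CLUSTER_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_valid_cluster_name(name: str) -> bool:
    """Check a candidate cluster name against the documented rule."""
    return CLUSTER_NAME_PATTERN.fullmatch(name) is not None


def namespaces_at_risk(existing: Iterable[str]) -> List[str]:
    """Return the existing namespaces that would be overwritten, sorted."""
    return sorted(RESERVED_NAMESPACES.intersection(existing))


if __name__ == "__main__":
    print(is_valid_cluster_name("emr-ds_01"))
    print(namespaces_at_risk(["default", "kube-system", "monitoring"]))
```

If namespaces_at_risk returns a non-empty list, consider backing up the workloads in those namespaces or choosing a different ACK cluster.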
Click Create.
When the status of the cluster changes to Running, the cluster is created.