EMR does not provide an API operation that enables auto scaling by time or by load. However, you can define your own trigger conditions and call EMR API operations to scale out a cluster when those conditions are met.
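As a rough illustration of a load-based trigger condition, the sketch below decides how many task instances to request. The function name, the metric (pending containers), and the limits are all hypothetical and are not part of the EMR API. The only output is a number that your own code could pass as NodeCount when it calls the scale-out operation shown in the Examples section.

```python
import math


def task_nodes_to_add(pending_containers, containers_per_node, current_nodes, max_nodes):
    """Return how many task nodes a load-based trigger would request.

    All parameters are hypothetical inputs that your own monitoring
    would supply; none of them come from the EMR API.
    """
    if pending_containers <= 0 or containers_per_node <= 0:
        return 0
    # Nodes needed to absorb the backlog, rounded up.
    wanted = math.ceil(pending_containers / containers_per_node)
    # Never exceed the self-imposed upper limit for the host group.
    headroom = max(max_nodes - current_nodes, 0)
    return min(wanted, headroom)


if __name__ == "__main__":
    # 30 pending containers, 8 per node, 2 nodes running, limit of 10 nodes.
    print(task_nodes_to_add(30, 8, 2, 10))
```

A scheduler such as cron could run this check periodically and call the scale-out operation only when the result is greater than zero.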
Prerequisites
- An EMR cluster is created. For more information, see Create a cluster.
- The cluster ID is obtained. For more information, see View the cluster list and cluster details.
- An AccessKey pair is created. For more information, see Obtain an AccessKey pair.
- The corresponding SDK has been obtained. For more information, see Download SDKs and Install SDK.
Scenarios
You have created a Hadoop cluster, which consists of master, core, and task instances. You want to use an EMR API to add task instances.
Basic configurations of the cluster and the instances to be added:
- Cluster name: emr_openapi_demo, cluster ID: C-69CB0546800F****.
- Instances to be added: four task instances, ecs.c5.xlarge, one 120 GiB Enhanced SSD as the system disk, four 80 GiB ultra disks as data disks.
- HostGroupId of the task host group: G-C73605CF4382****.
Note: When you scale out a cluster, you must obtain the HostGroupId of the target host group. You can call an EMR API operation to obtain the HostGroupId from the returned cluster information.
After you scale out the cluster, perform the following steps to view the added instances:
- Log on to the Alibaba Cloud EMR console.
- Click the Cluster Management tab.
- Find the target cluster and click Details in the Actions column.
- In the left-side navigation pane, click Instances.
- On the Instances page, view the added instances.
Examples
This section provides Java and Python sample code.
- Java
- Obtain HostGroupId of the target host group based on the region ID and cluster ID.
import com.aliyuncs.DefaultAcsClient;
import com.aliyuncs.IAcsClient;
import com.aliyuncs.exceptions.ClientException;
import com.aliyuncs.exceptions.ServerException;
import com.aliyuncs.profile.DefaultProfile;
import com.google.gson.Gson;
import java.util.*;
import com.aliyuncs.emr.model.v20160408.*;

public class DescribeClusterV2 {

    public static void main(String[] args) {
        DefaultProfile profile = DefaultProfile.getProfile("cn-hangzhou", "<accessKeyId>", "<accessSecret>");
        IAcsClient client = new DefaultAcsClient(profile);

        DescribeClusterV2Request request = new DescribeClusterV2Request();
        request.setRegionId("cn-hangzhou");
        request.setId("C-69CB0546800F****");

        try {
            DescribeClusterV2Response response = client.getAcsResponse(request);
            System.out.println(new Gson().toJson(response));
        } catch (ServerException e) {
            e.printStackTrace();
        } catch (ClientException e) {
            System.out.println("ErrCode:" + e.getErrCode());
            System.out.println("ErrMsg:" + e.getErrMsg());
            System.out.println("RequestId:" + e.getRequestId());
        }
    }
}
Find HostGroupId of the target host group in the returned cluster information.
For example, if the cluster information is returned in JSON format, find HostGroupId by following this path: ClusterInfo -> HostGroupList -> HostGroup -> HostGroupType=TASK -> HostGroupId
- Scale out the cluster.
import com.aliyuncs.DefaultAcsClient;
import com.aliyuncs.IAcsClient;
import com.aliyuncs.exceptions.ClientException;
import com.aliyuncs.exceptions.ServerException;
import com.aliyuncs.profile.DefaultProfile;
import com.google.gson.Gson;
import java.util.*;
import com.aliyuncs.emr.model.v20160408.*;

public class ResizeClusterV2 {

    public static void main(String[] args) {
        DefaultProfile profile = DefaultProfile.getProfile("cn-hangzhou", "<accessKeyId>", "<accessSecret>");
        IAcsClient client = new DefaultAcsClient(profile);

        ResizeClusterV2Request request = new ResizeClusterV2Request();
        request.setRegionId("cn-hangzhou");
        request.setClusterId("C-69CB0546800F****");

        List<ResizeClusterV2Request.HostGroup> hostGroupList = new ArrayList<ResizeClusterV2Request.HostGroup>();

        ResizeClusterV2Request.HostGroup hostGroup1 = new ResizeClusterV2Request.HostGroup();
        hostGroup1.setClusterId("C-69CB0546800F****");
        hostGroup1.setHostGroupId("G-C73605CF4382****");
        hostGroup1.setHostGroupName("task_group");
        hostGroup1.setHostGroupType("TASK");
        hostGroup1.setNodeCount(4);
        hostGroup1.setInstanceType("ecs.c5.xlarge");
        hostGroupList.add(hostGroup1);
        request.setHostGroups(hostGroupList);

        try {
            ResizeClusterV2Response response = client.getAcsResponse(request);
            System.out.println(new Gson().toJson(response));
        } catch (ServerException e) {
            e.printStackTrace();
        } catch (ClientException e) {
            System.out.println("ErrCode:" + e.getErrCode());
            System.out.println("ErrMsg:" + e.getErrMsg());
            System.out.println("RequestId:" + e.getRequestId());
        }
    }
}
- Python
- Obtain HostGroupId of the target host group based on the region ID and cluster ID.
#!/usr/bin/env python
# coding=utf-8

from aliyunsdkcore.client import AcsClient
from aliyunsdkcore.acs_exception.exceptions import ClientException
from aliyunsdkcore.acs_exception.exceptions import ServerException
from aliyunsdkemr.request.v20160408.DescribeClusterV2Request import DescribeClusterV2Request

client = AcsClient('<accessKeyId>', '<accessSecret>', 'cn-hangzhou')

request = DescribeClusterV2Request()
request.set_accept_format('json')
request.set_Id("C-69CB0546800F****")

response = client.do_action_with_exception(request)
# python2: print(response)
print(str(response, encoding='utf-8'))
Find HostGroupId of the target host group in the returned cluster information.
For example, if the cluster information is returned in JSON format, find HostGroupId by following this path: ClusterInfo -> HostGroupList -> HostGroup -> HostGroupType=TASK -> HostGroupId
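The lookup along that path can be sketched in a few lines of Python. The helper name find_host_group_id is hypothetical, and the sample response is a trimmed, made-up stand-in that contains only the fields named in the path. A real DescribeClusterV2 response carries many more fields, and the CORE host group ID shown here is invented for illustration.

```python
import json


def find_host_group_id(response_body, host_group_type="TASK"):
    """Return the HostGroupId of the first host group of the given type.

    Follows ClusterInfo -> HostGroupList -> HostGroup and matches on
    HostGroupType. Returns None if no such host group exists.
    """
    if isinstance(response_body, bytes):
        # The SDK returns bytes in Python 3.
        response_body = response_body.decode("utf-8")
    cluster = json.loads(response_body)
    groups = (
        cluster.get("ClusterInfo", {})
        .get("HostGroupList", {})
        .get("HostGroup", [])
    )
    for group in groups:
        if group.get("HostGroupType") == host_group_type:
            return group.get("HostGroupId")
    return None


# Trimmed, hypothetical response used only to exercise the helper.
SAMPLE_RESPONSE = json.dumps({
    "ClusterInfo": {
        "HostGroupList": {
            "HostGroup": [
                {"HostGroupType": "CORE", "HostGroupId": "G-EXAMPLECORE****"},
                {"HostGroupType": "TASK", "HostGroupId": "G-C73605CF4382****"},
            ]
        }
    }
})

if __name__ == "__main__":
    print(find_host_group_id(SAMPLE_RESPONSE))
```

In practice, the value returned by client.do_action_with_exception(request) could be passed to the helper in place of SAMPLE_RESPONSE.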
- Scale out the cluster.
#!/usr/bin/env python
# coding=utf-8

from aliyunsdkcore.client import AcsClient
from aliyunsdkcore.acs_exception.exceptions import ClientException
from aliyunsdkcore.acs_exception.exceptions import ServerException
from aliyunsdkemr.request.v20160408.ResizeClusterV2Request import ResizeClusterV2Request

client = AcsClient('<accessKeyId>', '<accessSecret>', 'cn-hangzhou')

request = ResizeClusterV2Request()
request.set_accept_format('json')
request.set_ClusterId("C-69CB0546800F****")
request.set_HostGroups([
    {
        "ClusterId": "C-69CB0546800F****",
        "HostGroupId": "G-C73605CF4382****",
        "HostGroupName": "task_group",
        "HostGroupType": "TASK",
        "NodeCount": 4,
        "InstanceType": "ecs.c5.xlarge"
    }
])

response = client.do_action_with_exception(request)
# python2: print(response)
print(str(response, encoding='utf-8'))
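To avoid hard-coding the cluster ID and HostGroupId, the HostGroups list from the sample above could be assembled by a small helper. This is only a sketch: build_host_groups is a hypothetical name, the keys are the ones used in the sample, and the check on node_count is an added assumption rather than documented API behavior.

```python
def build_host_groups(cluster_id, host_group_id, node_count, instance_type,
                      host_group_name="task_group", host_group_type="TASK"):
    """Build the HostGroups list in the shape used by the sample request."""
    if node_count < 1:
        # Assumption: a scale-out request should add at least one node.
        raise ValueError("node_count must be at least 1")
    return [
        {
            "ClusterId": cluster_id,
            "HostGroupId": host_group_id,
            "HostGroupName": host_group_name,
            "HostGroupType": host_group_type,
            "NodeCount": node_count,
            "InstanceType": instance_type,
        }
    ]


if __name__ == "__main__":
    # Values taken from the scenario in this topic.
    print(build_host_groups("C-69CB0546800F****", "G-C73605CF4382****",
                            4, "ecs.c5.xlarge"))
```

The result could then be passed to request.set_HostGroups(...) in place of the literal list.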