Creates an E-MapReduce (EMR) cluster.

Note: If you create an EMR cluster for the first time after 17:00 on December 19, 2022 (UTC+8), you cannot use this operation to create the cluster.

Debugging

OpenAPI Explorer automatically calculates the signature value. For your convenience, we recommend that you call this operation in OpenAPI Explorer. OpenAPI Explorer dynamically generates the sample code of the operation for different SDKs.

Request parameters

Parameter | Type | Required | Example | Description
Action | String | Yes | CreateClusterV2

The operation that you want to perform. Set the value to CreateClusterV2.

BootstrapAction.N.Name | String | Yes | init_script

The name of the bootstrap action.

BootstrapAction.N.Path | String | Yes | oss://bucket/path

The OSS path of the bootstrap action script.

ClusterType | String | Yes | HADOOP

The type of the cluster. Valid values:

  • HADOOP
  • KAFKA
  • DATA_SCIENCE
  • DRUID
  • FLINK
  • GATEWAY
  • CLICKHOUSE
  • SHUFFLE_SERVICE
  • EMR_STUDIO
Config.N.ConfigKey | String | Yes | fs.trash.interval

The key of custom configuration item N.

Config.N.ConfigValue | String | Yes | 60

The value of custom configuration item N.

Config.N.FileName | String | Yes | yarn-site

The name of the file that contains custom configuration item N.

Config.N.ServiceName | String | Yes | YARN

The name of the service for which custom configuration item N is configured. Specify the entire name in uppercase.

EmrVer | String | Yes | EMR-3.35.0

The version of EMR.

Note: You can view the list of available EMR versions when you create a cluster in the console.
HostGroup.N.DiskCapacity | Integer | Yes | 80

The data disk capacity of host group N. Unit: GB.

HostGroup.N.DiskCount | Integer | Yes | 4

The number of data disks in host group N.

HostGroup.N.DiskType | String | Yes | CLOUD_EFFICIENCY

The data disk type of host group N. Valid values:

  • CLOUD_EFFICIENCY: ultra disk.
  • CLOUD_SSD: standard SSD.
  • LOCAL_DISK: local disk. If you use an ECS instance type that comes with local disks, you must set DiskType to LOCAL_DISK.
  • CLOUD: basic cloud disk (not recommended).
HostGroup.N.HostGroupName | String | Yes | Master instance group

The name of the host group.

HostGroup.N.HostGroupType | String | Yes | MASTER

The type of the host group. Valid values:

  • MASTER: master node group
  • CORE: core node group
  • TASK: task node group
Note: You can configure only one MASTER group and only one CORE group.
HostGroup.N.InstanceType | String | Yes | ecs.g6.2xlarge

The instance type. For more information, see Overview of instance families or call the DescribeInstanceTypes operation to query the most recent instance type list.

HostGroup.N.NodeCount | Integer | Yes | 2

The number of nodes in host group N.

HostGroup.N.SysDiskCapacity | Integer | Yes | 80

The system disk capacity of host group N. Unit: GB.

HostGroup.N.SysDiskType | String | Yes | CLOUD_SSD

The system disk type of host group N. Valid values:

  • CLOUD_EFFICIENCY: ultra disk.
  • CLOUD_SSD: standard SSD.
  • CLOUD: basic cloud disk (not recommended).
Name | String | Yes | bi_hadoop

The name of the cluster. The name must be 1 to 64 characters in length, and can contain only letters, digits, hyphens (-), and underscores (_).
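The naming rule can be checked on the client before the request is sent. The following is a minimal sketch; the helper name `is_valid_cluster_name` is hypothetical, and it assumes that "letters" means ASCII letters, which the description does not state explicitly.

```python
import re

# Assumption: "letters" means ASCII letters only.
_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")

def is_valid_cluster_name(name: str) -> bool:
    """Return True if the name is 1 to 64 characters of letters, digits, '-' or '_'."""
    return _NAME_RE.fullmatch(name) is not None
```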

RegionId | String | Yes | cn-hangzhou

The region ID of the cluster. You can call the DescribeRegions operation to query the most recent region list.

UserInfo.N.Password | String | Yes | pwd

The password of the Knox account.

UserInfo.N.UserId | String | Yes | 123456789

The RAM user ID of the Knox account.

UserInfo.N.UserName | String | Yes | username

The username of the Knox account.

ZoneId | String | Yes | cn-hangzhou-b

The zone ID of the cluster. You can call the DescribeZones operation to query the most recent zone list.

SecurityGroupId | String | No | sg-bp1id7ajv83kmqwq****

The ID of the security group. You can enter an existing security group ID. If the security group does not exist, a security group is automatically created.

Note: You must specify at least one of SecurityGroupId and SecurityGroupName.
IsOpenPublicIp | Boolean | No | true

Specifies whether to enable a public IP address for the master node. Valid values:

  • true: enables the public IP address. If you enable this feature, the default bandwidth is 8 MB.
  • false: disables public IP addresses.
SecurityGroupName | String | No | emr-sg

The name of the security group to be created. If you do not specify SecurityGroupId, a security group is created with the value of this parameter as its name. After the cluster is created, you can view the ID of the security group in cluster details. This security group will have the default security group policy: open all ports in the outbound direction.

Note: You must specify at least one of SecurityGroupId and SecurityGroupName.
ChargeType | String | No | PostPaid

The billing method of the cluster. Valid values:

  • PostPaid: pay-as-you-go
  • PrePaid: subscription
Period | Integer | No | 2

The subscription period. Valid values: 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 24, and 36. Unit: months. This parameter is required when the ChargeType parameter is set to PrePaid.
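The dependency between ChargeType and Period can be checked before the call. This is a minimal sketch based only on the values listed above; the function name `check_billing` is hypothetical and is not part of any SDK.

```python
from typing import Optional

# Subscription periods (in months) listed in the Period description.
VALID_PERIODS = {1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 24, 36}

def check_billing(charge_type: str, period: Optional[int] = None) -> None:
    """Raise ValueError if ChargeType and Period contradict the rules above."""
    if charge_type not in ("PostPaid", "PrePaid"):
        raise ValueError(f"unknown ChargeType: {charge_type!r}")
    # Period is required (and restricted) only for subscription clusters.
    if charge_type == "PrePaid" and period not in VALID_PERIODS:
        raise ValueError(f"invalid Period for PrePaid: {period!r}")
```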

AutoRenew | Boolean | No | false

Specifies whether to enable auto-renewal for the cluster. Valid values:

  • true: enables auto-renewal.
  • false: does not enable auto-renewal.
AutoPayOrder | Boolean | No | true

Specifies whether to enable automatic payment. This parameter takes effect only when ChargeType is set to PrePaid. Valid values:

  • true: The system automatically pays for the order.
  • false: Automatic payment is disabled.
VpcId | String | No | vpc-bp1l4urd87xlh7i4b****

The ID of the virtual private cloud (VPC) to which the instances belong.

VSwitchId | String | No | vsw-bp10tvjyc77psy0z5****

The vSwitch ID of the cluster.

NetType | String | No | VPC

The network type of the cluster. Set the value to VPC.

UserDefinedEmrEcsRole | String | No | AliyunEmrEcsDefaultRole

The role that is assigned to ECS instances to access OSS and other Alibaba Cloud services. Valid values:

  • AliyunECSInstanceForEMRRole: 3.x series >= EMR-3.33.0, 4.x series >= EMR-4.6.0, 5.x series >= EMR-5.1.0.
  • AliyunEmrEcsDefaultRole: 3.x series < EMR-3.33.0, 4.x series < EMR-4.6.0, 5.x series < EMR-5.1.0.
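
The version thresholds above can be expressed as a small helper. This is a minimal sketch; the function name `default_ecs_role` is hypothetical, and it assumes EmrVer strings of the form EMR-x.y.z. Series other than 3.x, 4.x, and 5.x are not covered by the list above, so the sketch raises an error for them.

```python
import re

# Minimum version per series that uses AliyunECSInstanceForEMRRole,
# taken from the list above.
_NEW_ROLE_MIN = {3: (3, 33, 0), 4: (4, 6, 0), 5: (5, 1, 0)}

def default_ecs_role(emr_ver: str) -> str:
    """Return the role name that matches an EmrVer such as 'EMR-3.35.0'."""
    m = re.fullmatch(r"EMR-(\d+)\.(\d+)\.(\d+)", emr_ver)
    if m is None:
        raise ValueError(f"unexpected EmrVer format: {emr_ver!r}")
    version = tuple(int(part) for part in m.groups())
    threshold = _NEW_ROLE_MIN.get(version[0])
    if threshold is None:
        raise ValueError(f"series {version[0]}.x is not covered by this sketch")
    # Tuples compare element by element, so (3, 35, 0) >= (3, 33, 0).
    if version >= threshold:
        return "AliyunECSInstanceForEMRRole"
    return "AliyunEmrEcsDefaultRole"
```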
OptionSoftWareList.N | RepeatList | No | ["ZOOKEEPER","LIVY"]

The optional service that is supported. The service name must be uppercase. You can view the supported optional services on the Software Configuration page of the created cluster in the EMR console.

Note: You can specify up to 20 service names. Service names beyond this limit are discarded.
HighAvailabilityEnable | Boolean | No | true

Specifies whether to enable high availability for the cluster. Valid values:

  • true: enables high availability.
  • false: disables high availability.

A high-availability cluster has at least two master nodes. If high availability is disabled, high reliability cannot be guaranteed.

UseLocalMetaDb | Boolean | No | true

Specifies whether to use the built-in MySQL database of the cluster as the Hive metadatabase. Valid values:

  • true: The local Hive metadatabase is used.
  • false: The local Hive metadatabase is disabled.

The built-in database runs on a single MySQL node and cannot guarantee high reliability.

MasterPwd | String | No | pwd

The root password of the master node. The password must be 8 to 30 characters in length and contain at least three of the following character types: uppercase letters, lowercase letters, digits, and special characters.
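
The password rule can be sketched as a client-side check. The helper name `is_valid_master_pwd` is hypothetical. The description does not list which special characters are accepted, so the sketch assumes ASCII punctuation.

```python
import string

def is_valid_master_pwd(pwd: str) -> bool:
    """Check the length (8 to 30) and that at least three character types appear."""
    if not 8 <= len(pwd) <= 30:
        return False
    kinds = [
        any(c in string.ascii_uppercase for c in pwd),
        any(c in string.ascii_lowercase for c in pwd),
        any(c in string.digits for c in pwd),
        # Assumption: "special characters" means ASCII punctuation.
        any(c in string.punctuation for c in pwd),
    ]
    return sum(kinds) >= 3
```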

KeyPairName | String | No | test_pair

The name of the ECS key pair that is used for password-free logon.

MetaStoreType | String | No | local

The type of the Hive metadata service. Valid values:

  • local: the MySQL service in the cluster. A single MySQL node does not guarantee high availability.
  • user_rds: self-managed ApsaraDB RDS service.
  • dlf: Data Lake Formation (DLF) metadata service.
MetaStoreConf | String | No | {"dbUrl":"jdbc:mysql://rm-xxxxxxxxxx.mysql.rds.aliyuncs.com/hmsdata?createDatabaseIfNotExist=true&characterEncoding=UTF-8","dbUserName":"xxxxxxx","dbPassword":"xxxxxx"}

The configuration of the unified metadata service. The value depends on MetaStoreType:

  • If MetaStoreType is set to local or dlf, do not specify this parameter.
  • If MetaStoreType is set to user_rds, specify a JSON string in the following format:

    {"dbUrl":"jdbc:mysql://rm- *.mysql.rds.aliyuncs.com/hmsdata?createDatabaseIfNotExist=true&characterEncoding=UTF-8","dbUserName":"name *","dbPassword":"pws "}.
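
Because MetaStoreConf is a JSON string, it is safer to serialize it than to concatenate it by hand. The following is a minimal sketch; the function name `build_meta_store_conf` and the argument values in the usage line are placeholders, and only the three keys shown above are taken from this page.

```python
import json

def build_meta_store_conf(db_url: str, user: str, password: str) -> str:
    """Serialize the three keys shown above into a compact JSON string."""
    conf = {"dbUrl": db_url, "dbUserName": user, "dbPassword": password}
    # Compact separators keep the value short when it is sent as a query parameter.
    return json.dumps(conf, separators=(",", ":"))

# Placeholder values for illustration only.
meta_store_conf = build_meta_store_conf(
    "jdbc:mysql://example-host/hmsdata?createDatabaseIfNotExist=true&characterEncoding=UTF-8",
    "example_user",
    "example_password",
)
```

The resulting string still needs to be URL-encoded when it is placed in the request query.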

ClickHouseConf | String | No | None

A reserved parameter. You do not need to specify this parameter.

ExtraAttributes | String | No | None

A reserved parameter. You do not need to specify this parameter.

HostComponentInfo.N.HostName | String | No | emr-header-1

The hostname of the host on which the component is deployed.

HostComponentInfo.N.ServiceName | String | No | HDFS

The name of the service to which the component belongs. The service name is in uppercase letters, such as HDFS and ZOOKEEPER. You can view the available services in the service list of the Software Configuration step when you create a cluster in the EMR console.

HostComponentInfo.N.ComponentNameList.N | RepeatList | No | NAMENODE

The list of component names, such as NAMENODE.

ServiceInfo.N.ServiceName | String | No | HDFS

The name of the service. The service name is in uppercase letters, such as HDFS and ZOOKEEPER. You can view the available services in the service list of the Software Configuration step when you create a cluster in the EMR console.

ServiceInfo.N.ServiceVersion | String | No | 2.3.3-1.0.2

The internal version of the service.

PromotionInfo.N.PromotionOptionNo | String | No | 11080***0000

The coupon ID.

PromotionInfo.N.PromotionOptionCode | String | No | youhui_quan

The type of the coupon. This parameter is optional. Default value: youhui_quan.

PromotionInfo.N.ProductCode | String | No | ecs

The product to which the coupon is applied. Valid values:

  • emr: applies to EMR orders.
  • ecs: applies to ECS orders.
DepositType | String | No | HALF_MANAGED

The hosting type of the cluster. Set the value to HALF_MANAGED.

HALF_MANAGED (semi-managed) indicates that the EMR cluster uses a user-side ECS or ACK cluster.

MachineType | String | No | ECS

The type of IaaS resource on which the cluster is built.

HostGroup.N.ClusterId | String | No | None

A reserved parameter. You do not need to specify this parameter.

HostGroup.N.HostGroupId | String | No | None

A reserved parameter. You do not need to specify this parameter.

HostGroup.N.Comment | String | No | None

A reserved parameter. You do not need to specify this parameter.

HostGroup.N.CreateType | String | No | None

The creation type of the host group. Valid values:

  • ON-DEMAND: The host group is created on demand.
  • MANUAL: The host group is created manually.
HostGroup.N.ChargeType | String | No | PostPaid

The billing method of the instance. Valid values:

  • PostPaid: pay-as-you-go
  • PrePaid: subscription
HostGroup.N.Period | Integer | No | 2

The subscription period. Valid values: 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 24, and 36. Unit: months. This parameter is required when HostGroup.N.ChargeType is set to PrePaid.

HostGroup.N.AutoRenew | Boolean | No | false

Specifies whether to enable auto-renewal for the instances in host group N. Valid values:

  • true: enables auto-renewal.
  • false: manual renewal.
HostGroup.N.VSwitchId | String | No | vsw-bp10tvjyc77psy0z5****

The vSwitch ID of host group N.

HostGroup.N.GpuDriver | String | No | cuda9

The name of the GPU driver.

HostGroup.N.PrivatePoolOptionsMatchCriteria | String | No | Target

The type of the private pool to use to create the instance. A private pool is generated when an elasticity assurance or a capacity reservation takes effect. You can select a private pool when you create an instance. Valid values:

  • Open: open private pool. The system selects a matching open private pool to create the instance. If no matching private pools are found, resources in the public pool are used. In this mode, you do not need to set the HostGroup.N.PrivatePoolOptionsId parameter.
  • Target: specified private pool. The system uses the capacity in a specified private pool to create the instance. If the specified private pool is unavailable, the instance cannot be created. In this mode, you must specify the ID of the private pool. The HostGroup.N.PrivatePoolOptionsId parameter is required.
  • None: no private pool. The capacity in private pools is not used.
HostGroup.N.PrivatePoolOptionsId | String | No | crp-bp1e4wcvoucrish*****

The ID of the private pool to use to create the instance. The ID of a private pool is the same as that of the elasticity assurance or capacity reservation for which the private pool is generated.
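
The relationship between HostGroup.N.PrivatePoolOptionsMatchCriteria and HostGroup.N.PrivatePoolOptionsId can be verified before the request is sent. This is a minimal sketch of that rule only; the function name `check_private_pool` is hypothetical.

```python
from typing import Optional

def check_private_pool(match_criteria: str, pool_id: Optional[str] = None) -> None:
    """Raise ValueError if the combination contradicts the private pool rules."""
    if match_criteria not in ("Open", "Target", "None"):
        raise ValueError(f"unknown match criteria: {match_criteria!r}")
    # Target mode uses one specific pool, so its ID must be supplied.
    if match_criteria == "Target" and not pool_id:
        raise ValueError("PrivatePoolOptionsId is required when the criteria is Target")
```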

BootstrapAction.N.Arg | String | No | test1 test2

The arguments of the bootstrap action script. Separate multiple arguments with spaces, for example, test1 test2.

For example, if the script contains mkdir /root/$1;mkdir /root/$2; and the arguments are test1 test2, folders named test1 and test2 are created in the /root directory after the script is successfully run.

BootstrapAction.N.ExecutionTarget | String | No | core_group

The execution scope of the script. Valid values:

  • An empty value: The script is run on the entire cluster.
  • The name of a host group: The script is run only on that host group.
BootstrapAction.N.ExecutionMoment | String | No | BEFORE_INSTALL

The time at which the script is run. Valid values:

  • BEFORE_INSTALL: The script is run before the cluster services are installed.
  • AFTER_STARTED: The script is run after the cluster services are started.
BootstrapAction.N.ExecutionFailStrategy | String | No | FAILED_BLOCKED

The execution policy for script failures. Valid values:

  • FAILED_BLOCKED: Execution stops after a failure, and the failure must be handled manually.
  • FAILED_CONTINUE: Execution continues after a failure.
UseCustomHiveMetaDB | Boolean | No | false

A reserved parameter. You do not need to specify this parameter.

InitCustomHiveMetaDB | Boolean | No | false

A reserved parameter. You do not need to specify this parameter.

Config.N.Encrypt | String | No | 0

A reserved parameter. You do not need to specify this parameter.

Config.N.Replace | String | No | 0

A reserved parameter. You do not need to specify this parameter.

Configurations | String | No | 0

A reserved parameter. You do not need to specify this parameter.

EasEnable | Boolean | No | false

Specifies whether to enable high security for the cluster. Valid values:

  • true: enables high security.
  • false: disables high security.
RelatedClusterId | String | No | C-D7958B72E59B****

The ID of the primary cluster with which the current cluster is associated. Specify this parameter if the current cluster is a gateway cluster.

WhiteListType | String | No | IP

The type of the whitelist. Valid values:

  • IP: IP address-based whitelist
  • SecurityGroup: security group-based whitelist
AuthorizeContent | String | No | None

A reserved parameter. You do not need to specify this parameter.

Tag.N.Key | String | No | Dept

The tag key of the EMR cluster and its ECS instances. You can specify 1 to 20 tag keys. The tag key cannot be an empty string. It can be up to 128 characters in length, cannot start with aliyun or acs:, and cannot contain http:// or https://.

Tag.N.Value | String | No | DevIT

The tag value of the EMR cluster and its ECS instances. You can specify 1 to 20 tag values. The tag value can be an empty string. It can be up to 128 characters in length, cannot start with acs:, and cannot contain http:// or https://.
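
The tag constraints can be collected into one client-side check. This is a minimal sketch; the function name `validate_tags` is hypothetical, and it assumes that the reserved prefixes are matched case-sensitively, which this page does not specify.

```python
from typing import Dict, List

def validate_tags(tags: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the tags pass these checks."""
    problems = []
    if len(tags) > 20:
        problems.append("more than 20 tags")
    for key, value in tags.items():
        if not 1 <= len(key) <= 128:
            problems.append(f"key length out of range: {key!r}")
        if len(value) > 128:
            problems.append(f"value too long: {value!r}")
        # Assumption: prefix checks are case-sensitive.
        if key.startswith(("aliyun", "acs:")):
            problems.append(f"key has a reserved prefix: {key!r}")
        if value.startswith("acs:"):
            problems.append(f"value has a reserved prefix: {value!r}")
        for text in (key, value):
            if "http://" in text or "https://" in text:
                problems.append(f"URL scheme not allowed: {text!r}")
    return problems
```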

ResourceGroupId | String | No | rg-bp67acfmxazb4p****

The ID of the resource group to which the EMR cluster belongs.

ClientToken | String | No | 123e4567-e89b-12d3-a456-42665544****

The client token that is used to ensure the idempotence of the request. You can use the client to generate a client token. Make sure that a unique client token is used for each request.
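
One common way to generate a client token on the client is a random UUID, which matches the shape of the example value. This is a sketch of that approach, not a requirement of the API; the helper name `new_client_token` is hypothetical.

```python
import uuid

def new_client_token() -> str:
    """Return a random UUID string to use as the ClientToken of one request."""
    return str(uuid.uuid4())
```

To keep a retried request idempotent, reuse the token that was generated for the original attempt instead of generating a new one.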

Response parameters

Parameter | Type | Example | Description
ClusterId | String | C-D7958B72E59B****

The ID of the cluster.

CoreOrderId | String | None

The order ID of the core node.

EmrOrderId | String | None

The order ID of the EMR cluster.

MasterOrderId | String | None

The order ID of the master node.

RequestId | String | BF4FBAC6-B03E-4BFB-B6DB-EB53C34F2E22

The request ID.

Examples

Sample requests

http(s)://[Endpoint]/?Action=CreateClusterV2
&BootstrapAction.1.Name=name
&BootstrapAction.1.Path=oss://bucket/path
&ClusterType=HADOOP
&Config.1.ConfigKey=fs.trash.interval
&Config.1.ConfigValue=60
&Config.1.FileName=yarn-site
&Config.1.ServiceName=YARN
&EmrVer=EMR-3.15.0
&HostGroup.1.HostGroupType=MASTER
&HostGroup.1.InstanceType=ecs.mn4.2xlarge
&HostGroup.1.NodeCount=2
&Name=bi_hadoop
&RegionId=cn-hangzhou
&UserInfo.1.Password=pwd
&UserInfo.1.UserId=12345
&UserInfo.1.UserName=tom
&<Common request parameters>
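
The indexed parameter names in the sample request (for example, HostGroup.1.NodeCount) can be produced from nested data instead of being written by hand. The following is a minimal sketch that uses only the Python standard library; the helper name `flatten` is hypothetical, and the sketch does not add the common request parameters or the signature.

```python
from urllib.parse import urlencode

def flatten(prefix, value, out):
    """Flatten nested dicts and lists into Name.N.Field keys (N starts at 1)."""
    if isinstance(value, dict):
        for key, item in value.items():
            flatten(f"{prefix}.{key}" if prefix else key, item, out)
    elif isinstance(value, list):
        for index, item in enumerate(value, start=1):
            flatten(f"{prefix}.{index}", item, out)
    elif isinstance(value, bool):
        # Boolean parameters are written as lowercase true/false on this page.
        out[prefix] = "true" if value else "false"
    else:
        out[prefix] = str(value)
    return out

params = flatten("", {
    "Action": "CreateClusterV2",
    "ClusterType": "HADOOP",
    "HostGroup": [{"HostGroupType": "MASTER", "NodeCount": 2}],
    "IsOpenPublicIp": True,
}, {})
query = urlencode(params)
```

Lists of plain strings, such as OptionSoftWareList, flatten to OptionSoftWareList.1, OptionSoftWareList.2, and so on.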

Sample success responses

XML format

<ClusterId>C-4DE6DA872B0E****</ClusterId>
<RequestId>F4DE89FB-7054-475C-B7E2-B9A38152DA7E</RequestId>

JSON format

{
    "ClusterId": "C-4DE6DA872B0E****",
    "RequestId": "F4DE89FB-7054-475C-B7E2-B9A38152DA7E"
}
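
A JSON response like the one above can be read with the standard library. This is a minimal sketch; it assumes that the order ID fields may be absent, as they are in the sample response, and therefore reads them with a default.

```python
import json

# The sample success response from this page.
body = '{"ClusterId": "C-4DE6DA872B0E****", "RequestId": "F4DE89FB-7054-475C-B7E2-B9A38152DA7E"}'

resp = json.loads(body)
cluster_id = resp["ClusterId"]
request_id = resp["RequestId"]
# Assumption: order IDs are optional, so missing keys yield None.
core_order_id = resp.get("CoreOrderId")
```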