Resource Orchestration Service: ALIYUN::EMR::Cluster

Last Updated: Jul 05, 2023

ALIYUN::EMR::Cluster is used to create an E-MapReduce (EMR) cluster.

Note

ALIYUN::EMR::Cluster uses the previous version of the EMR API and supports clusters of the HADOOP, KAFKA, DRUID, ZOOKEEPER, DATA_SCIENCE, and GATEWAY types. ALIYUN::EMR::Cluster2 uses the new version of the EMR API (2021-03-20) and supports clusters of the DATALAKE, OLAP, DATAFLOW, and DATASERVING types. We recommend that you use ALIYUN::EMR::Cluster2 to create clusters of the DATALAKE, OLAP, DATAFLOW, and DATASERVING types.

Syntax

{
  "Type": "ALIYUN::EMR::Cluster",
  "Properties": {
    "SshEnable": Boolean,
    "EasEnable": Boolean,
    "WhiteListType": String,
    "InitCustomHiveMetaDB": Boolean,
    "IoOptimized": Boolean,
    "HostGroup": List,
    "Config": List,
    "KeyPairName": String,
    "VpcId": String,
    "AutoRenew": Boolean,
    "RelatedClusterId": String,
    "BootstrapAction": List,
    "InstanceGeneration": String,
    "DepositType": String,
    "VSwitchId": String,
    "NetType": String,
    "UserDefinedEmrEcsRole": String,
    "Name": String,
    "ClusterType": String,
    "ZoneId": String,
    "IsOpenPublicIp": Boolean,
    "OptionSoftWareList": List,
    "Configurations": String,
    "MasterPwd": String,
    "MachineType": String,
    "EmrVer": String,
    "SecurityGroupName": String,
    "MetaStoreConf": String,
    "SecurityGroupId": String,
    "LogPath": String,
    "Period": Integer,
    "HighAvailabilityEnable": Boolean,
    "UseCustomHiveMetaDB": Boolean,
    "UserInfo": List,
    "ChargeType": String,
    "MetaStoreType": String,
    "AuthorizeContent": String,
    "UseLocalMetaDb": Boolean,
    "ClickHouseConf": Map,
    "ResourceGroupId": String,
    "Tags": List
  }
}

Properties

| Property | Type | Required | Editable | Description | Constraint |
| --- | --- | --- | --- | --- | --- |
| SshEnable | Boolean | No | No | Specifies whether to enable SSH. | Valid values: true and false. |
| EasEnable | Boolean | No | No | Specifies whether the cluster is a high-security cluster. | Valid values: true and false. |
| WhiteListType | String | No | No | The type of the whitelist. | Valid values: IP (IP address) and SecurityGroup (security group). |
| InitCustomHiveMetaDB | Boolean | No | No | A reserved property. You do not need to specify this property. | None. |
| IoOptimized | Boolean | No | No | Specifies whether to enable I/O optimization. | Valid values: true (default) and false. |
| HostGroup | List | Yes | No | The node groups in the cluster. | For more information, see HostGroup properties. |
| Config | List | No | No | Details of the custom configuration items. | For more information, see Config properties. |
| KeyPairName | String | No | No | The name of the key pair. | None. |
| VpcId | String | No | No | The ID of the virtual private cloud (VPC). | None. |
| AutoRenew | Boolean | No | No | Specifies whether to enable auto-renewal for subscription clusters. | Valid values: true and false. |
| RelatedClusterId | String | No | No | The ID of the EMR cluster that is associated with the gateway cluster. | This property takes effect when ClusterType is set to GATEWAY. |
| BootstrapAction | List | No | No | Details of the bootstrap actions that you want to configure for the cluster. | For more information, see BootstrapAction properties. |
| InstanceGeneration | String | No | No | The Elastic Compute Service (ECS) instance family. | None. |
| VSwitchId | String | No | No | The ID of the vSwitch. | None. |
| NetType | String | Yes | No | The type of the network. | Set the value to VPC. |
| UserDefinedEmrEcsRole | String | No | No | The ECS application role that allows internal access from ECS to other Alibaba Cloud services, such as Object Storage Service (OSS). | None. |
| Name | String | Yes | Yes | The name of the cluster. | The name must be 1 to 64 characters in length, and can contain letters, digits, hyphens (-), and underscores (_). |
| ClusterType | String | Yes | No | The type of the cluster. | Valid values: HADOOP, KAFKA, DRUID, ZOOKEEPER, DATA_SCIENCE, and GATEWAY. |
| ZoneId | String | Yes | No | The ID of the zone. | None. |
| IsOpenPublicIp | Boolean | No | No | Specifies whether to use public IP addresses. | Valid values: true (uses public IP addresses; if you use public IP addresses, the default bandwidth is 8 Mbit/s) and false (does not use public IP addresses). |
| OptionSoftWareList | List | No | No | The list of available software. | None. |
| Configurations | String | No | No | A reserved property. You do not need to specify this property. | None. |
| MasterPwd | String | No | No | The SSH password that is used to access the master node. | The password must be 8 to 30 characters in length. The password must contain at least three of the following character types: uppercase letters, lowercase letters, digits, and special characters. |
| MachineType | String | No | No | The type of the node. | None. |
| EmrVer | String | Yes | No | The version of EMR. | None. |
| SecurityGroupName | String | No | No | The name of the security group. | If you do not specify SecurityGroupId, the system creates a new security group based on the value of SecurityGroupName. After the cluster is created, you can query the ID of the security group on the cluster details page. The default security group policy is applied to the security group. The default policy allows inbound traffic only on port 22 and outbound traffic on all ports. |
| DepositType | String | No | No | The hosting type of the cluster. | None. |
| SecurityGroupId | String | No | No | The ID of the security group. | If you want to use an existing security group, the default security group policy is applied to the security group. The default policy allows inbound traffic only on port 22 and outbound traffic on all ports. |
| LogPath | String | No | No | The OSS path in which you want to store EMR logs. | None. |
| Period | Integer | No | No | The subscription duration of the cluster. | This property must be specified when ChargeType is set to PrePaid. Valid values: 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 24, and 36. Unit: month. |
| HighAvailabilityEnable | Boolean | No | No | Specifies whether to enable high availability for the cluster. | Valid values: true (enables high availability; if you want to enable high availability for a cluster, make sure that the cluster contains at least two master nodes) and false (disables high availability). |
| UseCustomHiveMetaDB | Boolean | No | No | A reserved property. You do not need to specify this property. | None. |
| UserInfo | List | No | No | The information about the Knox account. | For more information, see UserInfo properties. |
| ChargeType | String | Yes | No | The billing method of the cluster. | Valid values: PostPaid and PrePaid. |
| AuthorizeContent | String | No | No | A reserved property. You do not need to specify this property. | None. |
| UseLocalMetaDb | Boolean | Yes | No | Specifies whether to use the built-in MySQL database of the cluster as the Hive metadatabase. | Valid values: true and false. |
| MetaStoreConf | String | No | No | The metadata configurations. | This property must be specified when MetaStoreType is set to user_rds. Specify MetaStoreConf in the following format: {"dbUrl":"jdbc:mysql://xxxxxx", "dbUserName":"username", "dbPassword":"password"}. |
| MetaStoreType | String | No | No | The type of the metadata. | Valid values: local (built-in MySQL database of the cluster), dlf (Data Lake Formation (DLF) metadata service), and user_rds (self-managed ApsaraDB RDS service). |
| ClickHouseConf | Map | No | No | The configurations of the ApsaraDB for ClickHouse cluster. | None. |
| ResourceGroupId | String | No | Yes | The ID of the resource group. | None. |
| Tags | List | No | Yes | The tags that you want to add to the cluster. | For more information, see Tags properties. |
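
Two of the constraints above are conditional: Period is required when ChargeType is PrePaid, and MetaStoreConf is required when MetaStoreType is user_rds. The following Properties fragment is a minimal sketch of how these might be combined, not a verified configuration. The connection values are the placeholders from the documented MetaStoreConf format, the 12-month duration is an arbitrary pick from the valid Period values, and setting UseLocalMetaDb to false alongside user_rds is an assumption. Because MetaStoreConf is a String, the JSON is wrapped in single quotes so that YAML does not parse it as a map.

```yaml
# Fragment of the Properties section of an ALIYUN::EMR::Cluster resource.
# Other required properties (Name, ClusterType, ZoneId, HostGroup, NetType, EmrVer) are omitted.
ChargeType: PrePaid
Period: 12             # Required because ChargeType is PrePaid. Unit: month.
AutoRenew: true
UseLocalMetaDb: false  # Assumption: the built-in MySQL database is not used with user_rds.
MetaStoreType: user_rds
# Placeholder values copied from the documented format. Replace them with real ones.
MetaStoreConf: '{"dbUrl":"jdbc:mysql://xxxxxx", "dbUserName":"username", "dbPassword":"password"}'
```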

Tags syntax

"Tags": [
  {
    "Value": String,
    "Key": String
  }
]

Tags properties

| Property | Type | Required | Editable | Description | Constraint |
| --- | --- | --- | --- | --- | --- |
| Key | String | Yes | No | The key of the tag. | The tag key must be 1 to 128 characters in length, and cannot contain http:// or https://. The tag key cannot start with aliyun or acs:. |
| Value | String | No | No | The value of the tag. | The tag value can be up to 128 characters in length, and cannot contain http:// or https://. The tag value cannot start with aliyun or acs:. |
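
As an illustration of the Tags syntax, the fragment below attaches two tags to a cluster. The keys and values are hypothetical and were chosen only to satisfy the constraints above (no http:// or https://, no aliyun or acs: prefix). The second entry omits Value, which the table marks as optional.

```yaml
# Fragment of the Properties section of an ALIYUN::EMR::Cluster resource.
Tags:
  - Key: environment   # hypothetical tag key
    Value: test        # hypothetical tag value
  - Key: owner         # hypothetical tag with a key only
```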

HostGroup syntax

"HostGroup": [
  {
    "Comment": String,
    "SysDiskType": String,
    "DiskCapacity": Integer,
    "NodeCount": Integer,
    "ClusterId": String,
    "DiskCount": Integer,
    "CreateType": String,
    "DiskType": String,
    "AutoRenew": Boolean,
    "HostGroupType": String,
    "SysDiskCapacity": Integer,
    "VSwitchId": String,
    "ChargeType": String,
    "Period": Integer,
    "HostKeyPairName": String,
    "HostPassword": String,
    "HostGroupId": String,
    "InstanceType": String,
    "GpuDriver": String,
    "HostGroupName": String
  }
]

HostGroup properties

| Property | Type | Required | Editable | Description | Constraint |
| --- | --- | --- | --- | --- | --- |
| Comment | String | No | No | A reserved property. You do not need to specify this property. | None. |
| SysDiskType | String | Yes | No | The system disk category of the node group. | Valid values: CLOUD_EFFICIENCY (ultra disk), CLOUD_SSD (standard SSD), and CLOUD (basic disk). |
| DiskCapacity | Integer | Yes | No | The data disk capacity of the node group. | Unit: GB. |
| NodeCount | Integer | Yes | No | The number of nodes in the node group. | None. |
| ClusterId | String | No | No | A reserved property. You do not need to specify this property. | None. |
| DiskCount | Integer | Yes | No | The number of data disks that you want to attach to the node group. | None. |
| CreateType | String | No | No | The mode based on which the node group is created. | Valid values: ON-DEMAND (on-demand creation) and MANUAL (manual creation). |
| DiskType | String | Yes | No | The data disk category of the node group. | Valid values: CLOUD_EFFICIENCY (ultra disk), CLOUD_SSD (standard SSD), LOCAL_DISK (local disk; if you use an ECS instance, set DiskType to LOCAL_DISK), and CLOUD (basic disk). |
| AutoRenew | Boolean | No | No | Specifies whether to enable auto-renewal for subscription clusters. | Valid values: true and false. |
| HostGroupType | String | Yes | No | The type of the node group. | Valid values: MASTER (master node group), CORE (core node group), and TASK (task node group). |
| SysDiskCapacity | Integer | Yes | No | The system disk capacity of the node group. | Unit: GB. |
| VSwitchId | String | No | No | The ID of the vSwitch. | None. |
| ChargeType | String | Yes | No | The billing method. | Valid values: PostPaid and PrePaid. |
| Period | Integer | No | No | The subscription duration. | This property must be specified when ChargeType is set to PrePaid. Valid values: 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 24, and 36. Unit: month. |
| HostKeyPairName | String | No | No | The name of the key pair that is used to access the node group. | This property takes effect when ClusterType is set to GATEWAY. |
| HostPassword | String | No | No | The password that is used to access the node. | This property takes effect when ClusterType is set to GATEWAY. |
| HostGroupId | String | No | No | A reserved property. You do not need to specify this property. | None. |
| InstanceType | String | Yes | No | The instance type. | None. |
| GpuDriver | String | No | No | The GPU driver. | None. |
| HostGroupName | String | No | No | The name of the node group. | None. |
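
To show how the required HostGroup properties fit together with the subscription settings, here is a sketch of a single core node group billed as PrePaid. The instance type and disk sizes are borrowed from the Examples section of this topic. The group name and the 12-month Period are illustrative assumptions, and whether a given instance type is available depends on the region and zone.

```yaml
# Fragment of the Properties section of an ALIYUN::EMR::Cluster resource.
HostGroup:
  - HostGroupType: CORE
    HostGroupName: core-group    # hypothetical name
    NodeCount: 2
    InstanceType: ecs.g5.xlarge
    SysDiskType: CLOUD_SSD
    SysDiskCapacity: 120         # Unit: GB
    DiskType: CLOUD_SSD
    DiskCount: 4                 # Data disks attached to the node group
    DiskCapacity: 80             # Unit: GB
    ChargeType: PrePaid
    Period: 12                   # Required because ChargeType is PrePaid. Unit: month.
    AutoRenew: true
```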

Config syntax

"Config": [
  {
    "Encrypt": String,
    "ConfigKey": String,
    "FileName": String,
    "ServiceName": String,
    "Replace": String,
    "ConfigValue": String
  }
]

Config properties

| Property | Type | Required | Editable | Description | Constraint |
| --- | --- | --- | --- | --- | --- |
| Encrypt | String | No | No | A reserved property. You do not need to specify this property. | None. |
| ConfigKey | String | No | No | The name of the custom configuration item. | None. |
| FileName | String | No | No | The name of the file that contains the custom configuration item. | None. |
| ServiceName | String | No | No | The name of the service to which the custom configuration item belongs. | None. |
| Replace | String | No | No | A reserved property. You do not need to specify this property. | None. |
| ConfigValue | String | No | No | The value of the custom configuration item. | None. |
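
A custom configuration item is identified by the service, the file, and the key, and carries one value. The sketch below overrides the HDFS replication factor. dfs.replication is a standard Hadoop setting, but the exact ServiceName and FileName strings that EMR expects are assumptions here and should be checked against the EMR console for your cluster version.

```yaml
# Fragment of the Properties section of an ALIYUN::EMR::Cluster resource.
Config:
  - ServiceName: HDFS            # assumed service name
    FileName: hdfs-site          # assumed configuration file name
    ConfigKey: dfs.replication
    ConfigValue: '2'             # Quoted because ConfigValue is a String
```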

BootstrapAction syntax

"BootstrapAction": [
  {
    "Path": String,
    "Name": String,
    "Arg": String
  }
]

BootstrapAction properties

| Property | Type | Required | Editable | Description | Constraint |
| --- | --- | --- | --- | --- | --- |
| Path | String | No | No | The OSS path in which you want to store the bootstrap action script. | None. |
| Name | String | No | No | The name of the bootstrap action. | None. |
| Arg | String | No | No | The parameter of the bootstrap action. | None. |
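
For orientation, a bootstrap action entry might look like the following. The bucket, script path, action name, and argument are all hypothetical. The oss:// form is assumed from the description of Path as an OSS path.

```yaml
# Fragment of the Properties section of an ALIYUN::EMR::Cluster resource.
BootstrapAction:
  - Name: install-tools                            # hypothetical action name
    Path: oss://example-bucket/scripts/install.sh  # hypothetical OSS path
    Arg: '--verbose'                               # hypothetical script argument
```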

UserInfo syntax

"UserInfo": [
  {
    "UserName": String,
    "Password": String,
    "UserId": String
  }
]

UserInfo properties

| Property | Type | Required | Editable | Description | Constraint |
| --- | --- | --- | --- | --- | --- |
| UserName | String | No | No | The username of the Knox account. | None. |
| Password | String | No | No | The password of the Knox account. | None. |
| UserId | String | No | No | The RAM user ID of the Knox account. | None. |
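
The following sketch shows one Knox account entry. The user ID and username are made up, and KnoxPassword is a hypothetical template parameter that you would have to declare in the Parameters section. Referencing a parameter keeps the password out of the template body.

```yaml
# Fragment of the Properties section of an ALIYUN::EMR::Cluster resource.
UserInfo:
  - UserId: '1234567890123456'   # hypothetical RAM user ID, quoted so that it stays a String
    UserName: knox-user          # hypothetical username
    Password:
      Ref: KnoxPassword          # hypothetical parameter declared elsewhere in the template
```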

Return values

Fn::GetAtt

  • ClusterId: the ID of the cluster.

  • HostGroups: the node groups in the cluster.

  • MasterNodePubIps: the public IP addresses of master nodes in the cluster.

  • MasterNodeInnerIps: the private IP addresses of master nodes in the cluster.
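
Any of these attributes can be exposed as a stack output with Fn::GetAtt. The sketch below assumes the cluster resource is named EmrCluster, as in the Examples section of this topic. The output names are arbitrary.

```yaml
Outputs:
  MasterNodeInnerIps:
    Description: The private IP addresses of master nodes in the cluster.
    Value:
      Fn::GetAtt:
        - EmrCluster
        - MasterNodeInnerIps
  HostGroups:
    Description: The node groups in the cluster.
    Value:
      Fn::GetAtt:
        - EmrCluster
        - HostGroups
```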

Examples

YAML format

ROSTemplateFormatVersion: '2015-09-01'
Description: Test EMR Cluster
Parameters:
  VpcId:
    AssociationProperty: ALIYUN::ECS::VPC::VPCId
    Type: String
    Label:
      en: Existing VPC Instance ID
  ZoneId:
    AssociationProperty: ALIYUN::ECS::ZoneId
    Type: String
    Label:
      en: VSwitch Zone ID
  VSwitchId:
    AssociationProperty: ALIYUN::ECS::VSwitch::VSwitchId
    AssociationPropertyMetadata:
      VpcId: ${VpcId}
      ZoneId: ${ZoneId}
    Type: String
    Label:
      en: VSwitch ID
  SecurityGroupId:
    AssociationProperty: ALIYUN::ECS::SecurityGroup::SecurityGroupId
    AssociationPropertyMetadata:
      VpcId: ${VpcId}
    Type: String
    Label:
      en: Business Security Group ID
  ClusterDiskType:
    Type: String
    Default: CLOUD_SSD
Resources:
  EmrCluster:
    Type: ALIYUN::EMR::Cluster
    Properties:
      UseLocalMetaDb: false
      IoOptimized: true
      ZoneId:
        Ref: ZoneId
      VSwitchId:
        Ref: VSwitchId
      SecurityGroupId:
        Ref: SecurityGroupId
      HostGroup:
        - DiskType: CLOUD_SSD
          HostGroupType: MASTER
          DiskCount: 1
          DiskCapacity: 80
          NodeCount: 1
          SysDiskType: CLOUD_SSD
          ChargeType: PostPaid
          VSwitchId:
            Ref: VSwitchId
          AutoRenew: false
          Period: 1
          SysDiskCapacity: 120
          InstanceType: ecs.g5.xlarge
        - DiskType: CLOUD_SSD
          HostGroupType: CORE
          DiskCount: 4
          DiskCapacity: 80
          NodeCount: 2
          SysDiskType: CLOUD_SSD
          ChargeType: PostPaid
          VSwitchId:
            Ref: VSwitchId
          AutoRenew: false
          Period: 1
          SysDiskCapacity: 120
          InstanceType: ecs.g5.xlarge
      EmrVer: EMR-3.22.4
      ClusterType: HADOOP
      Name:
        Fn::Join:
          - '-'
          - - StackId
            - Ref: ALIYUN::StackId
      MasterPwd: Admin123!
      VpcId:
        Ref: VpcId
      ChargeType: PostPaid
      NetType: vpc
Outputs:
  ClusterId:
    Description: The ID of the cluster.
    Value:
      Fn::GetAtt:
        - EmrCluster
        - ClusterId

JSON format

{
  "ROSTemplateFormatVersion": "2015-09-01",
  "Description": "Test EMR Cluster",
  "Parameters": {
    "VpcId": {
      "AssociationProperty": "ALIYUN::ECS::VPC::VPCId",
      "Type": "String",
      "Label": {
        "en": "Existing VPC Instance ID"
      }
    },
    "ZoneId": {
      "AssociationProperty": "ALIYUN::ECS::ZoneId",
      "Type": "String",
      "Label": {
        "en": "VSwitch Zone ID"
      }
    },
    "VSwitchId": {
      "AssociationProperty": "ALIYUN::ECS::VSwitch::VSwitchId",
      "AssociationPropertyMetadata": {
        "VpcId": "${VpcId}",
        "ZoneId": "${ZoneId}"
      },
      "Type": "String",
      "Label": {
        "en": "VSwitch ID"
      }
    },
    "SecurityGroupId": {
      "AssociationProperty": "ALIYUN::ECS::SecurityGroup::SecurityGroupId",
      "AssociationPropertyMetadata": {
        "VpcId": "${VpcId}"
      },
      "Type": "String",
      "Label": {
        "en": "Business Security Group ID"
      }
    },
    "ClusterDiskType": {
      "Type": "String",
      "Default": "CLOUD_SSD"
    }
  },
  "Resources": {
    "EmrCluster": {
      "Type": "ALIYUN::EMR::Cluster",
      "Properties": {
        "UseLocalMetaDb": false,
        "IoOptimized": true,
        "ZoneId": {
          "Ref": "ZoneId"
        },
        "VSwitchId": {
          "Ref": "VSwitchId"
        },
        "SecurityGroupId": {
          "Ref": "SecurityGroupId"
        },
        "HostGroup": [
          {
            "DiskType": "CLOUD_SSD",
            "HostGroupType": "MASTER",
            "DiskCount": 1,
            "DiskCapacity": 80,
            "NodeCount": 1,
            "SysDiskType": "CLOUD_SSD",
            "ChargeType": "PostPaid",
            "VSwitchId": {
              "Ref": "VSwitchId"
            },
            "AutoRenew": false,
            "Period": 1,
            "SysDiskCapacity": 120,
            "InstanceType": "ecs.g5.xlarge"
          },
          {
            "DiskType": "CLOUD_SSD",
            "HostGroupType": "CORE",
            "DiskCount": 4,
            "DiskCapacity": 80,
            "NodeCount": 2,
            "SysDiskType": "CLOUD_SSD",
            "ChargeType": "PostPaid",
            "VSwitchId": {
              "Ref": "VSwitchId"
            },
            "AutoRenew": false,
            "Period": 1,
            "SysDiskCapacity": 120,
            "InstanceType": "ecs.g5.xlarge"
          }
        ],
        "EmrVer": "EMR-3.22.4",
        "ClusterType": "HADOOP",
        "Name": {
          "Fn::Join": [
            "-",
            [
              "StackId",
              {
                "Ref": "ALIYUN::StackId"
              }
            ]
          ]
        },
        "MasterPwd": "Admin123!",
        "VpcId": {
          "Ref": "VpcId"
        },
        "ChargeType": "PostPaid",
        "NetType": "vpc"
      }
    }
  },
  "Outputs": {
    "ClusterId": {
      "Description": "The ID of the cluster.",
      "Value": {
        "Fn::GetAtt": [
          "EmrCluster",
          "ClusterId"
        ]
      }
    }
  }
}
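
The examples above hardcode MasterPwd in the template. One possible alternative, sketched here, is to pass the password and the cluster name as parameters and let ROS check the documented length limits before the stack is created. The parameter names and labels are hypothetical. The AllowedPattern is an assumption that treats "letters" as ASCII letters only, which may be stricter than what the service accepts. The length limits do not enforce the rule that the password must contain at least three character types.

```yaml
Parameters:
  MasterPwd:
    Type: String
    NoEcho: true        # Masks the value when the parameter is displayed
    MinLength: 8        # MasterPwd must be 8 to 30 characters in length
    MaxLength: 30
    Label:
      en: Master Node SSH Password
  ClusterName:
    Type: String
    MinLength: 1        # Name must be 1 to 64 characters in length
    MaxLength: 64
    AllowedPattern: '^[A-Za-z0-9_-]+$'   # Assumption: ASCII letters, digits, hyphens, underscores
    Label:
      en: Cluster Name
```

In the resource, the hardcoded values would then be replaced with references such as `MasterPwd: {Ref: MasterPwd}` and `Name: {Ref: ClusterName}`.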