Resource Orchestration Service: ALIYUN::EMR::Cluster2

Last Updated: Dec 13, 2023

ALIYUN::EMR::Cluster2 is used to create an E-MapReduce (EMR) cluster.

Note

ALIYUN::EMR::Cluster supports the EMR API of the previous version and is available for clusters of the HADOOP, KAFKA, DRUID, ZOOKEEPER, DATA_SCIENCE, and GATEWAY types. ALIYUN::EMR::Cluster2 supports the EMR API of the new version (2021-03-20) and is available for clusters of the DATALAKE, OLAP, DATAFLOW, and DATASERVING types. We recommend that you use ALIYUN::EMR::Cluster2 to create clusters of these types.

Syntax

{
  "Type": "ALIYUN::EMR::Cluster2",
  "Properties": {
    "Applications": List,
    "ResourceGroupId": String,
    "ApplicationConfigs": List,
    "ClusterType": String,
    "NodeGroups": List,
    "ReleaseVersion": String,
    "BootstrapScripts": List,
    "SubscriptionConfig": Map,
    "DeployMode": String,
    "SecurityMode": String,
    "NodeAttributes": Map,
    "ClusterName": String,
    "PaymentType": String,
    "Tags": List
  }
}

Properties

| Property | Type | Required | Editable | Description | Constraint |
|---|---|---|---|---|---|
| Applications | List | Yes | No | The applications that you want to add to the cluster. | You can add up to 100 applications to the cluster. For more information, see Applications property. |
| ResourceGroupId | String | No | No | The ID of the resource group. | None. |
| ApplicationConfigs | List | No | No | The configurations of the applications. | You can add up to 1,000 application configurations. For more information, see ApplicationConfigs properties. |
| ClusterType | String | Yes | No | The cluster type. | Valid values: DATALAKE (data lake), OLAP (online analytical processing), DATAFLOW (Dataflow), and DATASERVING (DataServing). |
| NodeGroups | List | Yes | No | The array of configurations of the node groups. | You can add up to 100 node group configurations. For more information, see NodeGroups properties. |
| ReleaseVersion | String | Yes | No | The version of EMR. | None. |
| BootstrapScripts | List | No | No | The array of bootstrap action scripts. | You can add up to 10 bootstrap action scripts. For more information, see BootstrapScripts properties. |
| SubscriptionConfig | Map | No | No | The subscription configurations. | This property must be specified when PaymentType is set to Subscription. |
| DeployMode | String | No | No | The deployment mode of applications in the cluster. | Valid values: NORMAL (regular mode; one master node is deployed in the cluster) and HA (high availability mode; at least three master nodes are deployed in the cluster). |
| SecurityMode | String | No | No | The security mode of the cluster. | Valid values: NORMAL (regular mode; Kerberos is not enabled) and KERBEROS (Kerberos mode; Kerberos is enabled). |
| NodeAttributes | Map | Yes | No | The basic attributes of all Elastic Compute Service (ECS) nodes in the cluster. | For more information, see NodeAttributes properties. |
| ClusterName | String | Yes | No | The cluster name. | The name must be 1 to 128 characters in length, and cannot start with http:// or https://. It can contain letters, digits, colons (:), underscores (_), periods (.), and hyphens (-). |
| PaymentType | String | No | No | The billing method. | Valid values: PayAsYouGo (pay-as-you-go) and Subscription (subscription). |
| Tags | List | No | No | The tags of the cluster. | You can add up to 20 custom tags. For more information, see Tags properties. |

Applications syntax

"Applications": [
  {
    "ApplicationName": String
  }
]

Applications property

| Property | Type | Required | Editable | Description | Constraint |
|---|---|---|---|---|---|
| ApplicationName | String | Yes | No | The application name. | None. |
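
As a rough illustration, an Applications list could look like the following sketch. The application names are placeholders chosen for this example and are not taken from this page; the names that are actually accepted depend on the cluster type and the EMR release version.

```json
{
  "Applications": [
    { "ApplicationName": "HDFS" },
    { "ApplicationName": "YARN" },
    { "ApplicationName": "SPARK3" }
  ]
}
```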

ApplicationConfigs syntax

"ApplicationConfigs": [
  {
    "ConfigFileName": String,
    "ApplicationName": String,
    "ConfigItemKey": String,
    "NodeGroupName": String,
    "NodeGroupId": String,
    "ConfigScope": String,
    "ConfigItemValue": String
  }
]

ApplicationConfigs properties

| Property | Type | Required | Editable | Description | Constraint |
|---|---|---|---|---|---|
| ConfigFileName | String | No | No | The name of the configuration file. | None. |
| ApplicationName | String | Yes | No | The application name. | None. |
| ConfigItemKey | String | No | No | The name of the configuration item. | None. |
| NodeGroupName | String | No | No | The name of the node group. | This property takes effect when ConfigScope is set to NODE_GROUP and NodeGroupId is left empty. |
| NodeGroupId | String | No | No | The ID of the node group. | This property takes effect when ConfigScope is set to NODE_GROUP. Note: NodeGroupId takes precedence over NodeGroupName. |
| ConfigScope | String | No | No | The level at which you want to apply the configurations. | Valid values: CLUSTER (default; cluster level) and NODE_GROUP (node group level). |
| ConfigItemValue | String | No | No | The value of the configuration item. | None. |
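
The following sketch shows how the two ConfigScope values might be combined: one item applied at the cluster level and one applied to a single node group by name. The application names, file names, configuration keys, values, and the node group name task-group are all assumptions made for illustration; check the names that your EMR release actually uses before relying on them.

```json
{
  "ApplicationConfigs": [
    {
      "ApplicationName": "HDFS",
      "ConfigFileName": "hdfs-site.xml",
      "ConfigItemKey": "dfs.replication",
      "ConfigItemValue": "2",
      "ConfigScope": "CLUSTER"
    },
    {
      "ApplicationName": "YARN",
      "ConfigFileName": "yarn-site.xml",
      "ConfigItemKey": "yarn.nodemanager.resource.memory-mb",
      "ConfigItemValue": "16384",
      "ConfigScope": "NODE_GROUP",
      "NodeGroupName": "task-group"
    }
  ]
}
```

The second item sets NodeGroupName and leaves NodeGroupId out. A node group has no ID before the cluster is created, and NodeGroupName takes effect only when NodeGroupId is empty.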

NodeGroups syntax

"NodeGroups": [
  {
    "WithPublicIp": Boolean,
    "SpotInstanceRemedy": Boolean,
    "NodeCount": Number,
    "NodeGroupName": String,
    "DataDisks": List,
    "VSwitchIds": List,
    "SpotBidPrices": List,
    "NodeResizeStrategy": String,
    "SystemDisk": Map,
    "NodeGroupType": String,
    "InstanceTypes": List,
    "AdditionalSecurityGroupIds": List,
    "CostOptimizedConfig": Map,
    "GracefulShutdown": Boolean,
    "DeploymentSetStrategy": String,
    "SpotStrategy": String
  }
]

NodeGroups properties

| Property | Type | Required | Editable | Description | Constraint |
|---|---|---|---|---|---|
| WithPublicIp | Boolean | No | No | Specifies whether to assign a public IP address. | Valid values: true and false (default). |
| SpotInstanceRemedy | Boolean | No | No | Specifies whether to supplement preemptible instances. If this feature is enabled and the system receives a message indicating that an existing preemptible instance is about to be reclaimed, the system attempts to create an instance in the scaling group to replace it. | Valid values: true and false (default). |
| NodeCount | Number | No | No | The number of nodes. | Valid values: 1 to 1000. |
| NodeGroupName | String | No | No | The name of the node group. | The name can be up to 128 characters in length. The name of a node group must be unique in a cluster. |
| DataDisks | List | No | No | The configurations of the data disks. | For more information, see DataDisks properties. |
| VSwitchIds | List | No | No | The IDs of the vSwitches. | None. |
| SpotBidPrices | List | No | No | The bid prices for preemptible instances. | This property takes effect when SpotStrategy is set to SpotWithPriceLimit. Note: You can specify up to 100 bid prices. For more information, see SpotBidPrices properties. |
| NodeResizeStrategy | String | No | No | The scaling policy for nodes. | Valid values: COST_OPTIMIZED (cost optimization policy) and PRIORITY (default; priority-based policy). |
| SystemDisk | Map | No | No | The configurations of the system disk. | For more information, see SystemDisk properties. |
| NodeGroupType | String | Yes | No | The type of the node group. | Valid values: MASTER, CORE, and TASK. |
| InstanceTypes | List | Yes | No | The instance types of the nodes. | You can add up to 100 instance types. |
| AdditionalSecurityGroupIds | List | No | No | The additional security groups. | Additional security groups are security groups that are added to a node group instead of the cluster. You can add up to two additional security groups to a node group. |
| CostOptimizedConfig | Map | No | No | The configurations of the cost optimization policy. | For more information, see CostOptimizedConfig properties. |
| GracefulShutdown | Boolean | No | No | Specifies whether to enable graceful shutdown for components in the node group. | Valid values: true and false (default). |
| DeploymentSetStrategy | String | No | No | The deployment set policy. | Valid values: NONE (default; does not use deployment sets), CLUSTER (uses deployment sets at the cluster level), and NODE_GROUP (uses deployment sets at the node group level). |
| SpotStrategy | String | No | No | The bidding policy for pay-as-you-go instances. | Valid values: NoSpot (default; the instance is created as a regular pay-as-you-go instance), SpotWithPriceLimit (the instance is created as a preemptible instance that has a user-defined maximum hourly price), and SpotAsPriceGo (the instance is created as a preemptible instance whose bid price is based on the market price at the time of purchase; the market price can be up to the pay-as-you-go price). |
DataDisks syntax

"DataDisks": [
  {
    "Category": String,
    "PerformanceLevel": String,
    "Size": Number,
    "Count": Number
  }
]

DataDisks properties

| Property | Type | Required | Editable | Description | Constraint |
|---|---|---|---|---|---|
| Category | String | Yes | No | The disk category. | None. |
| PerformanceLevel | String | No | No | The performance level (PL) of the enhanced SSD (ESSD) that you want to use as the data disk. | Valid values: PL0 (up to 10,000 random read/write IOPS), PL1 (default; up to 50,000 random read/write IOPS), PL2 (up to 100,000 random read/write IOPS), and PL3 (up to 1,000,000 random read/write IOPS). |
| Size | Number | Yes | No | The disk size. | None. |
| Count | Number | No | No | The number of data disks on a node. | None. |
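
A DataDisks entry could be filled in as sketched below. The category name cloud_essd is an assumption borrowed from ECS disk naming, and the size and count are arbitrary. With these values, each node in the group would get four data disks of size 200 each.

```json
{
  "DataDisks": [
    {
      "Category": "cloud_essd",
      "PerformanceLevel": "PL1",
      "Size": 200,
      "Count": 4
    }
  ]
}
```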

SpotBidPrices syntax

"SpotBidPrices": [
  {
    "BidPrice": Number,
    "InstanceType": String
  }
]

SpotBidPrices properties

| Property | Type | Required | Editable | Description | Constraint |
|---|---|---|---|---|---|
| BidPrice | Number | No | No | The maximum hourly bid price of the instance. | The value of this property can contain up to three decimal places. This property takes effect when SpotStrategy is set to SpotWithPriceLimit. |
| InstanceType | String | No | No | The ECS instance type. | None. |
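
Because SpotBidPrices takes effect only when SpotStrategy is set to SpotWithPriceLimit, the two properties are usually set together on the same node group. The sketch below assumes a task node group with two hypothetical instance types; the prices are arbitrary and only show the limit of three decimal places.

```json
{
  "NodeGroups": [
    {
      "NodeGroupType": "TASK",
      "NodeGroupName": "task-spot",
      "NodeCount": 2,
      "InstanceTypes": ["ecs.g7.xlarge", "ecs.g7.2xlarge"],
      "SpotStrategy": "SpotWithPriceLimit",
      "SpotBidPrices": [
        { "InstanceType": "ecs.g7.xlarge", "BidPrice": 0.125 },
        { "InstanceType": "ecs.g7.2xlarge", "BidPrice": 0.25 }
      ]
    }
  ]
}
```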

SystemDisk syntax

"SystemDisk": {
  "Category": String,
  "PerformanceLevel": String,
  "Size": Number,
  "Count": Number
}

SystemDisk properties

| Property | Type | Required | Editable | Description | Constraint |
|---|---|---|---|---|---|
| Category | String | Yes | No | The disk category. | None. |
| PerformanceLevel | String | No | No | The PL of the ESSD that you want to use as the system disk. | Valid values: PL0 (up to 10,000 random read/write IOPS), PL1 (default; up to 50,000 random read/write IOPS), PL2 (up to 100,000 random read/write IOPS), and PL3 (up to 1,000,000 random read/write IOPS). |
| Size | Number | Yes | No | The disk size. | Valid values: 20 to 500. |
| Count | Number | No | No | The number of system disks on a node. | Default value: 1. |

CostOptimizedConfig syntax

"CostOptimizedConfig": {
  "OnDemandBaseCapacity": Number,
  "OnDemandPercentageAboveBaseCapacity": Number,
  "SpotInstancePools": Number
}

CostOptimizedConfig properties

| Property | Type | Required | Editable | Description | Constraint |
|---|---|---|---|---|---|
| OnDemandBaseCapacity | Number | Yes | No | The minimum number of pay-as-you-go instances that are required in node groups. | None. |
| OnDemandPercentageAboveBaseCapacity | Number | Yes | No | The percentage of pay-as-you-go instances in the extra instances when the limit that is specified by OnDemandBaseCapacity is reached. | Valid values: 0 to 100. |
| SpotInstancePools | Number | Yes | No | The number of available instance types. | None. |
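
The following sketch shows one way the cost optimization properties might be combined on a node group. The group name, instance types, and the choice of SpotAsPriceGo are assumptions for illustration; this page does not state which SpotStrategy value must accompany COST_OPTIMIZED.

```json
{
  "NodeGroups": [
    {
      "NodeGroupType": "TASK",
      "NodeGroupName": "task-mixed",
      "NodeCount": 10,
      "InstanceTypes": ["ecs.g7.xlarge", "ecs.g7.2xlarge"],
      "NodeResizeStrategy": "COST_OPTIMIZED",
      "SpotStrategy": "SpotAsPriceGo",
      "CostOptimizedConfig": {
        "OnDemandBaseCapacity": 2,
        "OnDemandPercentageAboveBaseCapacity": 50,
        "SpotInstancePools": 2
      }
    }
  ]
}
```

Reading the property descriptions literally, a group of 10 nodes with these values would keep 2 pay-as-you-go nodes as the base. Half of the remaining 8 nodes would also be pay-as-you-go, which gives 2 + 4 = 6 pay-as-you-go nodes and 4 preemptible nodes. Treat this arithmetic as an interpretation of the descriptions above, not as a guaranteed allocation.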

BootstrapScripts syntax

"BootstrapScripts": [
  {
    "ScriptPath": String,
    "ScriptArgs": String,
    "ExecutionFailStrategy": String,
    "Priority": Number,
    "ScriptName": String,
    "ExecutionMoment": String,
    "NodeSelector": Map
  }
]

BootstrapScripts properties

| Property | Type | Required | Editable | Description | Constraint |
|---|---|---|---|---|---|
| ScriptPath | String | Yes | No | The Object Storage Service (OSS) path in which the script is stored. | The path must start with oss://. |
| ScriptArgs | String | No | No | The runtime parameters of the script. | None. |
| ExecutionFailStrategy | String | No | No | The policy that you want to use to handle execution failures of the script. | Valid values: FAILED_CONTINUE (after the script fails to be executed, the system continues to perform the creation or scaling operation on the cluster) and FAILED_BLOCK (after the script fails to be executed, the system stops performing the creation or scaling operation on the cluster). |
| Priority | Number | No | No | The priority of the script. | Valid values: 1 to 100. |
| ScriptName | String | Yes | No | The script name. | The name must be 1 to 64 characters in length, and can contain letters, digits, underscores (_), and hyphens (-). It must start with a letter and cannot start with http:// or https://. |
| ExecutionMoment | String | No | No | The point in time at which the system executes the script. | Valid values: BEFORE_INSTALL (the system executes the script before the application is installed) and AFTER_STARTED (the system executes the script after the application is started). |
| NodeSelector | Map | Yes | No | The configurations of the node selector. | For more information, see NodeSelector properties. |

NodeSelector syntax

"NodeSelector": {
  "NodeGroupTypes": List,
  "NodeGroupName": String,
  "NodeGroupId": String,
  "NodeSelectType": String,
  "NodeNames": List
}

NodeSelector properties

| Property | Type | Required | Editable | Description | Constraint |
|---|---|---|---|---|---|
| NodeGroupTypes | List | No | No | The types of the node groups. | Valid values: MASTER, CORE, and TASK. |
| NodeGroupName | String | No | No | The name of the node group. | This property takes effect when NodeSelectType is set to NODE_GROUP and NodeGroupId is left empty. |
| NodeGroupId | String | No | No | The ID of the node group. | This property takes effect when NodeSelectType is set to NODE_GROUP. |
| NodeSelectType | String | Yes | No | The level at which you want to select nodes. | Valid values: CLUSTER (cluster level), NODE_GROUP (node group level), and NODE (node level). |
| NodeNames | List | No | No | The names of the nodes. | This property takes effect when NodeSelectType is set to NODE. |
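
As an illustration of how a bootstrap script and its node selector fit together, the sketch below runs a script on the nodes of a single node group before applications are installed. The bucket name, script path, arguments, and the node group name core-group are hypothetical. NodeSelectType uses NODE_GROUP, which is the spelling in the list of valid values for that property.

```json
{
  "BootstrapScripts": [
    {
      "ScriptName": "install-tools",
      "ScriptPath": "oss://example-bucket/scripts/install-tools.sh",
      "ScriptArgs": "--verbose",
      "Priority": 1,
      "ExecutionMoment": "BEFORE_INSTALL",
      "ExecutionFailStrategy": "FAILED_BLOCK",
      "NodeSelector": {
        "NodeSelectType": "NODE_GROUP",
        "NodeGroupName": "core-group"
      }
    }
  ]
}
```

With FAILED_BLOCK, a failure of the script stops the creation or scaling operation, so FAILED_CONTINUE may be the safer choice while a script is still being tested.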

SubscriptionConfig syntax

"SubscriptionConfig": {
  "AutoRenewDurationUnit": String,
  "AutoRenew": Boolean,
  "PaymentDurationUnit": String,
  "PaymentDuration": Number,
  "AutoRenewDuration": Number
}

SubscriptionConfig properties

| Property | Type | Required | Editable | Description | Constraint |
|---|---|---|---|---|---|
| AutoRenewDurationUnit | String | No | No | The unit of the auto-renewal duration. | Set the value to Month. |
| AutoRenew | Boolean | No | No | Specifies whether to enable auto-renewal. | Valid values: true and false (default). |
| PaymentDurationUnit | String | No | No | The unit of the subscription duration. | Set the value to Month. |
| PaymentDuration | Number | No | No | The subscription duration. | Valid values when PaymentDurationUnit is set to Month: 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 24, 36, 48, and 60. |
| AutoRenewDuration | Number | No | No | The auto-renewal duration. | This property takes effect when AutoRenew is set to true. Valid values when AutoRenewDurationUnit is set to Month: 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 24, 36, 48, and 60. |
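
SubscriptionConfig is needed only when PaymentType is set to Subscription, so the two properties are shown together in the sketch below. The durations are arbitrary picks from the valid values listed above: a 12-month subscription that renews one month at a time.

```json
{
  "PaymentType": "Subscription",
  "SubscriptionConfig": {
    "PaymentDurationUnit": "Month",
    "PaymentDuration": 12,
    "AutoRenew": true,
    "AutoRenewDurationUnit": "Month",
    "AutoRenewDuration": 1
  }
}
```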

NodeAttributes syntax

"NodeAttributes": {
  "KeyPairName": String,
  "VpcId": String,
  "ZoneId": String,
  "SecurityGroupId": String,
  "RamRole": String,
  "MasterRootPassword": String
}

NodeAttributes properties

| Property | Type | Required | Editable | Description | Constraint |
|---|---|---|---|---|---|
| KeyPairName | String | No | No | The SSH key pair that you want to use to log on to the ECS instance. | None. |
| VpcId | String | Yes | No | The virtual private cloud (VPC) ID. | None. |
| ZoneId | String | Yes | No | The zone ID. | None. |
| SecurityGroupId | String | Yes | No | The ID of the security group. | EMR supports only basic security groups. EMR does not support advanced security groups. |
| RamRole | String | No | No | The Resource Access Management (RAM) role that you want to attach to EMR to access other Alibaba Cloud resources from ECS. | Default value: AliyunECSInstanceForEMRRole. |
| MasterRootPassword | String | No | No | The root password of the master node. | None. |
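
A NodeAttributes map might be filled in as follows. The VPC ID, zone ID, security group ID, and key pair name are placeholders assumed for this sketch; replace them with resources that exist in your account. RamRole is set to the documented default only to show where it goes.

```json
{
  "NodeAttributes": {
    "VpcId": "vpc-example",
    "ZoneId": "cn-hangzhou-i",
    "SecurityGroupId": "sg-example",
    "KeyPairName": "my-key-pair",
    "RamRole": "AliyunECSInstanceForEMRRole"
  }
}
```

This sketch uses a key pair for logon and leaves MasterRootPassword out. Whether the two can be specified together is not stated on this page.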

Tags syntax

"Tags": [
  {
    "Value": String,
    "Key": String
  }
]

Tags properties

| Property | Type | Required | Editable | Description | Constraint |
|---|---|---|---|---|---|
| Value | String | No | No | The tag value. | This property is optional and can be an empty string. The tag value can be up to 128 characters in length, and cannot contain http:// or https://. It cannot start with acs:. |
| Key | String | Yes | No | The tag key. | This property is required and cannot be an empty string. The tag key can be up to 128 characters in length, and cannot contain http:// or https://. It cannot start with aliyun or acs:. |
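
For example, a Tags list could look like the sketch below. The keys and values are made up for illustration. The second tag has an empty value, which the constraints above allow.

```json
{
  "Tags": [
    { "Key": "environment", "Value": "test" },
    { "Key": "owner", "Value": "" }
  ]
}
```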

Return values

Fn::GetAtt

  • ClusterId: the cluster ID.

  • ApplicationLinks: the links of applications in the cluster.
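
Both return values can be exposed through the Outputs section of a template. The sketch below assumes that the cluster resource has the logical name Cluster; that name is an assumption and must match the name used in the Resources section of your template.

```json
{
  "Outputs": {
    "ClusterId": {
      "Description": "The cluster ID.",
      "Value": { "Fn::GetAtt": ["Cluster", "ClusterId"] }
    },
    "ApplicationLinks": {
      "Description": "The links of applications in the cluster.",
      "Value": { "Fn::GetAtt": ["Cluster", "ApplicationLinks"] }
    }
  }
}
```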

Examples

YAML format

ROSTemplateFormatVersion: '2015-09-01'
Parameters:
  Applications:
    Type: Json
    Description: 'The list of applications. You can specify 1 to 100 applications.'
  ClusterType:
    Type: String
    Description: |-
      The cluster type. Valid values:
      DATALAKE: the new data lake.
      OLAP: data analysis.
      DATAFLOW: real-time data streaming.
      DATASERVING: data serving.
      HADOOP: the previous data lake (not recommended; use DATALAKE instead).
    AllowedValues:
      - DATAFLOW
      - DATALAKE
      - DATASERVING
      - HADOOP
      - OLAP
  NodeGroups:
    Type: Json
    Description: 'The array of node group configurations. You can specify 1 to 100 node groups.'
    MinLength: 1
    MaxLength: 100
  ReleaseVersion:
    Type: String
    Description: The EMR release version. You can view the available release versions in EMR.
  NodeAttributes:
    Type: Json
    Description: The node attributes. These are the basic attributes of all ECS nodes in the cluster.
  ClusterName:
    Type: String
    Description: 'The cluster name. The name must be 1 to 128 characters in length, must start with a letter or Chinese character, and cannot start with http:// or https://. It can contain Chinese characters, letters, digits, colons (:), underscores (_), periods (.), and hyphens (-).'
Resources:
  Cluster:
    Type: ALIYUN::EMR::Cluster2
    Properties:
      Applications:
        Ref: Applications
      ClusterType:
        Ref: ClusterType
      NodeGroups:
        Ref: NodeGroups
      ReleaseVersion:
        Ref: ReleaseVersion
      NodeAttributes:
        Ref: NodeAttributes
      ClusterName:
        Ref: ClusterName
Outputs:
  ClusterId:
    Description: Cluster ID.
    Value:
      Fn::GetAtt:
        - Cluster
        - ClusterId

JSON format

{
  "ROSTemplateFormatVersion": "2015-09-01",
  "Parameters": {
    "Applications": {
      "Type": "Json",
      "Description": "The list of applications. You can specify 1 to 100 applications."
    },
    "ClusterType": {
      "Type": "String",
      "Description": "The cluster type. Valid values:\nDATALAKE: the new data lake.\nOLAP: data analysis.\nDATAFLOW: real-time data streaming.\nDATASERVING: data serving.\nHADOOP: the previous data lake (not recommended; use DATALAKE instead).",
      "AllowedValues": [
        "DATAFLOW",
        "DATALAKE",
        "DATASERVING",
        "HADOOP",
        "OLAP"
      ]
    },
    "NodeGroups": {
      "Type": "Json",
      "Description": "The array of node group configurations. You can specify 1 to 100 node groups.",
      "MinLength": 1,
      "MaxLength": 100
    },
    "ReleaseVersion": {
      "Type": "String",
      "Description": "The EMR release version. You can view the available release versions in EMR."
    },
    "NodeAttributes": {
      "Type": "Json",
      "Description": "The node attributes. These are the basic attributes of all ECS nodes in the cluster."
    },
    "ClusterName": {
      "Type": "String",
      "Description": "The cluster name. The name must be 1 to 128 characters in length, must start with a letter or Chinese character, and cannot start with http:// or https://. It can contain Chinese characters, letters, digits, colons (:), underscores (_), periods (.), and hyphens (-)."
    }
  },
  "Resources": {
    "Cluster": {
      "Type": "ALIYUN::EMR::Cluster2",
      "Properties": {
        "Applications": {
          "Ref": "Applications"
        },
        "ClusterType": {
          "Ref": "ClusterType"
        },
        "NodeGroups": {
          "Ref": "NodeGroups"
        },
        "ReleaseVersion": {
          "Ref": "ReleaseVersion"
        },
        "NodeAttributes": {
          "Ref": "NodeAttributes"
        },
        "ClusterName": {
          "Ref": "ClusterName"
        }
      }
    }
  },
  "Outputs": {
    "ClusterId": {
      "Description": "Cluster ID.",
      "Value": {
        "Fn::GetAtt": [
          "Cluster",
          "ClusterId"
        ]
      }
    }
  }
}