This topic describes how to deploy a Spark cluster by creating a stack in the Resource Orchestration Service (ROS) console.

Background information

Apache Spark is a general-purpose computing engine designed for large-scale data processing. Spark is implemented in Scala and uses resilient distributed datasets (RDDs) for in-memory computing. It supports interactive queries and speeds up iterative algorithms by keeping working data in memory.

The Deploy a Spark Cluster in an Existing VPC sample template creates multiple Elastic Compute Service (ECS) instances based on existing resources such as a virtual private cloud (VPC), a vSwitch, and a security group. One of the created ECS instances is associated with an elastic IP address (EIP) to serve as a management node. Other ECS instances are managed by using Auto Scaling. The following software versions are used in the sample template:

  • JDK: 1.8.0
  • Hadoop: 2.7.7
  • Scala: 2.12.1
  • Spark: 2.1.0

After a stack is created by using the sample template, you can obtain the URL of the Spark web interface and log on to the Spark management console. To access the Spark web interface over the Internet, add an inbound rule to the security group that allows traffic on port 8080. For more information, see Add a security group rule.
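
To confirm that the inbound rule took effect, you can test whether port 8080 accepts TCP connections from your machine. The following is a minimal sketch that uses only the Python standard library. The function names `web_ui_endpoint` and `is_port_open` are hypothetical helpers written for this example and are not part of ROS or Spark.

```python
import socket
from urllib.parse import urlsplit


def web_ui_endpoint(url, default_port=8080):
    """Return (host, port) for a Spark web interface URL.

    Falls back to default_port (8080, the port named in this topic)
    when the URL does not specify a port.
    """
    parts = urlsplit(url)
    if not parts.hostname:
        raise ValueError(f"no host found in URL: {url!r}")
    return parts.hostname, parts.port or default_port


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `is_port_open(*web_ui_endpoint(url))`, where `url` is the value shown on the Outputs tab of your stack, returns False if the security group rule is missing or the instance cannot be reached. A True result only shows that the port accepts connections; it does not verify that the Spark web interface itself is healthy.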

Step 1: Create a stack

  1. Log on to the ROS console.
  2. In the left-side navigation pane, click Solution Center.
  3. Find the Deploy a Spark Cluster in an Existing VPC template.
  4. Click Create Stack.
  5. In the Configure Template Parameters step, set Stack Name and the following parameters.
    Parameter | Description | Example
    Existing VPC Instance ID | The ID of the VPC. For more information about how to create and query a VPC, see Create and manage a VPC. | vpc-bp1m6fww66xbntjyc****
    VSwitch Zone ID | The zone ID of the vSwitch in the VPC. | Hangzhou Zone K
    VSwitch ID | The ID of the vSwitch in the VPC. For more information about how to create and query a vSwitch, see Create and manage a vSwitch. | vsw-bp183p93qs667muql****
    Business Security Group ID | The ID of the ECS security group. For more information about how to query security groups, see Query security groups. | sg-bp15ed6xe1yxeycg7o****
    Instance Type | The instance type of the ECS instance. Select a valid instance type. For more information, see Overview of instance families. | ecs.c5.large
    Instance Password | The password that is used to log on to the ECS instance. | Test_12****
    Public IP Bandwidth | The public IP bandwidth. Unit: Mbit/s. | 5
    Disk Type | The disk category of the ECS instance. Valid values: cloud_efficiency (ultra disk) and cloud_ssd (standard SSD). For more information, see Disks. | cloud_efficiency
    System Disk Space | The system disk size of the ECS instance. Valid values: 20 to 500. Unit: GB. | 40
    Instance Amount | The number of ECS instances in the Spark cluster. Valid values: 3 to 10. | 3
  6. Click Create.
  7. View the stack status on the Stack Information tab of the stack management page. After the stack is created, click the Outputs tab to view the URL of the Spark web interface.
  8. Use the URL to log on to the Spark management console.
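
Before you click Create, it can help to check your planned values against the valid ranges listed in step 5. The following sketch shows one way to do that in Python. The dictionary keys (`disk_type`, `system_disk_size`, `instance_amount`) and the function `validate_parameters` are hypothetical names chosen for this example. They are not the parameter names used inside the ROS template, and the console performs its own validation.

```python
# Valid values taken from the parameter descriptions in step 5.
VALID_DISK_TYPES = {"cloud_efficiency", "cloud_ssd"}
SYSTEM_DISK_RANGE_GB = (20, 500)
INSTANCE_AMOUNT_RANGE = (3, 10)


def _in_range(value, bounds):
    """Return True if value is an integer within the inclusive bounds."""
    low, high = bounds
    return isinstance(value, int) and low <= value <= high


def validate_parameters(params):
    """Return a list of problems found in a dict of planned values.

    The keys are hypothetical names for this sketch, not ROS parameter names.
    An empty list means every checked value is within the documented range.
    """
    problems = []
    if params.get("disk_type") not in VALID_DISK_TYPES:
        problems.append("disk_type must be cloud_efficiency or cloud_ssd")
    if not _in_range(params.get("system_disk_size"), SYSTEM_DISK_RANGE_GB):
        problems.append("system_disk_size must be 20 to 500 (GB)")
    if not _in_range(params.get("instance_amount"), INSTANCE_AMOUNT_RANGE):
        problems.append("instance_amount must be 3 to 10")
    return problems
```

With the example values from step 5 (`cloud_efficiency`, `40`, and `3`), the function returns an empty list. The sketch checks only the three parameters that have documented value ranges; IDs, the instance type, and the password are not checked.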

Step 2: View resources

  1. Log on to the ROS console.
  2. In the left-side navigation pane, click Stacks.
  3. On the Stacks page, click the stack that you created.
  4. On the stack management page, click the Resources tab to view the resource list.
    The following table describes the resources in this example.
    Resource type | Quantity | Description | Specifications
    ALIYUN::ECS::Instance | 1 | Creates an ECS instance to deploy the Spark primary service. | A single instance of the following specifications is created: InstanceType: ecs.c5.large; SystemDiskCategory: cloud_efficiency; SystemDiskSize: 40; AllocatePublicIP: true
    ALIYUN::ESS::ScalingGroup | 2 | Creates two scaling groups to deploy the Spark secondary service. Scaling groups scale computing resources based on the scaling rules that you set to meet your business requirements. | Two instances of the following specifications are created: InstanceType: ecs.c5.large; SystemDiskCategory: cloud_efficiency; SystemDiskSize: 40; AllocatePublicIP: true
    ALIYUN::RAM::Role | 1 | Creates a Resource Access Management (RAM) role to issue a Security Token Service (STS) token that is valid for a limited period. This is a more secure method to grant access permissions. | None
    ALIYUN::VPC::EIP | 1 | Creates an EIP and associates it with an ECS instance so that the instance can be accessed over the Internet. | None
    ALIYUN::OOS::Template | 2 | Creates two Operation Orchestration Service (OOS) templates to create lifecycle hooks. For more information about lifecycle hooks, see Lifecycle hooks. | None
    Note: For more information about the resource charges, see the pricing schedule on the official website or the product pricing documentation.
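
If you want to compare the resource list of your stack with the quantities in the resource table, a small tally script is one option. The following sketch assumes that you have copied the resource type of each row on the Resources tab into a Python list by hand. It does not call any ROS API, and the name `resource_mismatches` is a hypothetical helper for this example.

```python
from collections import Counter

# Expected quantities, copied from the resource table in this topic.
EXPECTED_RESOURCES = {
    "ALIYUN::ECS::Instance": 1,
    "ALIYUN::ESS::ScalingGroup": 2,
    "ALIYUN::RAM::Role": 1,
    "ALIYUN::VPC::EIP": 1,
    "ALIYUN::OOS::Template": 2,
}


def resource_mismatches(resource_types):
    """Map each resource type whose count differs to (expected, actual).

    resource_types is a list with one resource type string per resource.
    An empty result means the counts match the table.
    """
    actual = Counter(resource_types)
    mismatches = {}
    for rtype in sorted(set(EXPECTED_RESOURCES) | set(actual)):
        expected = EXPECTED_RESOURCES.get(rtype, 0)
        if actual[rtype] != expected:
            mismatches[rtype] = (expected, actual[rtype])
    return mismatches
```

Resource types that appear in your list but not in the table are reported with an expected count of 0, which makes unexpected resources easy to spot.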