This topic describes how to deploy Apache Spark on a single Elastic Compute Service (ECS) instance by creating a stack in the Resource Orchestration Service (ROS) console.

Background information

Apache Spark is a general-purpose computing engine designed for large-scale data processing. Spark is written in Scala and uses resilient distributed datasets (RDDs) for in-memory computing. It supports interactive queries and, because intermediate results can stay in memory, is well suited to workloads that run iterative algorithms.

The Installs Spark on an ECS instance (existing VPC) sample template in ROS helps you create an ECS instance based on existing resources, such as the virtual private cloud (VPC), vSwitch, and security group, and associate an elastic IP address (EIP) with the instance. The following software versions are used in the sample template:

  • JDK 1.8.0: the Java Development Kit (JDK).
  • Hadoop 2.7.7: the framework for distributed systems.
  • Scala 2.12.1: the programming language.
  • Apache Spark 2.1.0: the computing engine.

After a stack is created by using the sample template, you can obtain the URL of the Spark web interface and use the URL to log on to the Spark management console. To access the Spark web interface over the Internet, add inbound rules to the security group that allow traffic on ports 8088 and 8080. For more information, see Add a security group rule.
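The two inbound rules mentioned above can be sketched as plain data before you enter them in the console. This is a minimal illustration, not the console's or the SDK's actual interface: the field names (SecurityGroupId, IpProtocol, PortRange, SourceCidrIp) are assumed to mirror the ECS security group rule parameters, the helper name build_inbound_rules is hypothetical, and the source CIDR block 203.0.113.0/24 is a placeholder that you would replace with your own address range.

```python
# Ports named in this topic for the web interfaces.
SPARK_WEB_PORTS = (8088, 8080)


def build_inbound_rules(security_group_id, source_cidr, ports=SPARK_WEB_PORTS):
    """Return one TCP inbound rule description per port.

    The dictionary keys are assumptions modeled on ECS security group
    rule parameters; verify them against the API reference before use.
    """
    rules = []
    for port in ports:
        rules.append({
            "SecurityGroupId": security_group_id,
            "IpProtocol": "tcp",
            # A single port is written as "start/end" with equal values.
            "PortRange": f"{port}/{port}",
            "SourceCidrIp": source_cidr,
        })
    return rules


if __name__ == "__main__":
    # Placeholder security group ID and CIDR block, for illustration only.
    for rule in build_inbound_rules("sg-bp15ed6xe1yxeycg7o****", "203.0.113.0/24"):
        print(rule)
```

Restricting the source CIDR block to a known address range, rather than opening the ports to all addresses, limits who can reach the web interface.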

Step 1: Create a stack

  1. Log on to the ROS console.
  2. In the left-side navigation pane, choose Templates > Sample Templates.
  3. Find the Installs Spark on an ECS instance (existing VPC) template.
  4. Click Create Stack.
  5. In the Configure Template Parameters step of the Use New Resources (Standard) wizard, configure Stack Name and the following parameters.
    Parameter: Existing VPC ID
    Description: The ID of the VPC. For more information about how to create and query a VPC, see Create and manage a VPC.
    Example: vpc-bp1m6fww66xbntjyc****

    Parameter: VSwitch Zone ID
    Description: The zone ID of the vSwitch in the VPC.
    Example: Hangzhou Zone K

    Parameter: VSwitch ID
    Description: The ID of the vSwitch in the VPC. For more information about how to create and query a vSwitch, see Create and manage a vSwitch.
    Example: vsw-bp183p93qs667muql****

    Parameter: Business Security Group ID
    Description: The ID of the ECS security group. For more information about how to query the ID of a security group, see Query security groups.
    Example: sg-bp15ed6xe1yxeycg7o****

    Parameter: Instance Type
    Description: The ECS instance type. Use a valid instance type. For more information, see Overview of instance families.
    Example: ecs.c5.large

    Parameter: Image ID
    Description: The image ID of the ECS instance. By default, centos_7 is used. For more information, see Image overview.
    Example: centos_7

    Parameter: Instance Password
    Description: The password that is used to log on to the ECS instance.
    Example: Test_12****

    Parameter: Public IP Bandwidth
    Description: The public network bandwidth. Valid values: 1 to 100. Unit: Mbit/s.
    Example: 5

    Parameter: Disk Type
    Description: Valid values:
      • cloud_efficiency: ultra disk
      • cloud_ssd: standard SSD
      • cloud_essd: enhanced SSD (ESSD)
      • cloud: basic disk
      • ephemeral_ssd: local SSD
    For more information, see Disks.
    Example: cloud_efficiency

    Parameter: System Disk Space
    Description: The system disk size of the ECS instance. Valid values: 40 to 500. Unit: GB.
    Example: 40
  6. Click Create.
  7. Click the name of the created stack. On the page that appears, click the Stack Information tab to view the stack status. After the stack is created, click the Outputs tab to view the URL of the Spark web interface.
  8. Use the URL to log on to the Spark management console.
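
Before you click Create, it can help to check the values you plan to enter against the ranges listed for the template parameters. The following is a minimal sketch of such a check. The dictionary keys (public_ip_bandwidth, disk_type, system_disk_space) and the function name validate_parameters are hypothetical names chosen for this example and are not the template's actual parameter keys.

```python
# Valid disk type values as listed for the Disk Type parameter.
VALID_DISK_TYPES = {
    "cloud_efficiency",  # ultra disk
    "cloud_ssd",         # standard SSD
    "cloud_essd",        # enhanced SSD (ESSD)
    "cloud",             # basic disk
    "ephemeral_ssd",     # local SSD
}


def validate_parameters(params):
    """Return a list of error messages; an empty list means all checks passed.

    The keys read from `params` are hypothetical names for this sketch.
    """
    errors = []

    bandwidth = params.get("public_ip_bandwidth")
    if not isinstance(bandwidth, int) or not 1 <= bandwidth <= 100:
        errors.append("Public IP Bandwidth must be an integer from 1 to 100 (Mbit/s).")

    disk_type = params.get("disk_type")
    if disk_type not in VALID_DISK_TYPES:
        errors.append("Disk Type must be one of: " + ", ".join(sorted(VALID_DISK_TYPES)))

    disk_size = params.get("system_disk_space")
    if not isinstance(disk_size, int) or not 40 <= disk_size <= 500:
        errors.append("System Disk Space must be an integer from 40 to 500 (GB).")

    return errors


if __name__ == "__main__":
    # Example values taken from the parameter descriptions in this topic.
    example = {
        "public_ip_bandwidth": 5,
        "disk_type": "cloud_efficiency",
        "system_disk_space": 40,
    }
    print(validate_parameters(example) or "All checks passed.")
```

The check covers only the ranges stated in this topic. The ROS console may apply additional validation, for example on the instance type or the password format.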

Step 2: View resources

  1. Log on to the ROS console.
  2. In the left-side navigation pane, click Stacks.
  3. On the Stacks page, click the name of the stack that you created.
  4. On the page that appears, click the Resources tab to view resources.
    The following table describes the resource in this example.
    Resource: ALIYUN::ECS::Instance
    Quantity: 1
    Description: Creates an ECS instance to deploy Apache Spark.
    Specifications:
      • Quantity: 1.
      • Instance type: ecs.c5.large.
      • Disk category: ultra disk.
      • System disk size: 40 GB.
      • Public IP: A public IP address is allocated.
    Note: For more information about the resource charges, see the pricing schedule on the official website or the product pricing documentation.
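
To show how the resource above might be declared in a template, the following sketch builds a template fragment as a Python dictionary and prints it as JSON. It is an illustration under stated assumptions, not the content of the sample template: the logical name SparkInstance is hypothetical, the property names (InstanceType, SystemDiskCategory, SystemDiskSize, AllocatePublicIP) are assumed to match the ALIYUN::ECS::Instance resource type and should be verified against the ROS resource type reference, and the real sample template is expected to contain further properties and installation logic that are not shown here.

```python
import json

# Hypothetical template fragment that mirrors the specifications
# described for the ECS instance in this topic.
template = {
    "ROSTemplateFormatVersion": "2015-09-01",
    "Resources": {
        "SparkInstance": {  # hypothetical logical resource name
            "Type": "ALIYUN::ECS::Instance",
            "Properties": {
                # Property names are assumptions; verify before use.
                "InstanceType": "ecs.c5.large",
                "SystemDiskCategory": "cloud_efficiency",  # ultra disk
                "SystemDiskSize": 40,                      # GB
                "AllocatePublicIP": True,
            },
        }
    },
}

if __name__ == "__main__":
    print(json.dumps(template, indent=2))
```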