The deep learning solution supports container clusters with Elastic Compute Service (ECS) instances or GPU instances. This document uses container clusters with GPU instances as an example.

Note
For information about how to create a container cluster with ECS instances, see Create a cluster.

Limits

  • Currently, Container Service only supports creating clusters with GN4 GPU instances in the following regions: China South 1 (Shenzhen), China East 2 (Shanghai), China North 2 (Beijing), and US West 1 (Silicon Valley).
  • Currently, GN4 GPU instances only support Virtual Private Cloud (VPC).

Prerequisites

Currently, Pay-As-You-Go GPU Compute Type GN4 instances must be activated before you can use them. To request activation, open an ECS ticket with the following content:

I want to activate the Pay-As-You-Go GPU Compute Type GN4 instances. Thank you!

Procedure

  1. Log on to the Container Service console.
  2. Click Swarm > Clusters in the left-side navigation pane and click Create Cluster in the upper-right corner.


  3. Complete the following configurations. In this example, create a cluster named EGS-cluster in the region China South 1 (Shenzhen).


    • Cluster Name: The name of the cluster to be created. The name must be 1–64 characters long and can contain numbers, Chinese characters, English letters, and hyphens (-).
      Note
      The cluster name must be unique under the same account and the same region.
    • Region: Select the region in which the cluster will be deployed. Select China South 1 (Shenzhen), China East 2 (Shanghai), China North 2 (Beijing), or US West 1 (Silicon Valley).
      Note
      Currently, Container Service only supports creating clusters with GN4 GPU instances in the following regions: China South 1 (Shenzhen), China East 2 (Shanghai), China North 2 (Beijing), and US West 1 (Silicon Valley).
    • Zone: Select the zone for the cluster.
      Note
      You can select the region and zone according to the distribution of your servers.
  4. Select VPC as the Network Type and complete the configurations.


    VPC enables you to build an isolated network environment on Alibaba Cloud. You have full control over your own virtual network, including choosing the IP address range, dividing Classless Inter-Domain Routing (CIDR) blocks, and configuring the route table and gateway.

    Specify a VPC, a VSwitchId, and the initial CIDR block of the containers, that is, the subnet CIDR block to which the Docker containers belong. For ease of IP management, the containers on each virtual machine belong to a different CIDR block, and the container subnet CIDR block cannot conflict with the virtual machine CIDR block.

    We recommend that you create a dedicated VPC and VSwitch for the container cluster to prevent issues such as network conflicts.

  5. Select whether to add nodes.


    You can create a cluster with several new instances, or create a zero-node cluster and then add existing instances to it. For information about how to add existing instances to the cluster, see Add an existing instance.

    • Add
      1. Select the operating system for the node.


        Currently, the supported operating systems are Ubuntu 14.04 64-bit and CentOS 7.4 64-bit.

      2. Configure the instance specifications.
        • Select Generation III as the Instance Generation, GPU Compute Type gn4 as the Instance Family, and 32-core, 48 GB (ecs.gn4.8xlarge) or 56-core, 96 GB (ecs.gn4.14xlarge) as the Instance Type.
          Note
          If you have been approved to use GN4 GPU instances but cannot find these two instance types, no resources are currently available for them. We recommend that you try again later or the next day.


        You can configure the instance quantity, data disk capacity (the GPU instance has a 20 GB system disk by default), and logon password.

        Note
        • If you select the Attach Data Disk check box, the data disk is mounted to the /var/lib/docker directory and used to store Docker images and containers.
        • For better performance and easier management, we recommend that you attach an independent data disk to the host and manage the persistent data in the container by using Docker volumes.
    • Do not Add

      You can click Add Existing Instance to add existing instances to the cluster now, or add them after the cluster is created by clicking Add Existing Instances on the Cluster List page.

  6. Select whether to configure a public Elastic IP (EIP).

    If you select VPC as the network type, Container Service configures an EIP for each instance in the VPC by default. If you do not need EIPs, select the Do not Configure Public EIP check box and then configure the SNAT gateway.



  7. Select whether to create a Server Load Balancer instance.


    The Automatically Create Server Load Balancer check box is selected by default. With this check box selected, an Internet Server Load Balancer instance is created after the cluster is created. You can access the container applications in the cluster by using this Server Load Balancer instance. This is a Pay-As-You-Go Server Load Balancer instance.

  8. Click Create Cluster.
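
Before you click Create Cluster, it can help to check the planned settings against the constraints described in steps 3 to 5. The following Python sketch is only an illustration and is not part of Container Service: the function name check_settings, its parameters, and the example CIDR blocks are hypothetical, the region and instance type lists are copied from this document and may change, and "Chinese characters" is approximated by the Unicode range U+4E00 to U+9FFF.

```python
import ipaddress
import re

# Values copied from this document; they may change over time.
SUPPORTED_REGIONS = {
    "China South 1 (Shenzhen)",
    "China East 2 (Shanghai)",
    "China North 2 (Beijing)",
    "US West 1 (Silicon Valley)",
}
GN4_INSTANCE_TYPES = {"ecs.gn4.8xlarge", "ecs.gn4.14xlarge"}

# 1-64 characters: digits, English letters, hyphens, and Chinese characters.
# Assumption: "Chinese characters" is approximated by U+4E00-U+9FFF.
NAME_PATTERN = re.compile(r"[0-9A-Za-z\u4e00-\u9fff-]{1,64}")


def check_settings(name, region, instance_type, vm_cidr, container_cidr):
    """Return a list of detected problems; an empty list means none were found."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("invalid cluster name")
    if region not in SUPPORTED_REGIONS:
        problems.append("region not listed for GN4 clusters")
    if instance_type not in GN4_INSTANCE_TYPES:
        problems.append("instance type not listed for GN4")
    # The container subnet must not conflict with the virtual machine subnet.
    vm_net = ipaddress.ip_network(vm_cidr)
    container_net = ipaddress.ip_network(container_cidr)
    if vm_net.overlaps(container_net):
        problems.append("container CIDR conflicts with virtual machine CIDR")
    return problems


# Hypothetical example values for the EGS-cluster used in this document.
print(check_settings(
    "EGS-cluster",
    "China South 1 (Shenzhen)",
    "ecs.gn4.8xlarge",
    "192.168.0.0/24",   # hypothetical VSwitch (virtual machine) CIDR block
    "172.18.0.0/24",    # hypothetical initial container CIDR block
))
```

The check is purely local and does not verify that the cluster name is unique under your account and region, or that GN4 resources are currently available; the console remains the authority on both.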

Subsequent operations

On the Cluster List page, click View Logs to the right of the cluster to view the logs of the cluster creation process.