Configure an auto scaling policy

Last Updated: Sep 18, 2021

Elastic High Performance Computing (E-HPC) provides an auto scaling feature that dynamically allocates compute nodes based on the auto scaling policy that you configure. The system automatically adds or removes compute nodes based on the real-time load of the cluster.

Background information

The auto scaling feature provides the following benefits:

  • Reduces the number of compute nodes to save costs without compromising cluster availability.

  • Adds compute nodes based on real-time loads of your cluster to improve cluster availability.

  • Stops faulty nodes and creates new nodes to improve fault tolerance.

The auto scaling feature has the following limits:

  • The operating system of all nodes in the cluster must be Linux.

  • The scheduler must be PBS, Slurm, or Deadline.
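
These limits can be expressed as a simple pre-check. The following Python sketch is illustrative only: the function name, the parameter names, and the lowercase string values are assumptions, not identifiers from the E-HPC API.

```python
# Schedulers listed in the limits above, lowercased for comparison.
SUPPORTED_SCHEDULERS = {"pbs", "slurm", "deadline"}


def supports_auto_scaling(node_os_families, scheduler):
    """Return True if every node runs Linux and the scheduler is supported.

    node_os_families: iterable of OS family names, one per node (hypothetical input).
    scheduler: name of the cluster scheduler (hypothetical input).
    """
    all_linux = all(os_family.lower() == "linux" for os_family in node_os_families)
    return all_linux and scheduler.lower() in SUPPORTED_SCHEDULERS
```

For example, a cluster whose nodes all run Linux with the Slurm scheduler passes the check, whereas a cluster that contains a Windows node does not.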

Procedure

  1. Log on to the E-HPC console.

  2. In the top navigation bar, select a region.

  3. In the left-side navigation pane, choose Elasticity > Auto Scale.

  4. From the Cluster drop-down list on the Auto Scale page, select the cluster for which you want to configure the auto scaling policy.

  5. In the Global Configurations section, set the required parameters.

    Parameter

    Description

    Enable Autoscale

    Enables Auto Grow and Auto Shrink for all queues in the cluster.

    Note

    If the settings in the Queue Configuration section are different from the settings in the Global Configurations section, the former prevail.

    Compute Nodes

    The range of the number of compute nodes that can be added to scale out the cluster. The upper limit is the sum of the maximum numbers of compute nodes configured for the queues in the cluster. The lower limit is the sum of the minimum numbers of compute nodes configured for the queues in the cluster.

    Scale-in time (Minute)

    If the continuous idle duration of a compute node exceeds the scale-in time, the node is released.

    The continuous idle duration is the scale-in check interval multiplied by the number of consecutive idle checks. By default, the scale-in check interval is 2 minutes. The number of consecutive idle checks is the number of consecutive scale-in checks in which the compute node is found to be idle.

    Image Type

    The image type of the compute nodes that are added to the cluster during scale-out. Only images that are compatible with the image of the original compute nodes in the cluster are supported. Valid values:

    • Public Image

    • Custom Image

    • Shared Image

    Exceptional Nodes

    Select the nodes that you want to exclude from auto scaling.

    If you want to retain a compute node, you can set the node as an exceptional node. Then, the node is not released if it is idle.
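
    The scale-in rule described for Scale-in time (Minute) and Exceptional Nodes can be illustrated with a minimal Python sketch. It is not E-HPC code: the function names and parameters are hypothetical, and the only values taken from this topic are the 2-minute default check interval and the rule that exceptional nodes are not released when idle.

```python
def continuous_idle_minutes(consecutive_idle_checks: int,
                            check_interval_minutes: float = 2.0) -> float:
    """Continuous idle duration = check interval x consecutive idle checks."""
    return check_interval_minutes * consecutive_idle_checks


def should_release(consecutive_idle_checks: int,
                   scale_in_time_minutes: float,
                   check_interval_minutes: float = 2.0,
                   is_exceptional: bool = False) -> bool:
    """Return True if an idle compute node would be released (hypothetical model)."""
    if is_exceptional:
        # Exceptional nodes are excluded from auto scaling and are retained.
        return False
    idle = continuous_idle_minutes(consecutive_idle_checks, check_interval_minutes)
    # The node is released only when the idle duration exceeds the scale-in time.
    return idle > scale_in_time_minutes
```

    Under these assumptions, with a scale-in time of 4 minutes and the default interval, a node that is idle for two consecutive checks (4 minutes) is retained, and a node that is idle for three consecutive checks (6 minutes) is released.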

  6. In the Queue Configuration section, click Edit to set the required parameters.

    Parameter

    Description

    Queue Name

    The name of the queue to which the scale-out compute nodes are added.

    Queue Compute Nodes

    The range of the number of compute nodes in the queue. Valid values:

    • Maximum Nodes: the maximum number of compute nodes that can be added to the queue. Valid values: 0 to 500.

    • Minimal Nodes: the minimum number of compute nodes that are retained in the queue. Valid values: 0 to 50.

    Auto Grow and Auto Shrink

    Specifies whether to enable Auto Grow and Auto Shrink. By default, both switches are turned off.

    Note

    If the settings in the Queue Configuration section are different from the settings in the Global Configurations section, the former prevail.

    Hostname Prefix

    The hostname prefix that is used to distinguish the nodes of different queues.

    Image Type

    The image type of the scale-out nodes in a single queue. You can specify different image types for different queues. Valid values:

    • Public Image

    • Custom Image

    • Shared Image

    Image ID

    The ID of the image that is used by the scale-out nodes in a single queue. You can specify different image IDs for different queues.

    Note

    This parameter is valid only for the current queue. If the image type or image ID is unspecified, the image type of the scale-out nodes is the same as that specified in the global configurations. If the image type is unspecified in the global configurations, the image type of the scale-out nodes is the same as the default image type of the cluster.

    Configuration List

    Each configuration list includes the configurations of the scale-out compute nodes. The following configurations are displayed in this section:

    • Zone: a zone in the region where the cluster resides.

    • vSwitch ID: the vSwitch bound to the VPC of the cluster in the selected zone.

    • Instance Type: the instance type of the scale-out compute nodes in a single queue.

    • Bid Strategy: the bidding method configured for the scale-out nodes.

    • Maximum Price per Hour: the maximum hourly price of the scale-out nodes. This parameter is required only when Bid Strategy is set to Preemptible instance with maximum bid price.
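
    The relationship between queue-level and global settings can be sketched in Python. The class and function names below are hypothetical and only model the behavior described in this topic: the valid ranges of Maximum Nodes and Minimal Nodes, the image type fallback order (queue, then global configurations, then the cluster default), and the global Compute Nodes range as the sum of the queue ranges. The sketch models the image type only, not the image ID, and is not part of any E-HPC SDK.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class QueueConfig:
    """Hypothetical model of one entry in the Queue Configuration section."""
    name: str
    max_nodes: int                    # Maximum Nodes, valid values: 0 to 500
    min_nodes: int                    # Minimal Nodes, valid values: 0 to 50
    image_type: Optional[str] = None  # None means the image type is unspecified

    def __post_init__(self) -> None:
        if not 0 <= self.max_nodes <= 500:
            raise ValueError("Maximum Nodes must be in the range 0 to 500")
        if not 0 <= self.min_nodes <= 50:
            raise ValueError("Minimal Nodes must be in the range 0 to 50")


def effective_image_type(queue: QueueConfig,
                         global_image_type: Optional[str],
                         cluster_default: str) -> str:
    """Fallback order: queue setting, global configurations, cluster default."""
    return queue.image_type or global_image_type or cluster_default


def cluster_node_range(queues: List[QueueConfig]) -> Tuple[int, int]:
    """Lower limit = sum of queue minimums; upper limit = sum of queue maximums."""
    return (sum(q.min_nodes for q in queues),
            sum(q.max_nodes for q in queues))
```

    For example, two queues configured with ranges of 2 to 100 and 0 to 20 nodes yield a cluster-wide range of 2 to 120 compute nodes under this model.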

  7. Read and select Alibaba Cloud International Website Product Terms of Service, and click OK.

  8. Optional. View the auto scaling diagram of the cluster.

    The auto scaling diagram shows the changes in the number of nodes over time during the auto scaling process based on the auto scaling policy that you configured. The diagram also shows the time consumed by node scale-in and scale-out at key points in time.

    Note

    You can set the number of simulated concurrent nodes in the auto scaling diagram to simulate the changes in the number of compute nodes during auto scaling.