Node management

Last Updated: Nov 03, 2017

If you do not have an E-HPC cluster yet, create one first (see Create an E-HPC cluster).

The E-HPC console provides management functions for E-HPC cluster nodes, including node query, restart, reset, and release.

Go to the node control page

Go to the E-HPC console, and click Node from the left-side navigation pane.

Select a node

To select the node to adjust, follow these steps:

  1. Select a region.

    For more information about regions, see Regions and zones.

    A node inherits the region of the cluster it belongs to. Select the region of the cluster that contains the node you want.

  2. Select a cluster.

    Click the drop-down list next to Cluster and find the name of the cluster that contains the node you want.

  3. Select a node type.

    An E-HPC cluster mainly contains the following types of nodes:

    • Control nodes, including a scheduling server and a domain account server

      • Scheduling server: This server is used to run scheduling tools (such as PBS or SLURM) and handle job submission, scheduling management, and so on.

      • Domain account server: This server provides centralized management for the user accounts of the E-HPC cluster.

    • Computing node

      • This server runs high-performance computing jobs. Its configuration determines the overall performance of the E-HPC cluster.
    • Logon node

      • This is the only node that normal E-HPC cluster users can operate on. You can perform software debugging, compilation, installation, and job submission on the logon node.
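The three selection steps above amount to successive filters on the node list. The following is a minimal sketch of that idea in Python, not the console's implementation: the `Node` record, the `select_nodes` helper, the node type labels, and all node, cluster, and region names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


# Hypothetical record for one row of the node information table.
@dataclass(frozen=True)
class Node:
    name: str
    region: str
    cluster: str
    node_type: str  # assumed labels: "control", "computing", "logon"


def select_nodes(nodes, region, cluster, node_type):
    """Apply the three console filters: region, cluster, then node type."""
    return [
        n for n in nodes
        if n.region == region and n.cluster == cluster and n.node_type == node_type
    ]


# Illustrative inventory; every name here is made up.
INVENTORY = [
    Node("scheduler", "cn-hangzhou", "cluster-a", "control"),
    Node("account", "cn-hangzhou", "cluster-a", "control"),
    Node("compute0", "cn-hangzhou", "cluster-a", "computing"),
    Node("compute1", "cn-hangzhou", "cluster-a", "computing"),
    Node("login0", "cn-hangzhou", "cluster-a", "logon"),
    Node("compute0", "cn-beijing", "cluster-b", "computing"),
]

matches = select_nodes(INVENTORY, "cn-hangzhou", "cluster-a", "computing")
print([n.name for n in matches])
```

Because a node inherits its cluster's region, a region that does not match the chosen cluster yields an empty result in this sketch.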

Node management

After selecting the node region, cluster, and node type, find the node to adjust in the node information table and select the required operation in the Operations column for this node.

Restart a node

Note: Except in special situations (such as fault repair), do not restart a node that is in the Running status.

Click Restart and select Normal Restart or Force Restart in the pop-up window.

  • Normal restart: Remotely sends the restart command to the operating system running on the node (such as CentOS). The operating system then terminates all processes and restarts the system. This is equivalent to using Ctrl+Alt+Del to restart a physical machine.

  • Force restart: Directly restarts the instance that the node runs on. This is equivalent to pressing the Reset button on a physical machine. Generally, use force restart only if a normal restart is ineffective.
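The guidance to prefer a normal restart and use a force restart only as a fallback can be sketched as follows. This is an illustration of the decision order only: `restart_node` and the two callables passed to it are hypothetical stand-ins, not E-HPC API calls.

```python
def restart_node(node, normal_restart, force_restart):
    """Try a normal restart first; fall back to a force restart if it fails.

    normal_restart and force_restart are hypothetical callables that take a
    node name and return True on success.
    """
    if normal_restart(node):
        return "normal"
    if force_restart(node):
        return "force"
    return "failed"


# Stand-in simulating a node whose operating system no longer responds.
def unresponsive_os(node):
    return False


# Stand-in simulating a successful instance-level restart.
def hard_reset(node):
    return True


print(restart_node("compute0", unresponsive_os, hard_reset))
```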

Reset a node

Note: Except in special situations (such as fault repair), do not reset a node that is in the Running status.

From the More drop-down list, click Reset Node. Once you confirm, the node reset process is triggered.

The reset process consists of the following steps:

  1. The current node is restored to its initial state at the time of ECS instance creation.

  2. The E-HPC control system is reinstalled and initialized, including scheduler configuration or domain account management configuration.

  3. The E-HPC cluster software stack is reinstalled.

  4. The node is added to the E-HPC cluster.

If restarting the node does not repair the fault, try resetting the node.
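The four reset steps run in a fixed order. The sketch below models them as an ordered list in Python. It assumes that a failed step stops the sequence, which the procedure above does not state, and `run_reset` and `execute_step` are hypothetical names used only for illustration.

```python
# The four reset steps, in the order listed above.
RESET_STEPS = [
    "restore the node to its initial ECS instance state",
    "reinstall and initialize the E-HPC control system",
    "reinstall the E-HPC cluster software stack",
    "add the node to the E-HPC cluster",
]


def run_reset(execute_step):
    """Run the reset steps in order and return the steps that completed.

    execute_step is a hypothetical callable that takes a step description
    and returns True on success. Stopping at the first failure is an
    assumption of this sketch, not documented behavior.
    """
    completed = []
    for step in RESET_STEPS:
        if not execute_step(step):
            break
        completed.append(step)
    return completed


done = run_reset(lambda step: True)
print(len(done) == len(RESET_STEPS))
```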

Delete a node

Note: Only computing nodes can be deleted from a cluster. Control nodes and logon nodes cannot be deleted. In addition, deleting computing nodes is currently the only way to reduce the size of a cluster. For information on adding computing nodes, see Resize a cluster.

From the More drop-down list, click Delete Node. Once you confirm, the node is stopped and released.
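The rule that only computing nodes can be deleted can be expressed as a simple guard. This is a minimal sketch with a hypothetical `delete_nodes` helper and made-up node names. It models a cluster as a plain dictionary and does not call any E-HPC API.

```python
# Node types that may be deleted, per the note above (labels are assumed).
DELETABLE_TYPES = {"computing"}


def delete_nodes(cluster, names):
    """Return a copy of the cluster without the named nodes.

    cluster maps node name to node type. Raises ValueError if any named
    node is not a computing node, so nothing is removed in that case.
    """
    for name in names:
        node_type = cluster[name]
        if node_type not in DELETABLE_TYPES:
            raise ValueError(f"{name} is a {node_type} node and cannot be deleted")
    to_remove = set(names)
    return {n: t for n, t in cluster.items() if n not in to_remove}


# Illustrative cluster; every name here is made up.
cluster = {
    "scheduler": "control",
    "login0": "logon",
    "compute0": "computing",
    "compute1": "computing",
}

smaller = delete_nodes(cluster, ["compute1"])
print(sorted(smaller))
```

Validating every name before removing anything keeps the sketch all-or-nothing: a request that includes a control or logon node leaves the cluster unchanged.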
