Run a Cloud Assistant command to stop or restart instances - Elastic Compute Service

This topic describes how to stop or restart Elastic Compute Service (ECS) instances by running a Cloud Assistant command.

Prerequisites

The instances that you want to stop or restart are in the Running state.
Cloud Assistant Agent is installed on the instances. For more information, see Install Cloud Assistant Agent.

(Recommended) Use an exit code to stop or restart instances

When you run a Cloud Assistant command to stop or restart instances, we recommend that you append a specific exit code to the end of the command so that the execution status of the command is reported accurately. If you run a Cloud Assistant command that stops or restarts instances without an exit code, the execution status of the command may not be updated correctly even if the stop or restart operation itself succeeds. This occurs because Cloud Assistant Agent does not save the execution status of the command before the instances are stopped or restarted.

Important

Make sure that the installed version of Cloud Assistant Agent is not earlier than the following versions:

Linux: 2.2.3.317
Windows: 2.1.3.317

If an error is reported when you run a Cloud Assistant command with an exit code, upgrade Cloud Assistant Agent to the latest version. For more information, see Upgrade or disable upgrades for Cloud Assistant Agent.
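
The version requirement above can be checked with a short comparison. The following is a minimal sketch, not part of any SDK: the function names are hypothetical, and the agent version string is assumed to have been obtained separately. Only the two minimum versions come from this topic.

```python
# Minimum Cloud Assistant Agent versions that support exit codes (from this topic).
MIN_AGENT_VERSION = {"linux": "2.2.3.317", "windows": "2.1.3.317"}


def parse_version(version):
    """Turn a dotted version string such as '2.2.3.317' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def supports_exit_codes(os_type, agent_version):
    """Return True if agent_version is not earlier than the minimum for os_type."""
    minimum = MIN_AGENT_VERSION[os_type.lower()]
    return parse_version(agent_version) >= parse_version(minimum)
```

Comparing tuples of integers instead of raw strings avoids ordering mistakes such as treating 2.10.0.1 as earlier than 2.2.3.317.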

Log on to the ECS console.
In the left-side navigation pane, choose Maintenance & Monitoring > Cloud Assistant.
In the top navigation bar, select the region and resource group to which the resource belongs.
In the upper-right corner of the ECS Cloud Assistant page, click Create/Run Command.
In the Command Information section, configure the parameters. For more information, see Create and run a Cloud Assistant command.

In the Command content code editor, add an exit code to the end of the command script.

To stop instances by running a command, specify one of the exit codes in the following table based on the operating system type of the instances.

| Operating system | Exit code | Sample command |
| --- | --- | --- |
| Linux | 193 | `exit 193` (shell command; an exit code of 193 triggers an operation to stop the specified instances) |
| Windows | 3009 | `exit 3009` (PowerShell command; an exit code of 3009 triggers an operation to stop the specified instances) |

To restart instances by running a command, specify one of the exit codes in the following table based on the operating system type of the instances.

| Operating system | Exit code | Sample command |
| --- | --- | --- |
| Linux | 194 | `exit 194` (shell command; an exit code of 194 triggers an operation to restart the specified instances) |
| Windows | 3010 | `exit 3010` (PowerShell command; an exit code of 3010 triggers an operation to restart the specified instances) |

In the Select Instance or Select Managed Instances section, select the instances on which you want to run the command.
Note
A managed instance is an instance that is not provided by Alibaba Cloud but is managed by Cloud Assistant. For more information, see Alibaba Cloud managed instances.
Click Run and Save or Run to immediately run the command.
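
If you generate command content programmatically, the exit code step in this procedure could be sketched as follows. This is an illustrative helper only: the name append_exit_code and the "stop" and "restart" labels are assumptions, and only the four exit codes come from this topic.

```python
# Exit codes from the tables in this section.
EXIT_CODES = {
    ("linux", "stop"): 193,
    ("linux", "restart"): 194,
    ("windows", "stop"): 3009,
    ("windows", "restart"): 3010,
}


def append_exit_code(script, os_type, action):
    """Return script with the matching 'exit <code>' line appended."""
    code = EXIT_CODES[(os_type.lower(), action.lower())]
    # Strip trailing newlines so that the exit statement is always the last line.
    return script.rstrip("\n") + "\nexit %d\n" % code


print(append_exit_code("echo done", "Linux", "restart"))
```

The returned text is what would be pasted into the Command content code editor or passed as the command content in an API call.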

Call API operations to run a Cloud Assistant command to batch restart instances

Alibaba Cloud provides a variety of API operations for you to manage your cloud resources. This section describes how to run Python code in an on-premises Linux environment that calls API operations to run a Cloud Assistant command on multiple instances and then batch restart the instances.

Prepare the information required to run a Cloud Assistant command.
1. Obtain an AccessKey pair.
  We recommend that you obtain the AccessKey pair of a Resource Access Management (RAM) user. For more information, see Create an AccessKey pair.
2. Obtain the region ID of the instances on which you want to run the Cloud Assistant command.
  You can call the DescribeRegions operation to query the most recent region list. For information about the parameters in the DescribeRegions operation, see DescribeRegions.
3. Obtain the IDs of the instances on which you want to run the Cloud Assistant command.
  You can call the DescribeInstances operation to query the instances that meet specific conditions. For example, you can query the instances that are in the Running state or have specific tags added. For information about the parameters in the DescribeInstances operation, see DescribeInstances.
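
As an illustration of the last preparation step, the following sketch picks the IDs of running instances out of a DescribeInstances response body. It assumes that the response has the layout that the sample code in this topic reads (Instances > Instance, with InstanceId and Status fields). The function name and the sample data are hypothetical.

```python
import json


def running_instance_ids(response_body):
    """Return the IDs of instances whose Status is 'Running'."""
    instances = json.loads(response_body)["Instances"]["Instance"]
    return [item["InstanceId"] for item in instances if item.get("Status") == "Running"]


# Hypothetical response body, trimmed to the fields that the function reads.
sample_body = json.dumps({
    "Instances": {
        "Instance": [
            {"InstanceId": "i-example1", "Status": "Running"},
            {"InstanceId": "i-example2", "Status": "Stopped"},
        ]
    }
})
print(running_instance_ids(sample_body))
```

Filtering on the Running state matches the prerequisite that a Cloud Assistant command can be run only on running instances.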

Configure the on-premises environment and run the sample code.

Install Alibaba Cloud ECS SDK for Python.
```
sudo pip install aliyun-python-sdk-ecs
```

Update ECS SDK for Python to the latest version.

sudo pip install --upgrade aliyun-python-sdk-ecs

Create a .py file and write the following sample code to the file.

Replace the following parameters in the sample code with the actual values obtained in the preceding step:

AccessKey ID:
access_key = os.environ['ALIBABA_CLOUD_ACCESS_KEY_ID']
AccessKey secret:
access_key_secret = os.environ['ALIBABA_CLOUD_ACCESS_KEY_SECRET']
Region ID:
region_id = '<yourRegionId>'
Instance IDs:
ins_ids= ["i-bp185fcs****","i-bp14wwh****","i-bp13jbr****"]
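
Reading os.environ directly raises a KeyError that names only the first variable that is missing. As an optional sketch, the check below reports every missing variable at once. The helper name load_credentials is hypothetical. The two variable names are the ones used in the sample code.

```python
import os


def load_credentials(env=None):
    """Return (access_key_id, access_key_secret) or raise a descriptive error."""
    env = os.environ if env is None else env
    names = ("ALIBABA_CLOUD_ACCESS_KEY_ID", "ALIBABA_CLOUD_ACCESS_KEY_SECRET")
    # Treat empty values the same as missing ones.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]
```

Passing a dictionary as env makes the check easy to try without changing the real environment.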

Sample code:

# coding=utf-8
# If ECS SDK for Python is not installed, run the sudo pip install aliyun-python-sdk-ecs command.
# Make sure that you use the latest version of ECS SDK for Python.
# Run the sudo pip install --upgrade aliyun-python-sdk-ecs command to upgrade the version of ECS SDK for Python.

import json
import sys
import base64
import time
import logging
import os
from aliyunsdkcore.client import AcsClient
from aliyunsdkcore.acs_exception.exceptions import ClientException
from aliyunsdkcore.acs_exception.exceptions import ServerException
from aliyunsdkecs.request.v20140526.RunCommandRequest import RunCommandRequest
from aliyunsdkecs.request.v20140526.DescribeInvocationResultsRequest import DescribeInvocationResultsRequest
from aliyunsdkecs.request.v20140526.RebootInstancesRequest import RebootInstancesRequest
from aliyunsdkecs.request.v20140526.DescribeInstancesRequest import DescribeInstancesRequest

# Configure the log output formatter.
logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(name)s [%(levelname)s]: %(message)s",
                    datefmt='%m-%d %H:%M')

logger = logging.getLogger()

# Make sure that the ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET environment variables are configured in the code runtime environment. 
# If the project code is leaked, the AccessKey pair may be leaked and the security of all resources in your account may be compromised. The following sample code is only for reference. We recommend that you use Security Token Service (STS) tokens, which provide higher security. 
access_key = os.environ['ALIBABA_CLOUD_ACCESS_KEY_ID']  
access_key_secret = os.environ['ALIBABA_CLOUD_ACCESS_KEY_SECRET']  
region_id = '<yourRegionId>'  # Specify the region ID that you obtained.

client = AcsClient(access_key, access_key_secret, region_id)

def base64_decode(content, code='utf-8'):
    if sys.version_info.major == 2:
        return base64.b64decode(content)
    else:
        return base64.b64decode(content).decode(code)

def get_invoke_result(invoke_id):
    request = DescribeInvocationResultsRequest()
    request.set_accept_format('json')

    request.set_InvokeId(invoke_id)
    response = client.do_action_with_exception(request)
    response_details = json.loads(response)["Invocation"]["InvocationResults"]["InvocationResult"]
    dict_res = { detail.get("InstanceId",""):{"status": detail.get("InvocationStatus",""),"output":base64_decode(detail.get("Output",""))}  for detail in response_details }
    return dict_res

def get_instances_status(instance_ids):
    request = DescribeInstancesRequest()
    request.set_accept_format('json')
    # DescribeInstances expects InstanceIds as a JSON array string, so serialize the list.
    request.set_InstanceIds(json.dumps(instance_ids))
    response = client.do_action_with_exception(request)
    response_details = json.loads(response)["Instances"]["Instance"]
    dict_res = { detail.get("InstanceId",""):{"status":detail.get("Status","")} for detail in response_details }
    return dict_res

def run_command(cmdtype,cmdcontent,instance_ids,timeout=60):
    """
    cmdtype: the type of the command. Valid values: RunBatScript, RunPowerShellScript, and RunShellScript.
    cmdcontent: the content of the command.
    instance_ids: the IDs of the instances on which you want to run the command.
    """
    try:
        request = RunCommandRequest()
        request.set_accept_format('json')

        request.set_Type(cmdtype)
        request.set_CommandContent(cmdcontent)
        request.set_InstanceIds(instance_ids)
        # The timeout period for running the command. Unit: seconds. Default value: 60. Specify this parameter based on the command that you want to run.
        request.set_Timeout(timeout)
        response = client.do_action_with_exception(request)
        invoke_id = json.loads(response).get("InvokeId")
        return invoke_id
    except Exception:
        # Log the traceback and re-raise so that the caller does not continue without an invoke ID.
        logger.exception("run command failed")
        raise

def reboot_instances(instance_ids,Force=False):
    """
    instance_ids: the IDs of the instances that you want to restart.
    Force: specifies whether to forcefully restart the instances. Default value: False.
    """
    request = RebootInstancesRequest()
    request.set_accept_format('json')
    request.set_InstanceIds(instance_ids)
    request.set_ForceReboot(Force)
    response = client.do_action_with_exception(request)

def wait_invoke_finished_get_out(invoke_id,wait_count,wait_interval):
    for i in range(wait_count):
        result = get_invoke_result(invoke_id)
        if set([res["status"] for _,res in result.items()]) & set(["Running","Pending","Stopping"]):
            time.sleep(wait_interval)
        else:
            return result
    return result

def wait_instance_reboot_ready(ins_ids,wait_count,wait_interval):
    for i in range(wait_count):
        result = get_instances_status(ins_ids)
        if set([res["status"] for _,res in result.items()]) != set(["Running"]):
            time.sleep(wait_interval)
        else:
            return result
    return result

def run_task():
    # Specify the type of the Cloud Assistant command.
    cmdtype = "RunShellScript"
    # Specify the content of the Cloud Assistant command.
    cmdcontent = """
    #!/bin/bash
    echo helloworld
    """
    # Specify the timeout period for running the command.
    timeout = 60
    # Specify the IDs of the instances on which you want to run the command.
    ins_ids= ["i-bp185fcs****","i-bp14wwh****","i-bp13jbr****"]

    # Run the command.
    invoke_id = run_command(cmdtype,cmdcontent,ins_ids,timeout)
    logger.info("run command,invoke-id:%s" % invoke_id)

    # Wait for the command to be run. Query the execution status of the command 10 times at an interval of 5 seconds. You can specify the number of times the command execution status is queried and the query interval based on your business requirements.
    invoke_result = wait_invoke_finished_get_out(invoke_id,10,5)
    for ins_id,res in invoke_result.items():
        logger.info("instance %s command execute finished,status: %s,output:%s" %(ins_id,res["status"],res["output"]))

    # Restart the instances.
    logger.warning("reboot instance Now")
    reboot_instances(ins_ids)

    time.sleep(5)
    # Wait for the instances to enter the Running state. Query the status of the instance 30 times at an interval of 10 seconds.
    reboot_result = wait_instance_reboot_ready(ins_ids,30,10)
    logger.warning("reboot instance Finished")
    for ins_id,res in reboot_result.items():
        logger.info("instance %s status: %s" %(ins_id,res["status"]))

if __name__ == '__main__':
    run_task()

Run the .py file.
The following figure shows a sample result after the .py file is run. In this example, a command is run on three instances and helloworld is returned, and then the three instances are restarted.
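
The two wait functions in the sample code return the last query result even when the retries run out, so a caller cannot tell a finished task from a timed-out one. The following is a minimal alternative sketch with hypothetical names (poll_until, all_running) that also reports whether the condition was met. It uses no SDK calls. The query function is passed in by the caller.

```python
import time


def poll_until(fetch, is_done, attempts, interval, sleep=time.sleep):
    """Call fetch() up to `attempts` times until is_done(result) is true.

    Returns (result, finished). finished is False when the attempts run out.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    result = None
    for attempt in range(attempts):
        result = fetch()
        if is_done(result):
            return result, True
        # Do not sleep after the final attempt.
        if attempt < attempts - 1:
            sleep(interval)
    return result, False


def all_running(statuses):
    """True when every entry in {instance_id: {'status': ...}} is Running."""
    return bool(statuses) and all(item["status"] == "Running" for item in statuses.values())
```

With the sample code, a call could look like poll_until(lambda: get_instances_status(ins_ids), all_running, 30, 10).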

Use OOS to run a Cloud Assistant command to batch restart instances

CloudOps Orchestration Service (OOS) is an automated O&M service provided by Alibaba Cloud. You can use OOS templates to configure and execute O&M tasks.

Go to the Create Template page.
1. Log on to the OOS console.
2. In the left-side navigation pane, choose Automated Task > Custom Template.
3. Click Create Template.

Configure the parameters.

Click the YAML tab and enter the following code.

Sample code:

FormatVersion: OOS-2019-06-01
Description:
  en: Runs a Cloud Assistant command to batch restart multiple ECS instances.
   
  name-en: ACS-ECS-BulkyRunCommandRboot
   
  categories:
    - run_command
Parameters:
  regionId:
    Type: String
    Description:
      en: The region ID
       
    Label:
      en: Region
       
    AssociationProperty: RegionId
    Default: '{{ ACS::RegionId }}'
  targets:
    Type: JSON
    Label:
      en: TargetInstance
       
    AssociationProperty: Targets
    AssociationPropertyMetadata:
      ResourceType: ALIYUN::ECS::Instance
      RegionId: regionId
  commandType:
    Description:
      en: The type of the Cloud Assistant command
       
    Label:
      en: CommandType
       
    Type: String
    AllowedValues:
      - RunBatScript
      - RunPowerShellScript
      - RunShellScript
    Default: RunShellScript
  commandContent:
    Description:
      en: The content of the command that you want to run on the ECS instances
       
    Label:
      en: CommandContent
       
    Type: String
    MaxLength: 16384
    AssociationProperty: Code
    Default: echo hello
  workingDir:
    Description:
      en: 'The directory where the created Cloud Assistant command runs on the ECS instances. For Linux instances, the default directory is under the home directory of the administrator (root user), which is /root. For Windows instances, the default directory is under the directory where the process of Cloud Assistant Agent is located, such as C:\Windows\System32.'
         
    Label:
      en: WorkingDir
       
    Type: String
    Default: ''
  timeout:
    Description:
      en: The timeout period for running the command on the ECS instances
       
    Label:
      en: Timeout
       
    Type: Number
    Default: 600
  enableParameter:
    Description:
      en: Whether to include secret parameters or custom parameters in the command.
       
    Label:
      en: EnableParameter
       
    Type: Boolean
    Default: false
  username:
    Description:
      en: The username that is used to run the command on the ECS instances
       
    Label:
      en: Username
       
    Type: String
    Default: ''
  windowsPasswordName:
    Description:
      en: The name of the password used to run the command on a Windows instance
       
    Label:
      en: WindowsPasswordName
       
    Type: String
    Default: ''
    AssociationProperty: SecretParameterName
  rateControl:
    Description:
      en: The concurrency ratio of task execution
       
    Label:
      en: RateControl
       
    Type: JSON
    AssociationProperty: RateControl
    Default:
      Mode: Concurrency
      MaxErrors: 0
      Concurrency: 10
  OOSAssumeRole:
    Description:
      en: The RAM role to be assumed by OOS
       
    Label:
      en: OOSAssumeRole
       
    Type: String
    Default: OOSServiceRole
RamRole: '{{ OOSAssumeRole }}'
Tasks:
  - Name: getInstance
    Description:
      en: Obtains the ECS instances.
       
    Action: ACS::SelectTargets
    Properties:
      ResourceType: ALIYUN::ECS::Instance
      RegionId: '{{ regionId }}'
      Filters:
        - '{{ targets }}'
    Outputs:
      instanceIds:
        Type: List
        ValueSelector: Instances.Instance[].InstanceId
  - Name: runCommand
    Action: ACS::ECS::RunCommand
    Description:
      en: Runs the Cloud Assistant command.
       
    Properties:
      regionId: '{{ regionId }}'
      commandContent: '{{ commandContent }}'
      instanceId: '{{ ACS::TaskLoopItem }}'
      commandType: '{{ commandType }}'
      workingDir: '{{ workingDir }}'
      timeout: '{{ timeout }}'
      enableParameter: '{{ enableParameter }}'
      username: '{{ username }}'
      windowsPasswordName: '{{ windowsPasswordName }}'
    Loop:
      RateControl: '{{ rateControl }}'
      Items: '{{ getInstance.instanceIds }}'
      Outputs:
        commandOutputs:
          AggregateType: Fn::ListJoin
          AggregateField: commandOutput
    Outputs:
      commandOutput:
        Type: String
        ValueSelector: invocationOutput
  - Name: rebootInstance
    Action: ACS::ECS::RebootInstance
    Description:
      en: Restarts the ECS instances.
       
    Properties:
      regionId: '{{ regionId }}'
      instanceId: '{{ ACS::TaskLoopItem }}'
    Loop:
      RateControl: '{{ rateControl }}'
      Items: '{{ getInstance.instanceIds }}'
Outputs:
  instanceIds:
    Type: List
    Value: '{{ getInstance.instanceIds }}'

Click Create Template.
In the dialog box that appears, enter a template name and click OK. In this example, the template name is runcommand_reboot_instances.
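
Before you start an execution, you may want to check parameter values against the constraints that the template declares. The sketch below mirrors the AllowedValues of commandType and the MaxLength of commandContent from the template. The function name is hypothetical, and measuring the length in characters is an assumption about how OOS applies MaxLength.

```python
# Constraints copied from the template parameters.
ALLOWED_COMMAND_TYPES = ("RunBatScript", "RunPowerShellScript", "RunShellScript")
MAX_COMMAND_LENGTH = 16384  # MaxLength of commandContent


def check_parameters(command_type, command_content):
    """Return a list of problems; an empty list means both values pass."""
    problems = []
    if command_type not in ALLOWED_COMMAND_TYPES:
        problems.append("commandType must be one of: " + ", ".join(ALLOWED_COMMAND_TYPES))
    if len(command_content) > MAX_COMMAND_LENGTH:
        problems.append("commandContent exceeds %d characters" % MAX_COMMAND_LENGTH)
    return problems


print(check_parameters("RunShellScript", "echo hello"))
```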

Execute the template.
1. Find the template that you created and click Create Execution in the Actions column.
2. Configure the execution.
  Complete the execution configurations as prompted. In the Parameter Settings step, set the TargetInstance parameter to Manually Select Instances and select multiple instances. Use the default values for other parameters.
3. In the OK step, click Create.
  After the execution is created, it starts automatically and you are directed to the Basic Information tab on the execution details page. If the execution succeeds, Success is displayed next to Execution Status.
View the execution procedure and the details of each task node.
1. In the Execution Steps and Results section, click View Execution Flowchart to view the execution process.
2. Click the Execute cloud assistant command step and then click the circular Task List tab to view the execution details of each task node. The following figure shows that the operations specified in the template are performed.