If you want to restart an Elastic Compute Service (ECS) instance after you run a Cloud Assistant command on it, we recommend that you do not include a reboot or shutdown operation in the command itself. Otherwise, Cloud Assistant cannot report the command execution result and the command ends up in an abnormal state. This topic describes two ways to batch run Cloud Assistant commands and then restart ECS instances: calling API operations and using Operation Orchestration Service (OOS). Choose the method that fits your needs.
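
As a rough safeguard, a script that submits commands could scan the command content for restart-like keywords before sending it. The sketch below is only a heuristic under stated assumptions: the function name find_restart_operations and the keyword list are hypothetical and not part of Cloud Assistant, and a keyword that appears inside a comment or a quoted string also counts as a match.

```python
import re

# Hypothetical keyword list (an assumption); extend it for the shells you use.
RESTART_PATTERN = re.compile(
    r"\b(reboot|shutdown|poweroff|halt|restart-computer|stop-computer)\b",
    re.IGNORECASE,
)


def find_restart_operations(command_content):
    """Return the restart-like keywords found in a command, in order of appearance."""
    return [match.group(1).lower() for match in RESTART_PATTERN.finditer(command_content)]


if __name__ == "__main__":
    for content in ("echo helloworld", "yum -y update && reboot"):
        found = find_restart_operations(content)
        if found:
            print("Move these operations out of the command: %s" % ", ".join(found))
        else:
            print("No restart-like keyword found in: %s" % content)
```

A command flagged this way can be split in two: keep the actual work in the Cloud Assistant command and perform the restart as a separate step, as the methods in this topic do.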

Call API operations to batch run Cloud Assistant commands and restart instances

Alibaba Cloud provides a variety of API operations for you to manage your cloud resources. This section describes how to run Python code to call API operations in an on-premises Linux environment to batch run Cloud Assistant commands and restart instances.

  1. Prepare information required to run Cloud Assistant commands.
    1. Obtain an AccessKey pair.
      We recommend that you obtain the AccessKey pair of a RAM user. For more information, see Create an AccessKey pair.
    2. Obtain the region ID of the instances on which you want to run the commands.
      You can call the DescribeRegions operation to query the most recent region list. For information about parameters in DescribeRegions, see DescribeRegions.
    3. Obtain the IDs of the instances on which you want to run the commands.
      You can call the DescribeInstances operation to query the list of instances that meet specific filter conditions. For example, you can query the list of instances that are in the Running state or have specific tags added. For information about parameters in DescribeInstances, see DescribeInstances.
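
The instance IDs can be extracted from a DescribeInstances response with a few lines of Python. The sketch below assumes the JSON layout that the sample code in this topic also relies on (an Instances.Instance list whose items carry InstanceId and Status fields). The function name select_instance_ids and the sample response body are hypothetical and serve only as an illustration.

```python
import json


def select_instance_ids(response_body, status="Running"):
    """Return the IDs of instances in the given state from a DescribeInstances JSON body."""
    instances = json.loads(response_body).get("Instances", {}).get("Instance", [])
    return [item["InstanceId"] for item in instances if item.get("Status") == status]


# Trimmed, hypothetical response body used only for illustration.
sample_response = json.dumps({
    "Instances": {"Instance": [
        {"InstanceId": "i-bp185fcs****", "Status": "Running"},
        {"InstanceId": "i-bp14wwh****", "Status": "Stopped"},
    ]}
})

if __name__ == "__main__":
    print(select_instance_ids(sample_response))
```

DescribeInstances returns results page by page, so a real script would need to request further pages until all matching instances are collected.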
  2. Configure the on-premises environment and run the sample code.
    1. Install Alibaba Cloud ECS SDK for Python.
      sudo pip install aliyun-python-sdk-ecs
    2. Upgrade ECS SDK for Python to the latest version.
      sudo pip install --upgrade aliyun-python-sdk-ecs
    3. Create a .py file and write the following sample code to the file.
      • Replace <yourAccessKey ID> in access_key = '<yourAccessKey ID>' with the AccessKey ID obtained in the preceding step.
      • Replace <yourAccessKey Secret> in access_key_secret = '<yourAccessKey Secret>' with the AccessKey secret obtained in the preceding step.
      • Replace <yourRegionId> in region_id = '<yourRegionId>' with the region ID obtained in the preceding step.
      • Specify the instance IDs obtained in the preceding step in the specified format. Example: ins_ids= ["i-bp185fcs****","i-bp14wwh****","i-bp13jbr****"].
      Sample code:
      # coding=utf-8
      # If the Python sdk is not installed, run 'sudo pip install aliyun-python-sdk-ecs'.
      # Make sure you're using the latest sdk version.
      # Run 'sudo pip install --upgrade aliyun-python-sdk-ecs' to upgrade.
      
      import json
      import sys
      import base64
      import time
      import logging
      from aliyunsdkcore.client import AcsClient
      from aliyunsdkcore.acs_exception.exceptions import ClientException
      from aliyunsdkcore.acs_exception.exceptions import ServerException
      from aliyunsdkecs.request.v20140526.RunCommandRequest import RunCommandRequest
      from aliyunsdkecs.request.v20140526.DescribeInvocationResultsRequest import DescribeInvocationResultsRequest
      from aliyunsdkecs.request.v20140526.RebootInstancesRequest import RebootInstancesRequest
      from aliyunsdkecs.request.v20140526.DescribeInstancesRequest import DescribeInstancesRequest
      
      # Configure the log output formatter
      logging.basicConfig(level=logging.INFO,
                          format="%(asctime)s %(name)s [%(levelname)s]: %(message)s",
                          datefmt='%m-%d %H:%M')
      
      logger = logging.getLogger()
      
      access_key = '<yourAccessKey ID>'  # The AccessKey ID you obtained.
      access_key_secret = '<yourAccessKey Secret>'  # The AccessKey secret you obtained.
      region_id = '<yourRegionId>'  # The region ID you obtained.
      
      client = AcsClient(access_key, access_key_secret, region_id)
      
      def base64_decode(content, code='utf-8'):
          if sys.version_info.major == 2:
              return base64.b64decode(content)
          else:
              return base64.b64decode(content).decode(code)
      
      def get_invoke_result(invoke_id):
          request = DescribeInvocationResultsRequest()
          request.set_accept_format('json')
      
          request.set_InvokeId(invoke_id)
          response = client.do_action_with_exception(request)
          response_details = json.loads(response)["Invocation"]["InvocationResults"]["InvocationResult"]
          dict_res = { detail.get("InstanceId",""):{"status": detail.get("InvocationStatus",""),"output":base64_decode(detail.get("Output",""))}  for detail in response_details }
          return dict_res
      
      def get_instances_status(instance_ids):
          request = DescribeInstancesRequest()
          request.set_accept_format('json')
          request.set_InstanceIds(instance_ids)
          response = client.do_action_with_exception(request)
          response_details = json.loads(response)["Instances"]["Instance"]
          dict_res = { detail.get("InstanceId",""):{"status":detail.get("Status","")} for detail in response_details }
          return dict_res
      
      def run_command(cmdtype,cmdcontent,instance_ids,timeout=60):
          """
          cmdtype: the command type, which can be RunBatScript, RunPowerShellScript, or RunShellScript.
          cmdcontent: the command content.
          instance_ids: the IDs of the instances on which you want to run the command.
          """
          try:
              request = RunCommandRequest()
              request.set_accept_format('json')
      
              request.set_Type(cmdtype)
              request.set_CommandContent(cmdcontent)
              request.set_InstanceIds(instance_ids)
              # The timeout period for running the command. Unit: seconds. Default value: 60. Specify this parameter based on the command that you want to run.
              request.set_Timeout(timeout)
              response = client.do_action_with_exception(request)
              invoke_id = json.loads(response).get("InvokeId")
              return invoke_id
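          except Exception:
              # Log the full error and re-raise so that the caller does not continue with an empty invoke ID.
              logger.exception("run command failed")
              raise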
      
      def reboot_instances(instance_ids,Force=False):
          """
          instance_ids: the IDs of the instances that you want to restart.
          Force: specifies whether to forcibly restart the instances. Default value: False.
          """
          request = RebootInstancesRequest()
          request.set_accept_format('json')
          request.set_InstanceIds(instance_ids)
          request.set_ForceReboot(Force)
          response = client.do_action_with_exception(request)
      
      def wait_invoke_finished_get_out(invoke_id,wait_count,wait_interval):
          for i in range(wait_count):
              result = get_invoke_result(invoke_id)
              if set([res["status"] for _,res in result.items()]) & set(["Running","Pending","Stopping"]):
                  time.sleep(wait_interval)
              else:
                  return result
          return result
      
      def wait_instance_reboot_ready(ins_ids,wait_count,wait_interval):
          for i in range(wait_count):
              result = get_instances_status(ins_ids)
              if set([res["status"] for _,res in result.items()]) != set(["Running"]):
                  time.sleep(wait_interval)
              else:
                  return result
          return result
      
      def run_task():
          # Specify the type of the command.
          cmdtype = "RunShellScript"
          # Specify the content of the command.
          # Keep the shebang on the first line without leading spaces so that it takes effect.
          cmdcontent = "#!/bin/bash\necho helloworld\n"
          # Specify the timeout period.
          timeout = 60
          # Specify the IDs of the instances on which you want to run the command. After the command is run on these instances, these instances are restarted.
          ins_ids= ["i-bp185fcs****","i-bp14wwh****","i-bp13jbr****"]
      
          # Run the command.
          invoke_id = run_command(cmdtype,cmdcontent,ins_ids,timeout)
          logger.info("run command,invoke-id:%s" % invoke_id)
      
          # Wait for the command to finish running. Query the command running state 10 times at an interval of 5 seconds. Specify the number of queries and the query interval based on the actual requirements.
          invoke_result = wait_invoke_finished_get_out(invoke_id,10,5)
          for ins_id,res in invoke_result.items():
              logger.info("instance %s command execute finished,status: %s,output:%s" %(ins_id,res["status"],res["output"]))
      
          # Restart the instances.
          logger.warning("reboot instance Now")
          reboot_instances(ins_ids)
      
          time.sleep(5)
          # Wait for the instances to enter the Running state. Query the instance states 30 times at an interval of 10 seconds.
          reboot_result = wait_instance_reboot_ready(ins_ids,30,10)
          logger.warning("reboot instance Finished")
          for ins_id,res in reboot_result.items():
              logger.info("instance %s status: %s" %(ins_id,res["status"]))
      
      if __name__ == '__main__':
          run_task()
    4. Run the .py file.
      The following figure shows a sample result of running the .py file. In this example, a command is run on three instances to print helloworld, and the three instances are then restarted.
      [Figure: openapi-exec-reboot]
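
The sample code above restarts every instance regardless of how the command ended. If you prefer to restart only the instances on which the command succeeded, the result dictionary returned by get_invoke_result can be split by status first. The sketch below is written under assumptions: split_by_command_status is a hypothetical helper, and the status names treated as successful ("Success" and "Finished") should be checked against the DescribeInvocationResults reference before you rely on them.

```python
def split_by_command_status(invoke_result, ok_statuses=("Success", "Finished")):
    """Split instance IDs into (succeeded, not_succeeded) based on the reported status.

    invoke_result uses the shape produced by get_invoke_result in the sample code:
    {instance_id: {"status": ..., "output": ...}}.
    """
    succeeded, not_succeeded = [], []
    for instance_id, res in sorted(invoke_result.items()):
        if res.get("status") in ok_statuses:
            succeeded.append(instance_id)
        else:
            not_succeeded.append(instance_id)
    return succeeded, not_succeeded


# Hypothetical result used only for illustration.
sample_result = {
    "i-bp185fcs****": {"status": "Success", "output": "helloworld\n"},
    "i-bp14wwh****": {"status": "Failed", "output": ""},
    "i-bp13jbr****": {"status": "Success", "output": "helloworld\n"},
}

to_reboot, to_review = split_by_command_status(sample_result)

if __name__ == "__main__":
    print("reboot: %s" % to_reboot)
    print("review before rebooting: %s" % to_review)
```

In run_task, the to_reboot list could then be passed to reboot_instances in place of ins_ids, while the instances in to_review are inspected manually.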

Use OOS to batch run Cloud Assistant commands and restart instances

OOS is an automated O&M service provided by Alibaba Cloud. You can use OOS templates to customize and execute O&M tasks.

  1. Go to the Create Template page.
    1. Log on to the OOS console.
    2. In the left-side navigation pane, click My Templates.
    3. On the My Templates page, click Create Template.
  2. Configure parameters on the Create Template page.
    1. Enter a template name in the Template Name field. Example: runcommand_reboot_instances.
    2. Click the YAML tab and enter the following code:
      FormatVersion: OOS-2019-06-01
      Description:
        en: Runs a command on multiple ECS instances and then reboots the instances.
        name-en: ACS-ECS-BulkyRunCommandRboot
        categories:
          - run_command
      Parameters:
        regionId:
          Type: String
          Description:
            en: The id of region
          Label:
            en: Region
          AssociationProperty: RegionId
          Default: '{{ ACS::RegionId }}'
        targets:
          Type: Json
          Label:
            en: TargetInstance
          AssociationProperty: Targets
          AssociationPropertyMetadata:
            ResourceType: ALIYUN::ECS::Instance
            RegionId: regionId
        commandType:
          Description:
            en: The type of command
          Label:
            en: CommandType
          Type: String
          AllowedValues:
            - RunBatScript
            - RunPowerShellScript
            - RunShellScript
          Default: RunShellScript
        commandContent:
          Description:
            en: Command content to run in ECS instance
          Label:
            en: CommandContent
          Type: String
          MaxLength: 16384
          AssociationProperty: Code
          Default: echo hello
        workingDir:
          Description:
            en: 'The directory where the created command runs on the ECS instances. Linux instances: under the home directory of the administrator (root user): /root. Windows instances: under the directory where the process of the Cloud Assistant client is located, such as C:\Windows\System32.'
          Label:
            en: WorkingDir
          Type: String
          Default: ''
        timeout:
          Description:
            en: The value of the invocation timeout period of a command on ECS instances
          Label:
            en: Timeout
          Type: Number
          Default: 600
        enableParameter:
          Description:
            en: Whether to include secret parameters or custom parameters in the command
          Label:
            en: EnableParameter
          Type: Boolean
          Default: false
        username:
          Description:
            en: The username that is used to run the command on the ECS instance
          Label:
            en: Username
          Type: String
          Default: ''
        windowsPasswordName:
          Description:
            en: The name of the password used to run the command on a Windows instance
          Label:
            en: WindowsPasswordName
          Type: String
          Default: ''
          AssociationProperty: SecretParameterName
        rateControl:
          Description:
            en: Concurrency ratio of task execution
          Label:
            en: RateControl 
          Type: Json
          AssociationProperty: RateControl
          Default:
            Mode: Concurrency
            MaxErrors: 0
            Concurrency: 10
        OOSAssumeRole:
          Description:
            en: The RAM role to be assumed by OOS
          Label:
            en: OOSAssumeRole
          Type: String
          Default: OOSServiceRole
      RamRole: '{{ OOSAssumeRole }}'
      Tasks:
        - Name: getInstance
          Description:
            en: Views the ECS instances.
          Action: ACS::SelectTargets
          Properties:
            ResourceType: ALIYUN::ECS::Instance
            RegionId: '{{ regionId }}'
            Filters:
              - '{{ targets }}'
          Outputs:
            instanceIds:
              Type: List
              ValueSelector: Instances.Instance[].InstanceId
        - Name: runCommand
          Action: ACS::ECS::RunCommand
          Description:
            en: Execute cloud assistant command.
          Properties:
            regionId: '{{ regionId }}'
            commandContent: '{{ commandContent }}'
            instanceId: '{{ ACS::TaskLoopItem }}'
            commandType: '{{ commandType }}'
            workingDir: '{{ workingDir }}'
            timeout: '{{ timeout }}'
            enableParameter: '{{ enableParameter }}'
            username: '{{ username }}'
            windowsPasswordName: '{{ windowsPasswordName }}'
          Loop:
            RateControl: '{{ rateControl }}'
            Items: '{{ getInstance.instanceIds }}'
            Outputs:
              commandOutputs:
                AggregateType: Fn::ListJoin
                AggregateField: commandOutput
          Outputs:
            commandOutput:
              Type: String
              ValueSelector: invocationOutput
        - Name: rebootInstance
          Action: ACS::ECS::RebootInstance
          Description:
            en: Restarts the ECS instances.  
          Properties:
            regionId: '{{ regionId }}'
            instanceId: '{{ ACS::TaskLoopItem }}'
          Loop:
            RateControl: '{{ rateControl }}'
            Items: '{{ getInstance.instanceIds }}'
      Outputs:
        instanceIds:
          Type: List
          Value: '{{ getInstance.instanceIds }}'
    3. Click Create Template.
  3. Execute the template.
    1. Find the created template and click Create Execution in the Actions column.
    2. Configure the execution.
      Complete the execution configurations step by step as instructed. In the Parameter Settings step, select multiple instances and use the default values for other parameters.
      [Figure: exec-temp]
    3. In the OK step, click Create.
      After the execution is created, you are automatically directed to the Basic Information tab on the execution details page. If the execution succeeds, Success is displayed in the Execution Status section.
  4. View the execution result.
    You can click the Advanced View tab to view the execution process and result. The following figure shows that the operations specified in the template are performed.
    [Figure: exec-result]
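
If you later want to start the same template from a script instead of the console, the template parameters are typically supplied as one JSON string. The sketch below only builds that string and calls no SDK. The parameter names (regionId, targets, commandType, commandContent, timeout) come from the template above, but the helper name build_execution_parameters and the layout of the targets value (Type, ResourceIds, RegionId) are assumptions that should be verified against the OOS documentation.

```python
import json


def build_execution_parameters(region_id, instance_ids, command_content,
                               command_type="RunShellScript", timeout=600):
    """Build a JSON string with values for the parameters declared in the template."""
    parameters = {
        "regionId": region_id,
        # Assumed layout for selecting target instances by ID.
        "targets": {
            "Type": "ResourceIds",
            "ResourceIds": list(instance_ids),
            "RegionId": region_id,
        },
        "commandType": command_type,
        "commandContent": command_content,
        "timeout": timeout,
    }
    return json.dumps(parameters)


if __name__ == "__main__":
    # '<yourRegionId>' is a placeholder, as in the sample code earlier in this topic.
    print(build_execution_parameters(
        "<yourRegionId>",
        ["i-bp185fcs****", "i-bp14wwh****"],
        "echo hello",
    ))
```

The resulting string could then be passed as the parameters of an execution of runcommand_reboot_instances through the OOS API or an SDK. Check the OOS API reference for the exact request shape.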