CloudMonitor triggers Function Compute

Last Updated: Sep 16, 2019

CloudMonitor

CloudMonitor provides end-to-end, out-of-the-box, enterprise-class monitoring for cloud users. It can monitor IT infrastructure, external network quality, events, custom metrics, and service logs, and provides efficient, comprehensive, and cost-effective monitoring services.

CloudMonitor provides a cloud service event monitoring feature, and more events are continually being added to it. Events can trigger custom functions, so that the affected cloud resources are processed automatically.

Scenario

If an ECS instance is restarted due to system errors, you may need to verify the problem or create snapshots. In this example, you can trigger a function through CloudMonitor to automatically process the instance that is restarted due to system or instance errors. For example, snapshots are automatically created after the instance is restarted.

Preparation

ECS system events

Cloud service events monitoring

Function code

In the following code, a restart event of an ECS instance triggers the function. The function locates the cloud disks that are attached to the instance and creates a snapshot of each disk.

    # -*- coding: utf-8 -*-
    import json
    import logging
    import random
    import string

    from aliyunsdkcore import client
    from aliyunsdkcore.auth.credentials import StsTokenCredential
    from aliyunsdkecs.request.v20140526.CreateSnapshotRequest import CreateSnapshotRequest
    from aliyunsdkecs.request.v20140526.DescribeDisksRequest import DescribeDisksRequest

    LOGGER = logging.getLogger()
    clt = None  # ECS client, created on each invocation in handler()


    def handler(event, context):
        # Use the temporary STS credentials of the function's service role.
        creds = context.credentials
        sts_token_credential = StsTokenCredential(creds.access_key_id, creds.access_key_secret, creds.security_token)
        # Sample event:
        '''
        {
          "product": "ECS",
          "content": {
            "executeFinishTime": "2018-06-08T01:25:37Z",
            "executeStartTime": "2018-06-08T01:23:37Z",
            "ecsInstanceName": "timewarp",
            "eventId": "e-t4nhcpqcu8fqushpn3mm",
            "eventType": "InstanceFailure.Reboot",
            "ecsInstanceId": "i-bp18l0uopocfc98xxxx"
          },
          "resourceId": "acs:ecs:cn-hangzhou:12345678:instance/i-bp18l0uopocfc98xxxx",
          "level": "CRITICAL",
          "instanceName": "instanceName",
          "status": "Executing",
          "name": "Instance:SystemFailure.Reboot:Executing",
          "regionId": "cn-hangzhou"
        }
        '''
        evt = json.loads(event)
        content = evt.get("content")
        ecsInstanceId = content.get("ecsInstanceId")
        regionId = evt.get("regionId")
        global clt
        clt = client.AcsClient(region_id=regionId, credential=sts_token_credential)
        name = evt.get("name")
        name = name.lower()
        if name in ['Instance:SystemFailure.Reboot:Executing'.lower(), "Instance:InstanceFailure.Reboot:Executing".lower()]:
            # The restart is still in progress. Do other things here.
            pass
        if name in ['Instance:SystemFailure.Reboot:Executed'.lower(), "Instance:InstanceFailure.Reboot:Executed".lower()]:
            # The restart has finished: snapshot every disk attached to the instance.
            request = DescribeDisksRequest()
            # Query disks in the region reported by the event.
            request.add_query_param("RegionId", regionId)
            request.set_InstanceId(ecsInstanceId)
            response = _send_request(request)
            if response is None:
                return
            disks = response.get('Disks').get('Disk', [])
            for disk in disks:
                diskId = disk["DiskId"]
                SnapshotId = create_ecs_snap_by_id(diskId)
                if SnapshotId:
                    LOGGER.info("Create ecs snap success, ecs id = %s , disk id = %s ", ecsInstanceId, diskId)


    def create_ecs_snap_by_id(disk_id):
        # Create a snapshot named "reboot_" plus six random lowercase letters.
        LOGGER.info("Create ecs snap, disk id is %s ", disk_id)
        request = CreateSnapshotRequest()
        request.set_DiskId(disk_id)
        request.set_SnapshotName("reboot_" + ''.join(random.choice(string.ascii_lowercase) for _ in range(6)))
        response = _send_request(request)
        return response.get("SnapshotId") if response else None


    # Send an OpenAPI request. Returns the parsed JSON response, or None on failure.
    def _send_request(request):
        request.set_accept_format('json')
        try:
            response_str = clt.do_action_with_exception(request)
            LOGGER.info(response_str)
            response_detail = json.loads(response_str)
            return response_detail
        except Exception as e:
            LOGGER.error(e)
            return None
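
The event-matching logic of the handler can be tried locally without the Alibaba Cloud SDK. The following is a minimal sketch, assuming the payload has the shape of the sample event in the function code. `classify_event` is a hypothetical helper introduced here for illustration and is not part of any SDK.

```python
import json

# Event names taken from the function code, lowercased for comparison.
EXECUTING = {
    "instance:systemfailure.reboot:executing",
    "instance:instancefailure.reboot:executing",
}
EXECUTED = {
    "instance:systemfailure.reboot:executed",
    "instance:instancefailure.reboot:executed",
}


def classify_event(event):
    """Return (phase, instance_id, region_id) for a CloudMonitor ECS event.

    phase is "executing", "executed", or None for any other event name.
    """
    evt = json.loads(event)
    name = (evt.get("name") or "").lower()
    content = evt.get("content") or {}
    if name in EXECUTING:
        phase = "executing"
    elif name in EXECUTED:
        phase = "executed"
    else:
        phase = None
    return phase, content.get("ecsInstanceId"), evt.get("regionId")


# Placeholder values copied from the sample event.
sample = json.dumps({
    "product": "ECS",
    "content": {"ecsInstanceId": "i-bp18l0uopocfc98xxxx"},
    "name": "Instance:SystemFailure.Reboot:Executed",
    "regionId": "cn-hangzhou",
}).encode("utf-8")

print(classify_event(sample))
```

Returning None for unrecognized event names lets a caller ignore events that the function is not meant to handle.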

Procedure

Note: You must grant the operating permissions on ECS instances to the service role.
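
The function code calls the ECS DescribeDisks and CreateSnapshot operations, so the service role needs at least these two permissions. The following is a minimal sketch of such a policy document, built in Python so that it can be printed and pasted into the RAM console. The wildcard resource scope is an assumption made for brevity and may be narrowed to specific resources.

```python
import json

# Hypothetical minimal policy for the service role. The two actions
# correspond to the API requests made by the function code.
policy = {
    "Version": "1",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["ecs:DescribeDisks", "ecs:CreateSnapshot"],
            "Resource": "*",  # assumption: all ECS resources in the account
        }
    ],
}

print(json.dumps(policy, indent=2))
```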


  • Log on to the CloudMonitor console. In the left-side navigation pane, click Event Monitoring, and click the Alarm Rules tab. On the Alarm Rules tab, click Create Event Alert in the upper-right corner. Configure the parameters on the Create/Modify Event Alert page as shown in the following figure.

  • Simulate debugging.

  • Simulate an ECS event. For more information, see Simulating the process of handling system events.
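
For the simulated debugging step, a test payload in the shape of the sample event can be generated as follows. This is a minimal sketch: `build_test_event` is a hypothetical helper, and the instance ID, account ID, and region are placeholder values copied from the sample event in the function code.

```python
import json


def build_test_event(instance_id, region_id, status="Executed",
                     failure="SystemFailure"):
    """Build a JSON test payload shaped like the sample ECS reboot event."""
    return json.dumps({
        "product": "ECS",
        "content": {
            "eventType": failure + ".Reboot",
            "ecsInstanceId": instance_id,
        },
        # 12345678 is the placeholder account ID used in the sample event.
        "resourceId": "acs:ecs:%s:12345678:instance/%s" % (region_id, instance_id),
        "level": "CRITICAL",
        "status": status,
        "name": "Instance:%s.Reboot:%s" % (failure, status),
        "regionId": region_id,
    }, indent=2)


payload = build_test_event("i-bp18l0uopocfc98xxxx", "cn-hangzhou")
print(payload)
```

Passing status="Executing" produces the in-progress variant of the event, which the function code matches but does not act on.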