This topic describes how to monitor Edge Node Service (ENS) events in the CloudMonitor console. Event monitoring shows the status of your ENS resources and helps you quickly locate and analyze faults when they occur in your service.
All event-based alerts are triggered based on instances. If the same event affects three instances, three alerts are triggered.
ENS system events are managed by using CloudMonitor. For more information, see Overview, Alibaba Cloud service events, and Manage system event-triggered alert rules (old).
Events
Event name | Description | Event status |
Instance reboot | An instance is rebooted due to system errors. | Running and complete |
Instance creation | An instance is created. | Complete |
Instance deletion | An instance is released. | Complete |
Node network cutover | A routine network device maintenance task is performed for an edge node. In most cases, the task is performed at midnight. Network jitter or network interruptions may occur. | Planned, running, and complete |
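Node network | An unexpected network interruption occurs on an edge node. | Disconnected and recovered |
Node network usage level | The network usage level of an edge node is excessively high. | Exception and recovered |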
Event level | Description |
CRITICAL | Critical |
WARN | Warning |
INFO | Information |
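The event levels can be used to filter notifications in a handler. The following Python sketch is illustrative only: the severity ordering INFO < WARN < CRITICAL and the helper name `meets_minimum_level` are assumptions and are not defined by ENS or CloudMonitor.

```python
# Assumed severity ordering for the three documented levels (not stated by ENS).
LEVEL_RANK = {"INFO": 0, "WARN": 1, "CRITICAL": 2}


def meets_minimum_level(level: str, minimum: str = "WARN") -> bool:
    """Return True if `level` is at least as severe as `minimum`.

    Raises KeyError for a level that is not in the documented table.
    """
    return LEVEL_RANK[level] >= LEVEL_RANK[minimum]
```

With the default threshold, a handler built this way would act on WARN and CRITICAL events and skip INFO events.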
Event type | Description |
Executing | The event is in progress or has occurred. |
Executed | The event is complete or recovered. |
Scheduled | The event is planned. In most cases, notifications are sent in advance. |
Canceled | The event is canceled. |
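The event names listed in this topic, such as Instance:SystemFailure.Reboot:Executing, appear to consist of three colon-separated parts, with the last part matching an event type from the preceding table. The following Python sketch splits a name on that assumption. The function `parse_event_name` and the field names of its result are hypothetical.

```python
from typing import NamedTuple

# Event types taken from the event type table in this topic.
EVENT_TYPES = {"Executing", "Executed", "Scheduled", "Canceled"}


class ParsedEventName(NamedTuple):
    resource: str    # for example "Instance" or "EnsRegion"
    action: str      # for example "SystemFailure.Reboot"
    event_type: str  # for example "Executing"


def parse_event_name(name: str) -> ParsedEventName:
    """Split an event name of the assumed form 'Resource:Action:Type'."""
    parts = name.split(":")
    if len(parts) != 3:
        raise ValueError(f"unexpected event name format: {name!r}")
    resource, action, event_type = parts
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {event_type!r}")
    return ParsedEventName(resource, action, event_type)
```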
Instance reboot (running)
Item | Description |
Event description | An instance is being rebooted due to system errors. |
Event name | Instance:SystemFailure.Reboot:Executing |
Event level | CRITICAL |
Event type | Executing |
Status | Executing |
Notification data | After you call the required API operation, a JSON string is returned in the callback. |
The following sample code provides an example of the fields.
{
"reason":"nc_network_error",// The reason for the reboot.
"errorTime":"2019-04-17 20:20:50",// The time when the event occurred.
"resumeTime":"",// The time when the instance was recovered.
"internetIP":"117.34.xx.xx",// The IP address of the instance.
"instanceId":"i-5hl5z85eo4eyj96zbls8****",// The ID of the instance.
"level":"CRITICAL",// The level of the event.
"regionId":"cn-xian-telecom",// The ID of the node.
"regionName":"China Telecom (Xi'an)",// The name of the node.
"eventName":"Instance:SystemFailure.Reboot:Executing",// The name of the event.
"status":"Executing",// The status of the event.
"timestamp":1555503650000// The timestamp when the event was reported.
}
Instance reboot (complete)
Item | Description |
Event description | An instance has been rebooted due to system errors. |
Event name | Instance:SystemFailure.Reboot:Executed |
Event level | INFO |
Event type | Executed |
Status | Executed |
Notification data | After you call the required API operation, a JSON string is returned in the callback. |
The following sample code provides an example of the fields.
{
"reason":"nc_network_error",// The reason for the reboot.
"errorTime":"2019-04-17 20:20:50",// The time when the event occurred.
"resumeTime":"2019-04-17 20:22:49",// The time when the instance was recovered.
"internetIP":"117.34.xx.xx",// The IP address of the instance.
"instanceId":"i-5hl5z85eo4eyj96zbls****",// The ID of the instance.
"level":"INFO",// The level of the event.
"regionId":"cn-xian-telecom",// The ID of the node.
"regionName":"China Telecom (Xi'an)",// The name of the node.
"eventName":"Instance:SystemFailure.Reboot:Executed",// The name of the event.
"status":"Executed",// The status of the event.
"timestamp":1555503650000// The timestamp when the event was reported.
}
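The errorTime and resumeTime fields make it possible to work out how long a reboot took. The following Python sketch assumes that the callback body is plain JSON without the explanatory // comments shown in the samples, and that both fields use the YYYY-MM-DD HH:MM:SS format seen there. The function name `outage_duration` is hypothetical.

```python
import json
from datetime import datetime, timedelta
from typing import Optional

# Time format inferred from the sample notification data.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def outage_duration(payload: str) -> Optional[timedelta]:
    """Return resumeTime minus errorTime, or None if resumeTime is empty."""
    event = json.loads(payload)
    resume_time = event.get("resumeTime", "")
    if not resume_time:
        # In the Executing sample, resumeTime is an empty string.
        return None
    started = datetime.strptime(event["errorTime"], TIME_FORMAT)
    resumed = datetime.strptime(resume_time, TIME_FORMAT)
    return resumed - started
```

For the values in the completed-reboot sample (20:20:50 to 20:22:49), the function returns a duration of 119 seconds.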
Instance creation (complete)
Item | Description |
Event description | An instance is created. |
Event name | EnsInstance:Create:Executed |
Event level | CRITICAL |
Event type | Executed |
Status | Executed |
Notification data | After you call the required API operation, a JSON string is returned in the callback. |
The following sample code provides an example of the fields.
{
"regionId": "cn-xian-telecom", // The ID of the node.
"level": "CRITICAL", // The level of the event.
"instances": // The list of instances that are created on the same node within a short period of time.
[
{
"instanceId": "i-5it52o4t259piz1u6ef****", // The ID of the instance.
"internetIp": [ "117.27.xx.xx" ], // The IP address of the instance.
"operateTime": "2020-04-08 20:06:35" // The time when the instance was created.
}
],
"regionName": "China Telecom (Xi'an)", // The name of the node.
"eventName": "EnsInstance:Create:Executed", // The name of the event.
"status": "Executed", // The status of the event.
"timestamp": 1586347660000 // The timestamp, in milliseconds.
}
Instance deletion (complete)
Item | Description |
Event description | An instance is deleted. |
Event name | EnsInstance:Delete:Executed |
Event level | CRITICAL |
Event type | Executed |
Status | Executed |
Notification data | After you call the required API operation, a JSON string is returned in the callback. |
The following sample code provides an example of the fields.
{
"regionId": "cn-xian-telecom", // The ID of the node.
"level": "CRITICAL", // The level of the event.
"instances": // The list of instances that are deleted from the same node within a short period of time.
[
{
"instanceId": "i-5it52o4t259piz1u6ef5****", // The ID of the instance.
"internetIp": [ "117.27.xx.xx" ], // The IP address of the instance.
"operateTime": "2020-04-08 20:06:35" // The time when the instance was deleted.
}
],
"regionName": "China Telecom (Xi'an)", // The name of the node.
"eventName": "EnsInstance:Delete:Executed", // The name of the event.
"status": "Executed", // The status of the event.
"timestamp": 1586347660000 // The timestamp, in milliseconds.
}
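Unlike the reboot events, the instance creation and deletion notifications carry a list of instances, and each instance carries a list of IP addresses. The following Python sketch flattens such a notification into one row per instance. It assumes that the callback body is plain JSON without the // comments shown in the samples. The function name `list_instance_operations` is hypothetical.

```python
import json
from typing import List, Tuple


def list_instance_operations(payload: str) -> List[Tuple[str, str, str, str]]:
    """Return (action, instanceId, IP addresses, operateTime) per instance."""
    event = json.loads(payload)
    # The middle part of the event name is "Create" or "Delete".
    action = event["eventName"].split(":")[1]
    rows = []
    for instance in event.get("instances", []):
        # internetIp is a list in these events, so join it for display.
        ips = ", ".join(instance.get("internetIp", []))
        rows.append((action, instance["instanceId"], ips, instance["operateTime"]))
    return rows
```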
Node network cutover (planned)
In most cases, notifications about a planned network cutover are sent more than 24 hours in advance. In an emergency, notifications may be sent less than 24 hours in advance.
Item | Description |
Event description | A network cutover is planned for an edge node. |
Event name | EnsRegion:NetworkMigration:Scheduled |
Event level | WARN |
Event type | Scheduled |
Status | Scheduled |
Notification data | After you call the required API operation, a JSON string is returned in the callback. |
The following sample code provides an example of the fields.
{
"networkMigrationEventId":-50,// The ID of the network cutover event. If three instances are affected, the same event ID applies to the instances.
"instanceId":"i-5hlabsavg39f5hlnkk2f****",// The ID of the affected instance.
"internetIp":"117.34.xx.xx",// The IP address of the affected instance.
"regionId":"cn-xian-telecom",// The ID of the node.
"level":"WARN",// The level of the event.
"regionName":"China Telecom (Xi'an)",// The name of the node.
"startTime":1555588800000,// The time when the cutover is planned to start.
"endTime":1555592400000,// The time when the cutover is planned to end.
"aliUid":"108131418885****",// The ID of the user.
"event":"EnsRegion:NetworkMigration:Scheduled",// The name of the event.
"status":"Scheduled"// The status of the event.
}
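The startTime and endTime values in the sample are 13-digit numbers, which suggests UNIX timestamps in milliseconds. Under that assumption, the following Python sketch converts the planned window to UTC datetimes and checks whether a notification arrived less than 24 hours before the planned start. The function names `cutover_window` and `is_short_notice` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def cutover_window(event: dict) -> Tuple[datetime, datetime, timedelta]:
    """Return (start, end, duration), assuming millisecond UNIX timestamps."""
    start = datetime.fromtimestamp(event["startTime"] / 1000, tz=timezone.utc)
    end = datetime.fromtimestamp(event["endTime"] / 1000, tz=timezone.utc)
    return start, end, end - start


def is_short_notice(event: dict, received_at: datetime) -> bool:
    """Return True if the notice arrived less than 24 hours before the start.

    `received_at` must be timezone-aware.
    """
    start, _, _ = cutover_window(event)
    return start - received_at < timedelta(hours=24)
```

For the sample values, the window starts at 2019-04-18 12:00:00 UTC and lasts one hour.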
Node network cutover (running)
Notifications are sent when a network cutover starts. In most cases, the event is reported 0 to 5 minutes before the cutover starts.
Item | Description |
Event description | A network cutover for an edge node starts. |
Event name | EnsRegion:NetworkMigration:Executing |
Event level | CRITICAL |
Event type | Executing |
Status | Executing |
Notification data | After you call the required API operation, a JSON string is returned in the callback. |
The following sample code provides an example of the fields.
{
"networkMigrationEventId":-50,// The ID of the network cutover event. If three instances are affected, the same event ID applies to the instances.
"instanceId":"i-5hlabsavg39f5hlnkk2f****",// The ID of the affected instance.
"internetIp":"117.34.xx.xx",// The IP address of the affected instance.
"regionId":"cn-xian-telecom",// The ID of the node.
"level":"CRITICAL",// The level of the event.
"regionName":"China Telecom (Xi'an)",// The name of the node.
"startTime":1555588800000,// The time when the cutover is planned to start.
"endTime":1555592400000,// The time when the cutover is planned to end.
"aliUid":"108131418885****",// The ID of the user.
"event":"EnsRegion:NetworkMigration:Executing",// The name of the event.
"status":"Executing"// The status of the event.
}
Node network cutover (complete)
This event is triggered when a network cutover is complete. The Internet service provider (ISP) does not send a notification when the cutover finishes, so use the planned end time of the cutover as the end time.
Item | Description |
Event description | A network cutover for an edge node is complete. |
Event name | EnsRegion:NetworkMigration:Executed |
Event level | INFO |
Event type | Executed |
Status | Executed |
Notification data | After you call the required API operation, a JSON string is returned in the callback. |
The following sample code provides an example of the fields.
{
"networkMigrationEventId":-50,// The ID of the network cutover event. If three instances are affected, the same event ID applies to the instances.
"instanceId":"i-5hlabsavg39f5hlnkk2****",// The ID of the affected instance.
"internetIp":"117.34.xx.xx",// The IP address of the affected instance.
"regionId":"cn-xian-telecom",// The ID of the node.
"level":"INFO",// The level of the event.
"regionName":"China Telecom (Xi'an)",// The name of the node.
"startTime":1555588800000,// The time when the cutover is planned to start.
"endTime":1555592400000,// The time when the cutover is planned to end.
"aliUid":"108131418885****",// The ID of the user.
"event":"EnsRegion:NetworkMigration:Executed",// The name of the event.
"status":"Executed"// The status of the event.
}
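Because alerts are triggered per instance, one cutover that affects several instances produces several notifications with the same networkMigrationEventId. The following Python sketch groups already-parsed notifications by that ID so that a cutover can be handled once. The function name `group_by_cutover` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_cutover(events: Iterable[dict]) -> Dict[int, List[str]]:
    """Map each networkMigrationEventId to the affected instance IDs."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for event in events:
        groups[event["networkMigrationEventId"]].append(event["instanceId"])
    return dict(groups)
```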
Node network (disconnected)
When the network detection program of the ENS system finds that a node is disconnected from the network, the node network disconnection event is triggered.
Item | Description |
Event description | An edge node is disconnected from the network. |
Event name | EnsRegion:NetworkDown:Executing |
Event level | CRITICAL |
Event type | Executing |
Status | Executing |
Notification data | After you call the required API operation, a JSON string is returned in the callback. |
The following sample code provides an example of the fields.
{
"reason":"rg_network_down", // The reason for the node network disconnection event. The value is unique.
"errorTime":"2019-04-19 16:48:12",// The time when the event occurred.
"resumeTime":"",// The time when the network was recovered.
"internetIP":"117.34.xx.xx",// The IP address of the instance.
"instanceId":"i-5hlabsavg39f5hlnk****",// The ID of the instance.
"level":"CRITICAL",// The level of the event.
"regionId":"cn-xian-telecom",// The ID of the node.
"regionName":"China Telecom (Xi'an)",// The name of the node.
"eventName":"EnsRegion:NetworkDown:Executing",// The name of the event.
"status":"Executing",// The status of the event.
"timestamp":1555663692000// The timestamp when the event was reported, in milliseconds.
}
Node network (recovered)
When the network detection program of the ENS system detects that one or more instances on a disconnected node have recovered, the node network recovery event is triggered.
Item | Description |
Event description | The network of an edge node is recovered. |
Event name | EnsRegion:NetworkDown:Executed |
Event level | INFO |
Event type | Executed |
Status | Executed |
Notification data | After you call the required API operation, a JSON string is returned in the callback. |
The following sample code provides an example of the fields.
{
"reason":"rg_network_down",// The reason for the node network disconnection event. The value is unique.
"errorTime":"2019-04-19 16:48:12",// The time when the event occurred.
"resumeTime":"2019-04-19 16:52:01",// The time when the network was recovered.
"internetIP":"117.34.xx.xx",// The IP address of the instance.
"instanceId":"i-5hlabsavg39f5hlnkk2f****",// The ID of the instance.
"level":"INFO",// The level of the event.
"regionId":"cn-xian-telecom",// The ID of the node.
"regionName":"China Telecom (Xi'an)",// The name of the node.
"eventName":"EnsRegion:NetworkDown:Executed",// The name of the event.
"status":"Executed",// The status of the event.
"timestamp":1555663921000// The timestamp when the event was reported, in milliseconds.
}
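A disconnection is reported with the Executing status and its recovery with the Executed status, so the two notifications can be paired to see which instances are still disconnected. The following Python sketch does this for a list of already-parsed notifications. Keying on regionId and instanceId is an assumption, and the function name `open_outages` is hypothetical.

```python
from typing import Dict, Iterable, Tuple


def open_outages(events: Iterable[dict]) -> Dict[Tuple[str, str], str]:
    """Return {(regionId, instanceId): errorTime} for unrecovered instances."""
    still_down: Dict[Tuple[str, str], str] = {}
    # Process notifications in the order in which they were reported.
    for event in sorted(events, key=lambda e: e["timestamp"]):
        key = (event["regionId"], event["instanceId"])
        if event["status"] == "Executing":
            still_down[key] = event["errorTime"]
        elif event["status"] == "Executed":
            # A recovery closes the matching disconnection, if one is open.
            still_down.pop(key, None)
    return still_down
```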
Node network usage level (exception)
When the network usage level of an edge node is excessively high, the usage level exception event is triggered.
Item | Description |
Event description | The network usage level of an edge node is abnormal. |
Event name | EnsRegion:NetworkWaterLevel:Executing |
Event level | WARN |
Event type | Executing |
Status | Executing |
Notification data | After you call the required API operation, a JSON string is returned in the callback. |
The following sample code provides an example of the fields.
{
"reason":"The network usage level of the node is excessively high.",
"level":"WARN", // The level of the event.
"instances":[
{
"instanceId":"i-xxxxxxxxxxxxxxxxxxxxxxxxx",
"instanceIp":"14.xx.xx.xx"
},
{
"instanceId":"i-xxxxxxxxxxxxxxxxxxxxxxxxx",
"instanceIp":"14.xx.xx.x"
}
], // The list of affected instances.
"regionName":"China Unicom (Kunming)",
"networkWaterLevelEventId":12345,
"regionId":"cn-kunming-unicom",
"startTimeFmt":"2020-07-13 15:30:00",
"eventName":"EnsRegion:NetworkWaterLevel:Executing",
"startTime":1594625400, // The time when the issue started.
"endTime":0, // The time when the issue ended. For the EnsRegion:NetworkWaterLevel:Executing event, the value is 0.
"endTimeFmt":"1970-01-01 08:00:00",
"timestamp":1594625489000,
"status":"Executing"
}
Node network usage level (recovered)
When the network usage level of an edge node returns to normal, the usage level recovery event is triggered.
Item | Description |
Event description | The network usage level of an edge node is recovered. |
Event name | EnsRegion:NetworkWaterLevel:Executed |
Event level | WARN |
Event type | Executed |
Status | Executed |
Notification data | After you call the required API operation, a JSON string is returned in the callback. |
The following sample code provides an example of the fields.
{
"reason":"The network usage level of the node is excessively high.",
"level":"WARN", // The level of the event.
"instances":[
{
"instanceId":"i-xxxxxxxxxxxxxxxxxxxxxxxxx",
"instanceIp":"14.xx.xx.xx"
},
{
"instanceId":"i-xxxxxxxxxxxxxxxxxxxxxxxxx",
"instanceIp":"14.xx.xx.x"
}
], // The list of affected instances.
"regionName":"China Unicom (Kunming)",
"networkWaterLevelEventId":12345,
"regionId":"cn-kunming-unicom",
"startTimeFmt":"2020-07-13 15:30:00",
"eventName":"EnsRegion:NetworkWaterLevel:Executed",
"startTime":1594625400, // The time when the issue started.
"endTime":1594625700, // The time when the issue ended.
"endTimeFmt":"2020-07-13 15:35:00",
"timestamp":1594625648000,
"status":"Executed"
}
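In the usage level samples, startTime and endTime are 10-digit values while timestamp is a 13-digit value, which suggests that the first two are in seconds and the last is in milliseconds. The following Python sketch converts the interval under that assumption and treats an endTime of 0 as an issue that is still in progress. The function name `usage_level_interval` is hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple


def usage_level_interval(event: dict) -> Tuple[datetime, Optional[datetime]]:
    """Return (start, end) in UTC; end is None while endTime is 0."""
    # Assumption: startTime and endTime are UNIX timestamps in seconds.
    start = datetime.fromtimestamp(event["startTime"], tz=timezone.utc)
    end_raw = event["endTime"]
    end = datetime.fromtimestamp(end_raw, tz=timezone.utc) if end_raw else None
    return start, end
```

The sample startTime of 1594625400 converts to 2020-07-13 07:30:00 UTC, which matches the startTimeFmt value of 2020-07-13 15:30:00 if that string is in UTC+8.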
Query events
Log on to the CloudMonitor console.
In the left-side navigation pane, choose Event Center > System Event. The System Event page appears.
On the Event Monitoring tab, select Edge Node Service from the product drop-down list, select an event from the event drop-down list, specify the time range, and then click Search.
In the event list, find the event that you want to query and click Details in the Actions column to view the details of the event.