The Overview page provides monitoring information about Key Management Service (KMS) instances. The information includes instance specifications, status, and metrics. You can also configure alert rules to monitor the metrics. This topic describes how to view the monitoring information about a KMS instance and configure CloudMonitor alerts.
Introduction
If the message "The version of the instance is outdated. To view all metrics, submit a ticket to confirm the upgrade window." is displayed on the Overview page, contact technical support to upgrade the instance.
KMS is integrated with CloudMonitor. On the Overview page, you can view the trend charts of metrics. For more information, see What is CloudMonitor?
You can configure CloudMonitor alerts based on your business requirements to identify and resolve issues in advance. Common alert rule settings:
The average number of requests per second reaches 80% of the computing performance of the instance. For more information about performance data, see Performance quotas.
For example, you purchase a KMS instance of the software key management type whose computing performance is 1,000 queries per second (QPS). You can configure an alert rule to trigger an alert when the total number of requests per minute for the instance reaches 48,000 (1,000 QPS × 60 seconds × 80%) for 3 consecutive minutes. The alert indicates that the average request rate has reached 80% of the instance performance. In this case, we recommend that you upgrade the instance to enhance the performance.
HTTP status code 4XX or HTTP status code 5XX is returned for three consecutive cycles.
HTTP status code 4XX indicates that the request is invalid or the specified resource does not exist. You can troubleshoot this error based on the error message. HTTP status code 5XX indicates that the service is unavailable. You can try again later or contact technical support.
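The two common alert conditions above can be sketched in a few lines of Python. This is only an illustration of the arithmetic and of the "consecutive cycles" check, not CloudMonitor's own evaluation logic. The function names and the sample values are hypothetical, and the sketch assumes that "reaches" means greater than or equal to the threshold.

```python
from typing import Sequence


def per_minute_threshold(qps_quota: int, ratio: float = 0.8) -> int:
    """Requests per minute that correspond to `ratio` of the purchased QPS."""
    return round(qps_quota * 60 * ratio)


def breached_consecutively(samples: Sequence[float], threshold: float,
                           cycles: int = 3) -> bool:
    """Return True if each of the last `cycles` samples reaches the threshold."""
    if len(samples) < cycles:
        return False
    return all(value >= threshold for value in samples[-cycles:])


# Request-rate rule: 1,000 QPS x 60 seconds x 80% = 48,000 requests per minute.
threshold = per_minute_threshold(1000)

# Hypothetical per-minute request totals, oldest first.
requests_per_minute = [41000, 48500, 49200, 50100]
print(threshold, breached_consecutively(requests_per_minute, threshold))

# Status-code rule: at least one 5XX response in each of the last three cycles.
errors_5xx_per_minute = [0, 2, 1, 4]
print(breached_consecutively(errors_5xx_per_minute, 1))
```

In this sketch, a single minute below the threshold within the last three cycles is enough to keep the condition from being met.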
Prerequisites
The AliyunCloudMonitorReadOnlyAccess permission is granted to the Resource Access Management (RAM) user that you use. To grant the permission, log on to the RAM console. For more information, see Grant permissions to a RAM user.
View the overview and monitoring data of a KMS instance
Log on to the KMS console. In the top navigation bar, select the required region. In the left-side navigation pane, click Overview.
Select the ID of the instance that you want to view from the Instance ID drop-down list and view the overview and monitoring data of the KMS instance.
Note: You can view metric data from the previous 30 days.
Optional. Turn on Auto Refresh. If you turn on the switch, KMS automatically refreshes the monitoring data every minute.
Configure CloudMonitor alerts
Log on to the KMS console. In the top navigation bar, select the required region. In the left-side navigation pane, click Overview.
On the Overview page, click Configure Alert Rules to go to the CloudMonitor console.
Create an alert contact and an alert contact group. For more information, see Create an alert contact or alert contact group.
Create one or more alert rules.
In the left-side navigation pane, go to the Alert Rules page. On the Alert Rules page, click Create Alert Rule. In the Create Alert Rule panel, configure the parameters and click Confirm.
Parameter
Description
Product
The service for which you want to create the alert rule. Select Key Management Service.
Resource Range
The range of the resources to which the alert rule applies. Valid values:
All Resources: The alert rule applies to all resources of the specified cloud service.
Application Groups: The alert rule applies to all resources in the specified application group of the specified cloud service.
Instances: The alert rule applies to the specified resources of the specified cloud service.
Rule Description
The content of the alert rule. The parameters in this section specify the conditions that trigger an alert. To specify the rule description, perform the following steps:
Click Add Rule.
In the Config Rule Description panel, enter a rule name in Alert Rule and then set rule conditions.
Single Metric: Select a metric and then set the threshold and alert level.
Multiple Metrics: Select an alert level and then set alert conditions for two or more metrics.
Dynamic Threshold: For more information about dynamic thresholds, see Overview of dynamic threshold-based alert rules and Create dynamic threshold-based alert rules.
Note: The dynamic threshold feature is in invitational preview. To use the feature, submit a ticket.
You can create dynamic threshold-based alert rules only if you set Resource Range to Instances.
Click Confirm.
Note: For information about how to specify complex alert conditions, see Alert rule expressions.
Mute For
The interval at which CloudMonitor resends alert notifications before the alert is cleared. Valid values: 5 Minutes, 15 Minutes, 30 Minutes, 60 Minutes, 3 Hours, 6 Hours, 12 Hours, and 24 Hours.
If a metric value reaches the threshold, CloudMonitor sends an alert notification. If the metric value reaches the threshold again within the mute period, CloudMonitor does not resend an alert notification. If the alert is not cleared after the mute period ends, CloudMonitor resends an alert notification.
For example, if the Mute For parameter is set to 12 Hours and the alert is not cleared, CloudMonitor resends an alert notification after 12 hours.
Effective Period
The period during which the alert rule is effective. CloudMonitor sends alert notifications based on the alert rule only within the effective period.
Note: If an alert rule is not effective, no alert notification is sent. However, the alert history is still displayed on the Alert History page.
Alert Contact Group
The alert contact group to which alert notifications are sent.
Tag
The tags of the alert rule. A tag consists of a key and a value.
Note: You can set a maximum of six tags.
Alert Callback
The callback URL that can be accessed over the Internet. CloudMonitor sends HTTP POST requests to push alert notifications to the specified URL. You can enter only an HTTP URL. For more information about how to configure alert callback, see Use the alert callback feature to send notifications about threshold-triggered alerts.
To test the connectivity of an alert callback URL, perform the following steps:
Click Test next to the callback URL.
In the Webhook Test panel, you can check and troubleshoot the connectivity of the alert callback URL based on the returned status code and test result details.
Note: To obtain the details of the test result, configure the Test Template Type and Language parameters and click Test.
Click Close.
Note: You can click Advanced Settings to configure this parameter.
Auto Scaling
If you turn on Auto Scaling, the specified scaling rule is enabled when an alert is triggered. In this case, you must configure the Region, ESS Group, and ESS Rule parameters.
For information about how to create a scaling group, see Manage scaling groups.
For information about how to create a scaling rule, see Manage scaling rules.
Note: You can click Advanced Settings to configure this parameter.
Log Service
If you turn on Log Service, the alert information is sent to the specified Logstore when an alert is triggered. In this case, you must configure the Region, ProjectName, and Logstore parameters.
For information about how to create a project and a Logstore, see Getting Started.
Note: You can click Advanced Settings to configure this parameter.
Message Service - Topic
If you turn on Message Service - Topic, the alert information is sent to the specified topic in Message Service (MNS) when an alert is triggered. In this case, you must configure the Region and topicName parameters.
For information about how to create a topic, see Create a topic.
Note: You can click Advanced Settings to configure this parameter.
Function Compute
If you turn on Function Compute, an alert notification is sent to Function Compute for format processing when an alert is triggered. In this case, you must configure the Region, Service, and Function parameters.
For more information about how to create a service and a function, see Create a function in the Function Compute console.
Note: You can click Advanced Settings to configure this parameter.
Method to handle alerts when no monitoring data is found
The method that is used to handle alerts when no monitoring data is found. Valid values:
Do not do anything (default)
Send alert notifications
Treated as normal
Note: You can click Advanced Settings to configure this parameter.
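The Mute For behavior described in the parameter list above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not CloudMonitor's implementation: the function name is hypothetical, time is modeled in whole minutes, and the alert is assumed to remain uncleared for every minute listed.

```python
from typing import List, Optional


def notification_minutes(breach_minutes: List[int], mute_minutes: int) -> List[int]:
    """Return the minutes at which an alert notification would be sent.

    breach_minutes: sorted minutes at which the metric reaches the threshold
    while the alert has not been cleared.
    mute_minutes: the Mute For period, in minutes.
    """
    sent: List[int] = []
    last_sent: Optional[int] = None
    for minute in breach_minutes:
        # Send the first notification, then resend only after the mute period ends.
        if last_sent is None or minute - last_sent >= mute_minutes:
            sent.append(minute)
            last_sent = minute
    return sent


# The metric stays at or above the threshold for 25 hours; Mute For is 12 Hours.
breaches = list(range(25 * 60))
print(notification_minutes(breaches, 12 * 60))  # [0, 720, 1440]
```

With a 12-hour mute period, the sketch sends a notification at the first breach and then once every 12 hours while the alert remains uncleared. Breaches inside the mute period produce no additional notifications.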
Supported CloudMonitor metrics
Metric | Description | Alerting Supported (Yes/No) | Aggregation Dimension | Statistical Method |
request_total_1m | The total number of requests per minute. | Yes | userId, regionId, and instanceId | Value |
request_symmetric_1m | The number of encryption and decryption requests per minute that use symmetric keys. | Yes | userId, regionId, and instanceId | Value |
request_asymmetric_encrypt_1m | The number of encryption requests per minute that use asymmetric keys. | Yes | userId, regionId, and instanceId | Value |
request_asymmetric_decrypt_1m | The number of decryption requests per minute that use asymmetric keys. | Yes | userId, regionId, and instanceId | Value |
request_asymmetric_sign_1m | The number of signing requests per minute that use asymmetric keys. | Yes | userId, regionId, and instanceId | Value |
request_asymmetric_verify_1m | The number of signature verification requests per minute that use asymmetric keys. | Yes | userId, regionId, and instanceId | Value |
request_secret_1m | The number of secret requests per minute. | Yes | userId, regionId, and instanceId | Value |
request_other_1m | The number of requests for other operations per minute. | Yes | userId, regionId, and instanceId | Value |
code_5xx_1m | The number of requests for which HTTP status code 5XX is returned per minute. | Yes | userId, regionId, and instanceId | Value |
code_4xx_1m | The number of requests for which HTTP status code 4XX is returned per minute. | Yes | userId, regionId, and instanceId | Value |
latency_1m | The average latency of all requests per minute. | Yes | userId, regionId, and instanceId | Value |
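Because request_total_1m is a per-minute total while instance performance is expressed in QPS, it can help to convert between the two when reading the trend charts. The following is a minimal sketch of that conversion. The function names are hypothetical, and the 1,000 QPS figure is taken from the example earlier in this topic.

```python
def average_qps(request_total_1m: float) -> float:
    """Average requests per second over one minute of request_total_1m."""
    return request_total_1m / 60


def performance_utilization(request_total_1m: float, qps_quota: float) -> float:
    """Fraction of the purchased QPS that the average request rate represents."""
    return average_qps(request_total_1m) / qps_quota


# 48,000 requests in one minute on an instance with 1,000 QPS.
print(average_qps(48000))                     # 800.0
print(performance_utilization(48000, 1000))   # 0.8
```

This is an average over the minute, so short bursts above the purchased QPS can be hidden inside a value that looks acceptable.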