ApsaraMQ for Kafka emits system events to track instance changes such as upgrades, configuration updates, status transitions, and threat alerts. CloudMonitor captures these events and sends alert notifications when a configured alert rule is triggered.
Prerequisites
Before you begin, make sure that you have:
An ApsaraMQ for Kafka instance
Event notifications configured for ApsaraMQ for Kafka. For details, see Manage event subscriptions (Recommended)
Supported events
ApsaraMQ for Kafka supports six system events, grouped by severity.
Warning events
| Event name | Description | Event type | Status |
|---|---|---|---|
| Instance:Risk | Instance threat alert | Exception | Exception |
| Instance:Upgrade:Version:Notify | Instance version upgrade notification | Notification | Normal |
Informational events
| Event name | Description | Event type | Status |
|---|---|---|---|
| Instance:Mutation | Instance upgrade or downgrade | Notification | Normal |
| Instance:State:Change | Instance status change notification | StatusNotification | Normal |
| Instance:Upgrade:Config | Service configuration upgrade | Notification | Normal |
| Instance:Upgrade:Version | Version upgrade | Notification | Normal |
Notification format
After you configure event notifications, each event is delivered as a JSON payload through your specified notification channel.
If your notification channel transforms the format, the actual output may differ from the examples below.
The following example shows a notification for the Instance:Upgrade:Version:Notify event:
{
"ver": "1.0",
"id": "8582C4CB-1869-41AA-8091-73********C3",
"requestId": "41F12ADB-370B-4D7E-B74D-**********1D",
"eventTime": "1742975149000",
"product": "kafka",
"resourceId": "alikafka_post-cn-xxxx",
"level": "WARN",
"instanceName": "alikafka_post-cn-xxxx",
"regionId": "cn-hangzhou",
"name": "Instance:Upgrade:Version:Notify",
"userId": "169070********30",
"status": "Normal",
"content": {
"version": "v2"
}
}
Top-level fields
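Event notifications contain the following top-level fields. The exact set varies by event: several example payloads under Event details carry time and groupId in place of eventTime, requestId, and userId.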
| Field | Type | Description | Example |
|---|---|---|---|
| ver | String | Schema version. | 1.0 |
| id | String | Unique event ID. | 8582C4CB-1869-41AA-8091-73********C3 |
| requestId | String | Request ID. | 41F12ADB-370B-4D7E-B74D-**********1D |
| eventTime | String | Timestamp in epoch milliseconds when the event occurred. | 1742975149000 |
| product | String | Product identifier. Always kafka for ApsaraMQ for Kafka. | kafka |
| resourceId | String | Resource ID of the affected instance. | alikafka_post-cn-xxxx |
| level | String | Event severity. Valid values: INFO (Information), WARN (Warning), CRITICAL (Critical). | WARN |
| instanceName | String | Instance name. | alikafka_post-cn-xxxx |
| regionId | String | Alibaba Cloud region ID. | cn-hangzhou |
| name | String | Event name, matching the values in the Supported events tables. | Instance:Upgrade:Version:Notify |
| userId | String | Alibaba Cloud account ID. | 169070********30 |
| status | String | Event status. | Normal |
| content | Object | Event-specific details. Structure varies by event type. See Event details. | -- |
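As a minimal sketch of reading these fields in Python, the snippet below parses a payload, trims whitespace from level, and converts the millisecond timestamp to a UTC datetime. The function name parse_event, the added occurredAt key, and the fallback from eventTime to time are illustrative assumptions, not part of any ApsaraMQ for Kafka or CloudMonitor SDK.

```python
import json
from datetime import datetime, timezone


def parse_event(raw: str) -> dict:
    """Parse a notification and normalize a few top-level fields."""
    event = json.loads(raw)
    # Defensive: trim any whitespace around the severity value.
    event["level"] = str(event.get("level", "")).strip()
    # The timestamp is epoch milliseconds and may be a string or a number.
    # Assumption: payloads without "eventTime" carry "time" instead.
    millis = event.get("eventTime", event.get("time"))
    if millis is not None:
        event["occurredAt"] = datetime.fromtimestamp(
            int(millis) / 1000, tz=timezone.utc
        )
    return event


sample = json.dumps(
    {
        "eventTime": "1742975149000",
        "level": " WARN",
        "name": "Instance:Upgrade:Version:Notify",
    }
)
parsed = parse_event(sample)
print(parsed["level"], parsed["occurredAt"].isoformat())
```

Converting to a timezone-aware datetime up front avoids ambiguity when the consumer runs in a different time zone from the instance.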
Event details
Each event type includes a content object with event-specific fields. The following sections describe each event, its payload, and field definitions.
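Because the shape of content depends on the event, a consumer typically branches on the top-level name field. The following Python sketch shows one way to do that with a lookup table. The handler functions, the HANDLERS table, and dispatch are hypothetical names chosen for illustration, and only two events are wired up.

```python
from typing import Callable, Dict

Handler = Callable[[dict], str]


def on_risk(event: dict) -> str:
    """Summarize an Instance:Risk event by counting riskData items."""
    risks = event["content"].get("riskData", [])
    return f"{len(risks)} threat(s) reported"


def on_state_change(event: dict) -> str:
    """For Instance:State:Change, content.data holds the new status."""
    return f"status is now {event['content']['data']}"


# Hypothetical routing table keyed by the top-level "name" field.
HANDLERS: Dict[str, Handler] = {
    "Instance:Risk": on_risk,
    "Instance:State:Change": on_state_change,
}


def dispatch(event: dict) -> str:
    """Route an event to its handler, or report that none is registered."""
    handler = HANDLERS.get(event.get("name", ""))
    if handler is None:
        return f"no handler for {event.get('name')!r}"
    return handler(event)


print(dispatch({"name": "Instance:State:Change", "content": {"data": "Stopped"}}))
```

Returning a message for unknown names, rather than raising, keeps the consumer running if a new event type appears.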
Instance:Mutation
Fires when an instance is upgraded or downgraded. The payload shows which properties changed and their old and new values.
Event level: INFO
Event type: Notification
Example payload:
{
"ver": "1.0",
"status": "Normal",
"instanceName": "alikafka_post-cn-xxxx",
"resourceId": "acs:alikafka:eu-west-1:175923598xxxxxxx:instance/alikafka_post-cn-xxxx",
"content": {
"data": {
"newPropertyValues": {
"ioMaxSpec": "alikafka.hw.3xlarge"
},
"oldPropertyValues": {
"ioMaxSpec": "alikafka.hw.2xlarge"
}
},
"eventType": "Notification",
"instanceId": "alikafka_post-cn-xxxx",
"instanceName": "alikafka_post-cn-xxxx",
"uploadTime": 1758039249760
},
"product": "kafka",
"time": 1758039249000,
"level": "INFO",
"regionId": "eu-west-1",
"id": "e6245875-c9f0-44bf-b752-**********2",
"groupId": "0",
"name": "Instance:Mutation"
}
content fields:
| Field | Type | Description | Example |
|---|---|---|---|
| data | Object | Contains the property changes. | -- |
| data.newPropertyValues | Object | Property values after the change. | -- |
| data.newPropertyValues.ioMaxSpec | String | New traffic specification. | alikafka.hw.3xlarge |
| data.oldPropertyValues | Object | Property values before the change. | -- |
| data.oldPropertyValues.ioMaxSpec | String | Previous traffic specification. | alikafka.hw.2xlarge |
| eventType | String | Event type. | Notification |
| instanceId | String | Instance ID. | alikafka_post-cn-xxxx |
| instanceName | String | Instance name. | alikafka_post-cn-xxxx |
| uploadTime | Number | Upload timestamp in epoch milliseconds. | 1758039249760 |
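To see which properties changed, the old and new maps can be compared key by key. The sketch below is one possible approach in Python. The helper name changed_properties is hypothetical, and it assumes that both maps are flat objects, as in the example payload.

```python
def changed_properties(content: dict) -> dict:
    """Map each changed property name to an (old, new) pair."""
    data = content.get("data", {})
    old = data.get("oldPropertyValues", {})
    new = data.get("newPropertyValues", {})
    # Use the union of keys so a property present on only one side still appears.
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }


content = {
    "data": {
        "newPropertyValues": {"ioMaxSpec": "alikafka.hw.3xlarge"},
        "oldPropertyValues": {"ioMaxSpec": "alikafka.hw.2xlarge"},
    }
}
print(changed_properties(content))
```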
Instance:Risk
Fires when a threat is detected on the instance. Review the riskData array to identify the specific threat and take the recommended action from reportTips.
Event level: WARN
Event type: Exception
Example payload:
{
"ver": "1.0",
"status": "Exception",
"instanceName": "alikafka_post-cn-xxxxx",
"resourceId": "acs:alikafka:cn-hangzhou:10536xxxxxxxxx:instance/alikafka_post-cn-xxxx",
"content": {
"eventType": "Exception",
"instanceId": "alikafka_post-cn-xxxxx",
"instanceName": "alikafka_post-cn-xxxxx",
"riskData": [
{
"accessLevel": 7,
"gradeType": "F",
"health": false,
"levelType": 0,
"name": "publicTcpConnection",
"originLevelType": 1,
"relationList": [],
"reportTips": "Too many Internet connections can affect cluster stability. Optimize the connection method.",
"reportType": "mdsKey",
"reportValue": "",
"value": "298.0"
}
],
"uploadTime": 1757913918071
},
"product": "kafka",
"time": 1757913918000,
"level": "WARN",
"regionId": "cn-hangzhou",
"id": "68f03e98-e436-457b-adda-**********6",
"groupId": "0",
"name": "Instance:Risk"
}
content fields:
| Field | Type | Description | Example |
|---|---|---|---|
| eventType | String | Event type. | Exception |
| instanceId | String | Instance ID. | alikafka_post-cn-xxxxx |
| instanceName | String | Instance name. | alikafka_post-cn-xxxxx |
| riskData | Array | List of detected threats. Each element contains the fields below. | -- |
| uploadTime | Number | Upload timestamp in epoch milliseconds. | 1757913918071 |
riskData element fields:
| Field | Type | Description | Example |
|---|---|---|---|
| accessLevel | Number | Access level. | 7 |
| gradeType | String | Metric rating. A: Healthy. B: Sub-healthy. F: Poor. | F |
| health | Boolean | Whether the instance is healthy. true: Healthy. false: Unhealthy. | false |
| levelType | Number | Risk level. 0: Urgent. 1: Important. 2: General. | 0 |
| name | String | Threat type. See Threat types. | publicTcpConnection |
| originLevelType | Number | Source level type. | 1 |
| relationList | Array | Related resources. The system may nest additional report data based on outer fields. | [] |
| reportTips | String | Recommended fix for the detected threat. | Too many Internet connections can affect cluster stability. Optimize the connection method. |
| reportType | String | Report type. See Report types. | mdsKey |
| reportValue | String | Report value. Interpretation depends on reportType. See Report types. | "" |
| value | String | System-calculated value. For publicTcpConnection, this is the number of Internet TCP connections on a single node. Interpretation depends on reportType. See Report types. | 298.0 |
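When riskData holds several items, ordering them by levelType puts the most urgent threats first. The Python sketch below illustrates this under a few assumptions: summarize_risks and LEVEL_LABELS are hypothetical names, items without a levelType are sorted last, and the second sample item is invented purely to show the ordering.

```python
LEVEL_LABELS = {0: "Urgent", 1: "Important", 2: "General"}


def summarize_risks(risk_data: list) -> list:
    """Return one line per threat, most urgent first."""
    # Assumption: an item without levelType sorts last, alongside "General".
    ordered = sorted(risk_data, key=lambda item: item.get("levelType", 2))
    lines = []
    for item in ordered:
        label = LEVEL_LABELS.get(item.get("levelType"), "Unknown")
        lines.append(f"[{label}] {item.get('name')}: {item.get('reportTips', '')}")
    return lines


risk_data = [
    # Hypothetical second item, included only to show the ordering.
    {"levelType": 2, "name": "saramaClient", "reportTips": "(tip text omitted)"},
    {
        "levelType": 0,
        "name": "publicTcpConnection",
        "reportTips": "Too many Internet connections can affect cluster stability. "
        "Optimize the connection method.",
    },
]
for line in summarize_risks(risk_data):
    print(line)
```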
Threat types
The name field in riskData identifies the specific threat:
| Value | Description |
|---|---|
| topic | A threat exists for a specific topic. |
| group | A threat exists for a specific Group. |
| offsetCommitTimes | The consumer client commits consumer offsets too frequently. |
| multiAssignGroup | The same partition is assigned to multiple consumer threads. |
| partitionLeft | The number of partitions is insufficient. |
| topicLeft | The number of topics is insufficient. |
| diskUsage | The disk usage. |
| outputIo | Read traffic exceeds the limit. |
| inputIo | Write traffic exceeds the limit. |
| diskLean | Disk skew exists in the cluster. |
| topicLean | A risk of partition skew exists for the topic. |
| singlePartitionTopic | A topic with a single partition exists in cloud storage. |
| chipProduce | Messages are excessively fragmented. |
| syncProduce | The topic uses synchronous sending. |
| conversionProduce | The topic format is transformed. |
| publicTcpConnection | The number of Internet connections is excessive. |
| tcpConnection | The number of connections is excessive. |
| version | The minor version is too low. |
| groupLeft | The Group quota is insufficient. |
| sendTimeGroup | A Group has high consumption latency. |
| leaveGroup | A consumer client in a Group actively leaves the queue and triggers rebalancing. |
| rebalanceGroup | Rebalancing occurs in the related Group. |
| saramaClient | A Sarama client is used. |
| groupTopicMap | A Group subscribes to too many topics. |
Report types
The reportType field determines how to interpret reportValue and value:
| reportType | How to respond |
|---|---|
| topic | Check reportValue for the topic that needs to be fixed. The value field also returns the topic name. |
| group | Check reportValue for the Group that needs to be fixed. The value field also returns the Group name. |
| doc | reportValue contains a document path. To access the document, replace ${reportValue} in this URL: https://www.alibabacloud.com/help/document_detail/${reportValue}.htm. Check relationList and value -- the value field returns the number of topics or Groups in relationList that require optimization. |
| commonBuy | An upgrade or similar operation is required. The value field returns a percentage. |
| mdsKey | Follow the recommendation in reportTips to fix the issue. |
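The table above can be expressed as a small branching function. The following Python sketch is illustrative only: next_step is a hypothetical helper, the message wording is invented, and the document path 12345 in the usage line is a made-up placeholder rather than a real document ID.

```python
DOC_URL_TEMPLATE = (
    "https://www.alibabacloud.com/help/document_detail/${reportValue}.htm"
)


def next_step(risk: dict) -> str:
    """Turn one riskData item into a short action hint based on reportType."""
    report_type = risk.get("reportType")
    report_value = risk.get("reportValue", "")
    if report_type == "topic":
        return f"Fix topic {report_value}"
    if report_type == "group":
        return f"Fix Group {report_value}"
    if report_type == "doc":
        # Substitute the document path into the URL template from the table.
        url = DOC_URL_TEMPLATE.replace("${reportValue}", str(report_value))
        return f"Read {url} ({risk.get('value')} item(s) need optimization)"
    if report_type == "commonBuy":
        return f"Consider an upgrade (value: {risk.get('value')})"
    if report_type == "mdsKey":
        return risk.get("reportTips", "")
    return f"Unrecognized reportType: {report_type!r}"


# "12345" is a placeholder document path, not a real document ID.
print(next_step({"reportType": "doc", "reportValue": "12345", "value": "3"}))
```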
Instance:State:Change
Fires when an instance transitions to a new status, such as Stopped.
Event level: INFO
Event type: StatusNotification
Example payload:
{
"ver": "1.0",
"status": "Normal",
"instanceName": "test",
"resourceId": "acs:alikafka:cn-hangzhou:105369xxxxxxxx:instance/alikafka_post-cn-xxxx",
"content": {
"data": "Stopped",
"eventType": "StatusNotification",
"instanceId": "alikafka_post-cn-xxxx",
"instanceName": "test",
"uploadTime": 1757913992395
},
"product": "kafka",
"time": 1757913992000,
"level": "INFO",
"regionId": "cn-hangzhou",
"id": "f96d1ab8-ee84-435b-b866-**********8",
"groupId": "0",
"name": "Instance:State:Change"
}
content fields:
| Field | Type | Description | Example |
|---|---|---|---|
| data | String | New instance status. | Stopped |
| eventType | String | Event type. | StatusNotification |
| instanceId | String | Instance ID. | alikafka_post-cn-xxxx |
| instanceName | String | Instance name. | test |
| uploadTime | Number | Upload timestamp in epoch milliseconds. | 1757913992395 |
Instance:Upgrade:Config
Fires when a service configuration property is updated. The payload shows the old and new values.
Event level: INFO
Event type: Notification
Example payload:
{
"ver": "1.0",
"status": "Normal",
"instanceName": "test",
"resourceId": "acs:alikafka:cn-zhangjiakou:1759235xxxxxxx:instance/alikafka_serverless-cn-xxxxxx",
"content": {
"data": {
"newPropertyValues": {
"offsets.retention.minutes": "10078"
},
"oldPropertyValues": {
"offsets.retention.minutes": "10079"
}
},
"eventType": "Notification",
"instanceId": "alikafka_serverless-cn-xxxxxx",
"instanceName": "test",
"uploadTime": 1758100169280
},
"product": "kafka",
"time": 1758100169000,
"level": "INFO",
"regionId": "cn-zhangjiakou",
"id": "3a1d3148-0919-463d-9841-**********3",
"groupId": "0",
"name": "Instance:Upgrade:Config"
}
content fields:
| Field | Type | Description | Example |
|---|---|---|---|
| data | Object | Contains the configuration changes. | -- |
| data.newPropertyValues | Object | Configuration values after the change. | -- |
| data.newPropertyValues.offsets.retention.minutes | String | New retention period of consumer offsets, in minutes. | 10078 |
| data.oldPropertyValues | Object | Configuration values before the change. | -- |
| data.oldPropertyValues.offsets.retention.minutes | String | Previous retention period of consumer offsets, in minutes. | 10079 |
| eventType | String | Event type. | Notification |
| instanceId | String | Instance ID. | alikafka_serverless-cn-xxxxxx |
| instanceName | String | Instance name. | test |
| uploadTime | Number | Upload timestamp in epoch milliseconds. | 1758100169280 |
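Note that offsets.retention.minutes is a single key that itself contains dots, so splitting the path data.newPropertyValues.offsets.retention.minutes on every dot would not resolve it. The Python sketch below reads the key literally and converts the string values before comparing them. The helper name retention_change_minutes is hypothetical, and it assumes the property holds an integer.

```python
content = {
    "data": {
        "newPropertyValues": {"offsets.retention.minutes": "10078"},
        "oldPropertyValues": {"offsets.retention.minutes": "10079"},
    }
}

KEY = "offsets.retention.minutes"  # one key, not three nested levels


def retention_change_minutes(content: dict, key: str = KEY) -> int:
    """Return new minus old for an integer-valued configuration property."""
    data = content["data"]
    # Values arrive as strings, so convert before doing arithmetic.
    return int(data["newPropertyValues"][key]) - int(data["oldPropertyValues"][key])


print(retention_change_minutes(content))
```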
Instance:Upgrade:Version
Fires when an instance version is upgraded. The payload shows the old and new minor version and open source version.
Event level: INFO
Event type: Notification
Example payload:
{
"ver": "1.0",
"status": "Normal",
"instanceName": "alikafka_post-cn-xxxx",
"resourceId": "acs:alikafka:cn-huhehaote:175923598xxxx:instance/alikafka_post-cn-xxxx",
"content": {
"data": {
"newPropertyValues": {
"serviceMiniVersion": "5.1.1.1",
"openSourceVersion": "2.2.0"
},
"oldPropertyValues": {
"serviceMiniVersion": "5.0.3",
"openSourceVersion": "2.2.0"
}
},
"eventType": "Notification",
"instanceId": "alikafka_post-cn-xxxx",
"instanceName": "alikafka_post-cn-xxxx",
"uploadTime": 1758161213151
},
"product": "kafka",
"time": 1758161213000,
"level": "INFO",
"regionId": "cn-huhehaote",
"id": "38ae38e6-c152-4371-9b4b-**********3",
"groupId": "0",
"name": "Instance:Upgrade:Version"
}
content fields:
| Field | Type | Description | Example |
|---|---|---|---|
| data | Object | Contains the version changes. | -- |
| data.newPropertyValues | Object | Version values after the upgrade. | -- |
| data.newPropertyValues.serviceMiniVersion | String | New minor version of the instance. | 5.1.1.1 |
| data.newPropertyValues.openSourceVersion | String | Open source version, which corresponds to the major version of the instance. | 2.2.0 |
| data.oldPropertyValues | Object | Version values before the upgrade. | -- |
| data.oldPropertyValues.serviceMiniVersion | String | Previous minor version of the instance. | 5.0.3 |
| data.oldPropertyValues.openSourceVersion | String | Previous open source version. | 2.2.0 |
| eventType | String | Event type. | Notification |
| instanceId | String | Instance ID. | alikafka_post-cn-xxxx |
| instanceName | String | Instance name. | alikafka_post-cn-xxxx |
| uploadTime | Number | Upload timestamp in epoch milliseconds. | 1758161213151 |
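Version strings such as 5.0.3 and 5.1.1.1 should be compared numerically, part by part, because plain string comparison misorders values like 5.10.0 and 5.9.0. The Python sketch below shows one way to do this. The helper names are hypothetical, and it assumes every version consists only of dot-separated integers.

```python
def version_tuple(version: str) -> tuple:
    """Turn "5.1.1.1" into (5, 1, 1, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def minor_version_upgraded(content: dict) -> bool:
    """Check whether serviceMiniVersion increased between old and new values."""
    data = content["data"]
    old = version_tuple(data["oldPropertyValues"]["serviceMiniVersion"])
    new = version_tuple(data["newPropertyValues"]["serviceMiniVersion"])
    return new > old


content = {
    "data": {
        "newPropertyValues": {"serviceMiniVersion": "5.1.1.1", "openSourceVersion": "2.2.0"},
        "oldPropertyValues": {"serviceMiniVersion": "5.0.3", "openSourceVersion": "2.2.0"},
    }
}
print(minor_version_upgraded(content))
```

Tuples of different lengths compare element by element, so (5, 1, 1, 1) ranks above (5, 0, 3) as soon as the second parts differ.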
Instance:Upgrade:Version:Notify
Fires when an instance version upgrade notification is sent.
Event level: WARN
Event type: Notification
Example payload:
{
"ver": "1.0",
"id": "8582C4CB-1869-41AA-8091-73BEE551D2C3",
"requestId": "41F12ADB-370B-4D7E-B74D-464E8D5D8B1D",
"eventTime": "1742975149000",
"product": "kafka",
"resourceId": "alikafka_post-cn-xxxx",
"level": "WARN",
"instanceName": "alikafka_post-cn-xxxx",
"regionId": "cn-hangzhou",
"name": "Instance:Upgrade:Version:Notify",
"userId": "278549724441440438",
"status": "Normal",
"content": {
"version": "v2"
}
}
content fields:
| Field | Type | Description | Example |
|---|---|---|---|
| version | String | The instance version. | v2 |