ApsaraMQ for RocketMQ provides built-in diagnostics that analyze log files and backend services to identify exceptions and suggest fixes. The diagnostic process sends probe packets to your instance but does not affect your configurations or workloads.
| Tool | What it does |
|---|---|
| Log diagnostics | Scans .log files for known exception patterns |
| Backend service diagnostics | Checks the backend service of a specific instance |
| Topic access topology | Maps producers and consumers connected to a topic (5.x only) |
Supported regions
Troubleshooting is available in these regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), and China (Chengdu).
Prerequisites
Before you begin, make sure that you have:
An ApsaraMQ for RocketMQ instance in a supported region
(RAM users only) Permissions for instance diagnostics granted to your RAM user. For details, see Example 4: Grant a RAM user all permissions on instance diagnostics.
Run log diagnostics
Upload a .log file from your client application. The diagnostic system scans the log for known exception patterns and returns a report with suggested actions.
Log on to the ApsaraMQ for RocketMQ console. In the top navigation bar, select the region where your instance resides, such as China (Hangzhou).
In the left-side navigation pane, choose RocketMQ Copilot > Troubleshooting.
On the Troubleshooting page, click Log Diagnostics. In the Log section, upload a log file with the .log extension, then click Submit Diagnostics.
Note: The log file cannot exceed 64 MB.
Wait for the diagnostic task to complete on the Diagnostic Report page. To check results later, click View Later and Exit, then go to the Diagnostic History page. After the task status changes to Complete, click Details in the Actions column to view the report.
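Before uploading, you can check locally that the file meets the two constraints stated above: the .log extension and the 64 MB size limit. The following is a minimal sketch, not part of the product. The function name `check_log_file` is hypothetical, and it assumes that 64 MB means 64 × 1024 × 1024 bytes and that the extension check is case-sensitive.

```python
import os

# Assumption: the 64 MB limit stated above is interpreted as 64 * 1024 * 1024 bytes.
MAX_LOG_BYTES = 64 * 1024 * 1024


def check_log_file(path):
    """Return a list of problems that would block the upload (empty if none)."""
    problems = []
    # Assumption: the extension must be exactly ".log" (case-sensitive).
    if not path.endswith(".log"):
        problems.append("file name does not end with .log")
    if not os.path.isfile(path):
        problems.append("file does not exist")
    elif os.path.getsize(path) > MAX_LOG_BYTES:
        problems.append("file is larger than 64 MB")
    return problems
```

If the list is non-empty, consider trimming the log to the time window in which the problem occurred before uploading it.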
Run backend service diagnostics
This tool diagnoses the backend service of a specific instance to identify infrastructure-level issues.
Log on to the ApsaraMQ for RocketMQ console. In the top navigation bar, select the region where your instance resides, such as China (Hangzhou).
In the left-side navigation pane, choose RocketMQ Copilot > Troubleshooting.
On the Troubleshooting page, click Backend Service Diagnostics, configure the required parameters, then click Submit Diagnostics.
5.x instances
| Parameter | Description | Example |
|---|---|---|
| Instance | The instance ID to diagnose | rmq-cn-vkl42***** |
| Time Range | The time window to analyze | 2024-12-24 10:30:31 - 2024-12-24 11:30:31 |

4.x instances
| Parameter | Description | Example |
|---|---|---|
| Instance | The instance ID to diagnose | rmq-cn-vkl42***** |
| Topic Name | The topic to diagnose | test_topic |
| SDK Type | The SDK protocol: HTTP SDK or TCP SDK | TCP SDK |
| Time Range | The time window to analyze | 2024-12-24 10:30:31 - 2024-12-24 11:30:31 |

Wait for the diagnostic task to complete on the Diagnostic Report page. To check results later, click View Later and Exit, then go to the Diagnostic History page. After the task status changes to Complete, click Details in the Actions column to view the report.
Query topic access topology
Identify the producers and consumers connected to a specific topic. This helps detect unexpected clients or missing connections.
Only 5.x instances support this feature.
Log on to the ApsaraMQ for RocketMQ console. In the top navigation bar, select the region where your instance resides, such as China (Hangzhou).
In the left-side navigation pane, choose RocketMQ Copilot > Troubleshooting.
On the Troubleshooting page, click Topic Access Topology, configure the required parameters, then click Submit Diagnostics.
| Parameter | Description | Example |
|---|---|---|
| Instance | The instance ID to query | rmq-cn-vkl42***** |
| Topic Name | The topic to query | testTopic |
| Time Range | The time window to query | 2024-12-24 10:30:31 - 2024-12-24 11:30:31 |

Wait for the diagnostic task to complete on the Diagnostic Report page. To check results later, click View Later and Exit, then go to the Diagnostic History page. After the task status changes to Complete, click Details in the Actions column to view the report.
Log diagnostics: issues and suggested actions
After a log diagnostic completes, the report lists detected issues with suggested actions. Find your issue below.
Authentication and access errors
Invalid AccessKey ID
Cause: The client is configured with an incorrect identity credential.
Resolution:
5.x instances: Verify the instance username. Go to the Access Control page for your instance in the console, then check the Intelligent Authentication tab.
4.x instances: Verify the AccessKey ID. For details, see Create an AccessKey pair.
Invalid AccessKey Secret
Cause: The client is configured with an incorrect secret credential.
Resolution:
5.x instances: Verify the instance password. Go to the Access Control page for your instance in the console, then check the Intelligent Authentication tab.
4.x instances: Verify the AccessKey Secret. For details, see Create an AccessKey pair.
Signature algorithm not found
Cause: The JDK version or operating system is missing the required signature algorithm, or a dependency conflict exists.
Resolution: Confirm that your JDK and OS versions provide the required signature algorithm, and check your project for dependency conflicts.
Message send and receive errors
Failed to send message
Cause: The topic may not exist, or a network or backend service issue is preventing message delivery.
Resolution:
In the console, go to Instances > your instance > Topics and verify the topic exists.
If a network or backend service exception also appears in the report, follow the corresponding suggestion.
If the issue persists, submit a ticket.
Failed to pull message
Cause: The topic or consumer group may not exist, or a network or backend service issue is blocking message retrieval.
Resolution:
In the console, go to Instances > your instance > Topics and verify the topic exists.
Go to Groups and verify the consumer group exists.
If a network or backend service exception also appears in the report, follow the corresponding suggestion.
If the issue persists, submit a ticket.
Failed to obtain route information
Cause: The client cannot resolve the topic route, typically because the topic does not exist.
Resolution:
In the console, go to Instances > your instance > Topics and verify the topic exists.
If a network or backend service exception also appears in the report, follow the corresponding suggestion.
If the issue persists, submit a ticket.
Message type mismatch
Cause: The message type configured for the topic in the console differs from the message type used in your code.
Resolution: Compare the message type configured for the topic in the console with the message type your producer code specifies. Update one to match the other.
Invalid message attribute
Cause: A message attribute in your code conflicts with a system-reserved attribute.
Resolution: Check the error logs to identify the conflicting attribute. For the full list of reserved attributes, see Internal attributes.
Invalid scheduled time for scheduled messages
Cause: The scheduled delivery time falls outside the allowed range for your instance version.
Resolution: Adjust the scheduled time, or upgrade your instance to extend the scheduling range. For supported intervals, see Quotas and limits.
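To catch an out-of-range scheduled time before sending, the producer can compare the delivery time against the maximum delay for the instance. The following is a minimal sketch. The function name `exceeds_max_delay` is hypothetical, and `max_delay` is a value you supply yourself after looking up the limit for your instance version in Quotas and limits. The sketch checks only the upper bound.

```python
from datetime import datetime, timedelta, timezone


def exceeds_max_delay(deliver_at, max_delay, now=None):
    """Return True if deliver_at is further in the future than max_delay allows.

    deliver_at and now must be timezone-aware datetimes.
    max_delay is a timedelta supplied by the caller (an assumption, not a
    value taken from the product).
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return deliver_at - now > max_delay
```

For example, with a hypothetical limit of `timedelta(hours=24)`, a delivery time 25 hours from now returns `True` and should be adjusted before the message is sent.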
Request code not supported
Cause: The broker returned error code 320, which means batch message processing is not supported.
Resolution: Send messages individually instead of in batches.
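One way to apply this resolution is to replace the batch call with a loop that sends each message in its own request. The following is a minimal sketch. `send_one` is a hypothetical callable that wraps the single-message send call of whichever SDK you use; it is not a function of any specific client library.

```python
def send_individually(messages, send_one):
    """Send each message with its own request instead of one batch request.

    send_one is a hypothetical callable that sends a single message.
    Returns a list of (message, exception) pairs for the sends that failed.
    """
    failures = []
    for message in messages:
        try:
            send_one(message)
        except Exception as exc:
            # Record the failure and continue with the remaining messages.
            failures.append((message, exc))
    return failures
```

Collecting failures instead of stopping at the first one lets the caller decide whether to retry only the messages that were not sent.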
Consumer errors
Consumption exception
Cause: The consumer encountered an error while processing messages.
Resolution: Review the consumer code for unhandled exceptions. If the issue persists, submit a ticket.
Failed to acknowledge consumption
Cause: The client could not confirm message consumption, possibly due to a network or backend service issue.
Resolution: If a network or backend service exception also appears in the report, follow the corresponding suggestion. Otherwise, submit a ticket.
Full local message cache
Cause: Messages are arriving faster than the consumer can process them, causing the local cache to fill up.
Resolution: Review the message consumption logic. Look for blocking operations, slow database queries, or insufficient consumer instances that could reduce throughput.
Consumer group not found
Cause: The consumer group that the client references does not exist in the instance.
Resolution: In the console, go to Instances > your instance > Groups and verify the consumer group exists.
No online consumer client
Cause: No active consumer client is connected to the consumer group.
Resolution:
5.x instances: In the console, go to Instances > your instance > Groups > your group. On the Group Details page, click the Running Information tab and check the Client Connection section.
4.x instances: In the console, go to Instances > your instance > Groups > your group. On the Group Details page, check the Client Connection Information section.
Verify that the expected consumer clients appear and check their consumption status.
Inconsistent subscriptions
Cause: Consumers within the same consumer group are subscribed to different topics or using different filter expressions.
Resolution: In the console, go to Instances > your instance > Groups > your group. On the Group Details page, check the Subscriptions section. Make sure all consumers in the group subscribe to the same topics with the same filter expressions. For details, see Subscriptions.
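If you can list what each consumer in the group subscribes to (for example, from your application configuration), a small script can flag mismatches. The following is a minimal sketch. The function name and the input shape (a client ID mapped to a dictionary of topic and filter expression) are assumptions for illustration, not a console or SDK format.

```python
def find_inconsistent_subscriptions(subscriptions_by_client):
    """Group clients by identical subscriptions; return [] if all clients match.

    subscriptions_by_client maps a client ID to a dict of
    topic name -> filter expression (hypothetical input shape).
    When clients differ, returns the groups of client IDs that share the
    same subscriptions, so each returned group disagrees with the others.
    """
    groups = {}
    for client_id, subs in subscriptions_by_client.items():
        # Clients with the same (topic, filter) pairs get the same key.
        key = frozenset(subs.items())
        groups.setdefault(key, []).append(client_id)
    if len(groups) <= 1:
        return []
    return sorted(sorted(clients) for clients in groups.values())
```

Filter expressions are compared as plain strings, so two expressions that are logically equivalent but written differently are reported as a mismatch.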
Subscription does not exist
Cause: The subscription relationship is missing, possibly because the consumer client is offline or the network is unreachable.
Resolution: Verify that the consumer client is running and that the network connection to the instance is stable.
Connection and heartbeat errors
Failed to send heartbeat
Cause: The client cannot send heartbeat signals to the broker. The topic or consumer group may not exist, or a network issue may be present.
Resolution:
In the console, verify the topic and consumer group exist.
If a network or backend service exception also appears in the report, follow the corresponding suggestion.
If the issue persists, submit a ticket.
Failed to disconnect client
Cause: The client failed to cleanly disconnect from the broker.
Resolution: If a network or backend service exception also appears in the report, follow the corresponding suggestion. If this error occurs only during application shutdown, it is safe to ignore.
Network error
Cause: A network-level failure occurred between the client and the broker.
Resolution: Submit a ticket.
Local request missing
Cause: The client may be stuck (for example, due to full garbage collection pauses), or messages are being retransmitted due to network instability.
Resolution: If this occurs occasionally, it is safe to ignore. If it happens frequently, investigate GC behavior and network stability on the client host.
Resource and throttling errors
Topic not found
Cause: The topic that the client references does not exist in the instance.
Resolution: In the console, go to Instances > your instance > Topics and verify the topic exists.
Throttling triggered on broker
Cause: The instance has reached its TPS limit, causing the broker to reject requests.
Resolution: In the console, go to Instances > your instance. On the Instance Details page, check the Dashboard tab for production TPS, consumption TPS, and the number of throttled requests. Upgrade the instance configuration to handle the current workload. For details, see Upgrade or downgrade instance configurations.
System errors
Unrecorded logs exist
Cause: Some log entries are not recognized by the diagnostic system.
Resolution: Submit a ticket so the log patterns can be added.
Backend service exception
Cause: An internal service error occurred on the broker side.
Resolution: Submit a ticket.
Backend service diagnostics: issues and suggested actions
After a backend service diagnostic completes, the report identifies issues organized by check item.
Instance issues
Instance does not exist
Cause: The specified instance ID does not match any existing instance.
Resolution: Verify the instance ID. In the console, go to Instances and confirm the correct ID.
Instance not running
Cause: The instance exists but is not in a running state.
Resolution: In the console, check the instance status on the Instances page. If the instance is stopped, start it. If the status is abnormal, submit a ticket.
Topic issues
Topic does not exist
Cause: The specified topic has not been created in the instance.
Resolution: In the console, go to Instances > your instance > Topics and verify the topic exists.
Broker issues
Machine failure
Cause: A physical machine in the broker cluster has failed. This may cause temporary network errors in client logs.
Resolution: These errors are transient and typically resolve on their own. No action is required.
Broker update released
Cause: Maintenance is being performed on the backend broker cluster. Network interruptions lasting a few seconds may occur, which can cause temporary errors in client logs.
Resolution: If you received an advance maintenance notification, no action is required because the errors are expected and transient. If you did not receive a notification, submit a ticket.