When consumers fall behind producers, messages accumulate and processing latency spikes. Misconfigured offsets can silently skip or reprocess messages. This guide covers proven patterns for building reliable, high-throughput consumer applications on ApsaraMQ for Kafka: consumer groups, offset management, failure handling, and performance tuning.
How message consumption works
An ApsaraMQ for Kafka consumer repeats a three-step cycle:
Poll -- Fetch a batch of messages from the broker.
Process -- Run your business logic on the fetched messages.
Poll again -- Fetch the next batch after processing completes.
Keep processing time short to maintain a steady poll interval. Long processing delays trigger rebalances or cause message accumulation.
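A minimal sketch of this cycle with the Apache Kafka Java client (the endpoint, group ID, and topic name below are placeholders):

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PollLoop {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "your-endpoint:9092"); // placeholder endpoint
        props.put("group.id", "demo-group");                  // placeholder group ID
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("demo-topic")); // placeholder topic
            while (true) {
                // 1. Poll: fetch a batch of messages.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                // 2. Process: run business logic; keep this step short.
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
                }
                // 3. The loop repeats, polling the next batch.
            }
        }
    }
}
```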
Consumer groups and load balancing
A consumer group is a set of consumer instances that share the same group.id. ApsaraMQ for Kafka distributes the partitions of subscribed topics evenly across consumers in a group, so each message is delivered to exactly one consumer.
Example: Group A subscribes to Topic A. Three consumers -- C1, C2, and C3 -- are in the group. Each incoming message goes to one of C1, C2, or C3, balancing the consumption load across all three.
ApsaraMQ for Kafka triggers a rebalance when consumers are:
First started
Restarted
Added to or removed from the group
Prevent frequent rebalances
Rebalances pause consumption while partitions are reassigned. Frequent rebalances disrupt throughput and increase processing latency. They occur when a consumer's heartbeats time out or when the time between polls exceeds the maximum poll interval.
To prevent frequent rebalances, give each batch more room to finish: increase max.poll.interval.ms, reduce max.poll.records, or increase the consumption rate. For detailed troubleshooting steps, see Why do rebalances frequently occur on my consumer client?
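For example, the following client settings (values are illustrative starting points, not prescriptive) give each batch more time to finish before a rebalance is triggered:

```java
// Illustrative values; tune them to your actual per-batch processing time.
Properties props = new Properties();
// Allow up to 5 minutes between polls before the group coordinator
// evicts the consumer and triggers a rebalance.
props.put("max.poll.interval.ms", "300000");
// Fetch fewer records per poll so each batch finishes sooner.
props.put("max.poll.records", "100");
// Keep the session timeout well above the heartbeat interval (~3x).
props.put("session.timeout.ms", "25000");
props.put("heartbeat.interval.ms", "8000");
```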
Configure partitions
The partition count controls how many consumers can process messages in parallel. Each partition is assigned to exactly one consumer within a group -- if you have more consumers than partitions, the extra consumers sit idle.
Recommended partition count
The default partition count in the ApsaraMQ for Kafka console is 12, which works for most workloads. When scaling, follow these guidelines:
| Partition count | Impact |
|---|---|
| Fewer than 12 | May reduce message production and consumption performance |
| 12 -- 100 | Recommended range for most workloads |
| More than 100 | May trigger frequent consumer rebalances |
You cannot reduce the partition count after an increase. Increase partitions incrementally rather than in large jumps.
Subscription patterns
ApsaraMQ for Kafka supports two subscription patterns.
One group subscribes to multiple topics
A single consumer group can subscribe to multiple topics. Messages from all subscribed topics are distributed evenly across consumers in the group.
```java
// Subscribe one consumer group to every topic listed in a
// comma-separated "topic" property.
String topicStr = kafkaProperties.getProperty("topic");
String[] topics = topicStr.split(",");
List<String> subscribedTopics = new ArrayList<>();
for (String topic : topics) {
    subscribedTopics.add(topic.trim());
}
consumer.subscribe(subscribedTopics);
```
Multiple groups subscribe to one topic
Multiple consumer groups can independently subscribe to the same topic. Each group receives all messages, and the groups operate independently without affecting each other.
Use this pattern when different applications need to process the same data stream independently -- for example, one group for real-time analytics and another for data archiving.
Keep one group per application
Use a dedicated consumer group for each application. If a single application needs to run different consumption logic, create separate configuration files (for example, kafka1.properties and kafka2.properties) with distinct group.id values.
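For example (file names and group IDs are illustrative), the two configuration files differ only in group.id, so each consumption path commits offsets and rebalances independently:

```java
// kafka1.properties contains: group.id=order-processing   (illustrative)
// kafka2.properties contains: group.id=order-archiving    (illustrative)
Properties props1 = new Properties();
props1.load(Files.newInputStream(Paths.get("kafka1.properties")));
Properties props2 = new Properties();
props2.load(Files.newInputStream(Paths.get("kafka2.properties")));
// Build one consumer per properties object; the two groups operate independently.
```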
Match subscriptions within a group
All consumers in the same group should subscribe to the same set of topics. Mismatched subscriptions complicate troubleshooting and can lead to unexpected rebalance behavior.
Manage consumer offsets
Each partition tracks a maximum offset -- the offset of the most recently produced message in that partition. Each consumer group tracks a consumer offset per partition -- the position up to which its consumers have processed messages. The difference between the two is the message accumulation (the unconsumed backlog, often called lag).
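A sketch of how the accumulation for one partition can be computed with the Java client (the topic name and partition number are placeholders):

```java
// Assumes `consumer` is a configured KafkaConsumer<String, String>.
TopicPartition tp = new TopicPartition("demo-topic", 0); // placeholder topic/partition
long maxOffset = consumer.endOffsets(Collections.singleton(tp)).get(tp);
OffsetAndMetadata committed = consumer.committed(tp);    // last committed consumer offset
long consumerOffset = (committed == null) ? 0L : committed.offset();
long accumulation = maxOffset - consumerOffset;          // unconsumed backlog
```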
Auto-commit vs manual commit
ApsaraMQ for Kafka provides two parameters for committing consumer offsets:
| Parameter | Default | Description |
|---|---|---|
| enable.auto.commit | true | Enables automatic offset commits |
| auto.commit.interval.ms | 1000 (ms) | Interval between automatic commits |
With auto-commit enabled, the client checks the time since the last commit before each poll. If the elapsed time exceeds auto.commit.interval.ms, it commits the current offset.
When using auto-commit, make sure all messages from the previous poll are fully processed before the next poll. If the consumer writes to an external datastore and that write fails, auto-commit may still advance the offset -- causing those messages to be skipped permanently rather than retried. Understand this tradeoff before relying on the default behavior.
Commit offsets manually
To commit offsets manually, set enable.auto.commit to false and call commitSync(offsets) (or the non-blocking commitAsync(offsets)) after processing each batch.
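A minimal sketch of a manual-commit loop (assumes `consumer` was created with enable.auto.commit=false and is already subscribed):

```java
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
    Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
    for (ConsumerRecord<String, String> record : records) {
        // ... business logic for this record ...
        // Record the offset of the *next* message to consume (current + 1).
        offsets.put(new TopicPartition(record.topic(), record.partition()),
                new OffsetAndMetadata(record.offset() + 1));
    }
    if (!offsets.isEmpty()) {
        consumer.commitSync(offsets); // commit only after the batch is processed
    }
}
```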
Offset reset behavior
Consumer offsets are reset in two situations:
No committed offset exists -- for example, when a consumer connects to a broker for the first time.
The committed offset is invalid -- for example, the maximum offset in a partition is 10, but the consumer tries to read from offset 11.
Control the reset behavior on Java clients with auto.offset.reset:
| Value | Behavior |
|---|---|
latest | Start from the most recent message. Use this as your default |
earliest | Start from the oldest available message |
none | Throw an exception instead of resetting |
Prefer latest over earliest to avoid reprocessing the entire topic history when an invalid offset occurs. If you commit offsets manually with commitSync(offsets), none is a safe choice since your offsets should always be valid.
Handle large messages
When individual messages exceed typical sizes, adjust these parameters to control fetch behavior:
| Parameter | Guidance |
|---|---|
max.poll.records | Maximum records per poll. Set to 1 for messages larger than 1 MB |
fetch.max.bytes | Maximum data per fetch request. Set slightly larger than the expected message size |
max.partition.fetch.bytes | Maximum data per partition per fetch. Set slightly larger than the expected message size |
With these settings, the consumer fetches large messages one at a time, preventing memory pressure from oversized batches.
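For example, for messages up to roughly 10 MB (sizes are illustrative; adjust to your actual payloads):

```java
Properties props = new Properties();
// Fetch one record per poll so each oversized message is handled alone.
props.put("max.poll.records", "1");
// Both limits slightly exceed the largest expected message (10 MB here).
props.put("fetch.max.bytes", String.valueOf(11 * 1024 * 1024));
props.put("max.partition.fetch.bytes", String.valueOf(11 * 1024 * 1024));
```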
Handle message duplication
ApsaraMQ for Kafka uses at-least-once delivery semantics: every message is delivered at least once to prevent data loss, but duplicates can occur during network errors or client restarts.
Implement idempotent consumption
If your application is sensitive to duplicates -- for example, financial transactions or order processing -- implement deduplication at the application level:
Attach a unique key to each message when producing (for example, a transaction ID).
Before processing a consumed message, check whether the key has already been processed.
Skip messages with keys that have already been processed.
If your application tolerates occasional duplicates (for example, metrics aggregation or log ingestion), skip this step.
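A minimal in-memory sketch of the dedup check (illustration only: a production deduplication store must be durable and shared across consumer instances, for example a database keyed by transaction ID):

```java
// In-memory set for illustration; it is lost on restart and not shared
// across instances. Use a durable store in production.
Set<String> processedKeys = ConcurrentHashMap.newKeySet();

for (ConsumerRecord<String, String> record : records) {
    String key = record.key(); // unique key attached by the producer
    if (key == null || !processedKeys.add(key)) {
        continue; // duplicate (or keyless) message: skip it
    }
    // ... process the message exactly once per key ...
}
```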
Handle consumption failures
Messages within a partition are consumed sequentially. When processing fails -- for example, due to malformed data or a downstream service outage -- choose one of these strategies:
| Strategy | Tradeoff |
|---|---|
| Retry in place | Retry the failed message until it succeeds. Simple to implement, but blocks the consumer thread and causes message accumulation on that partition |
| Dead-letter topic | Send failed messages to a dedicated topic for later inspection. Consumption continues without blocking, but requires a separate process to investigate and reprocess failures |
For most production workloads, the dead-letter approach avoids stalling the consumer pipeline.
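A sketch of the dead-letter pattern (ApsaraMQ for Kafka has no built-in dead-letter topic, so you create and manage this topic yourself; the topic name and `producer` are illustrative):

```java
// Assumes `producer` is a configured KafkaProducer<String, String> and that
// you created a "demo-topic.dlq" topic (illustrative name) for failed messages.
for (ConsumerRecord<String, String> record : records) {
    try {
        // ... business logic ...
    } catch (Exception e) {
        // Route the failure to the dead-letter topic and keep consuming.
        producer.send(new ProducerRecord<>("demo-topic.dlq", record.key(), record.value()));
    }
}
```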
Reduce consumption latency
ApsaraMQ for Kafka uses a pull-based model: consumers fetch messages from the broker on each poll cycle. When consumers keep pace with producers, latency stays low. A spike in latency usually indicates message accumulation.
Common causes of accumulation
| Cause | Solution |
|---|---|
| Consumption rate is slower than production rate | Add consumers or increase processing throughput |
| Consumer thread is blocked by a slow remote call | Set timeouts on external calls to release the thread after a bounded wait |
Increase consumption throughput
Add more consumers. Start additional consumer instances (one thread per consumer) or deploy more consumer processes. The number of active consumers cannot exceed the partition count -- extras sit idle.
Use a thread pool for processing. Decouple polling from processing to parallelize work:
Define a thread pool with a fixed number of worker threads.
Poll a batch of messages.
Submit each message to the thread pool for concurrent processing.
After all tasks in the batch complete, poll the next batch.
This pattern keeps the poll loop responsive while distributing processing work across multiple threads.
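A minimal sketch of the pattern (the pool size is illustrative; assumes `consumer` is configured and subscribed):

```java
ExecutorService pool = Executors.newFixedThreadPool(8); // illustrative pool size
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
    List<Future<?>> tasks = new ArrayList<>();
    for (ConsumerRecord<String, String> record : records) {
        // Submit each message to the pool for concurrent processing.
        tasks.add(pool.submit(() -> {
            // ... business logic ...
        }));
    }
    // Wait for the whole batch before the next poll, so committed offsets
    // never run ahead of actual processing.
    for (Future<?> task : tasks) {
        task.get();
    }
}
```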
Filter messages
ApsaraMQ for Kafka does not provide built-in message filtering. Implement filtering at the application level using one of these approaches:
| Approach | When to use |
|---|---|
| Separate topics | A small number of distinct message categories |
| Client-side filtering | Many categories, or filtering logic that changes frequently |
Combine both approaches when your use case requires it -- route broad categories to different topics, then apply fine-grained filters on the consumer side.
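A sketch of client-side filtering based on a message header (the "category" header and the "orders" value are an assumed convention between your producers and consumers, not an ApsaraMQ feature):

```java
for (ConsumerRecord<String, String> record : records) {
    Header category = record.headers().lastHeader("category"); // assumed convention
    if (category == null
            || !"orders".equals(new String(category.value(), StandardCharsets.UTF_8))) {
        continue; // not a category this consumer cares about: drop it
    }
    // ... process only "orders" messages ...
}
```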
Broadcast messages
ApsaraMQ for Kafka does not natively support message broadcasting. To deliver every message to multiple independent consumers, create a separate consumer group for each consumer that needs to receive all messages. Each group independently consumes the full message stream from the topic.