Apply these configuration and design patterns to build reliable, high-performance ApsaraMQ for Kafka consumers. This guide covers consumer group setup, offset management, rebalance prevention, error handling, and performance tuning.
How consumer groups work
ApsaraMQ for Kafka distributes messages across consumers through consumer groups. Each consumer in a group is assigned a subset of partitions from the subscribed topics, and each partition is consumed by exactly one consumer in the group.
The consumption cycle follows three steps:
Poll messages from assigned partitions.
Process the messages (run your consumption logic).
Poll again.
To place multiple consumers into the same group, set their group.id to the same value. ApsaraMQ for Kafka then evenly distributes partitions across those consumers so that each message is delivered to exactly one consumer in the group.
For example, if Group A subscribes to Topic A and consumers C1, C2, and C3 are started in the group, each message that Topic A receives is delivered to only one of C1, C2, or C3. By default, ApsaraMQ for Kafka evenly delivers messages to consumers to balance the consumption load.
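Joining a group can be sketched in code: the only requirement is that the consumers share the same group.id. The endpoint, group name, and deserializer settings below are placeholders, not prescribed values.

```java
import java.util.Properties;

public class GroupConfig {
    // Build consumer properties; the broker address and group name are placeholders.
    public static Properties consumerProps(String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "your-kafka-endpoint:9092"); // placeholder endpoint
        props.put("group.id", groupId); // consumers sharing this value form one group
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // Two consumers started with the same group.id join the same group.
        Properties c1 = consumerProps("group-a");
        Properties c2 = consumerProps("group-a");
        System.out.println(c1.getProperty("group.id").equals(c2.getProperty("group.id"))); // true
    }
}
```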
A rebalance -- the process of redistributing partitions across consumers -- is triggered whenever consumers are first started, restarted, added, or removed.
The number of consumers in a group must not exceed the number of partitions. Any consumer without an assigned partition remains idle.
Set the partition count
The partition count determines the maximum number of concurrent consumers for a topic.
By default, ApsaraMQ for Kafka sets the partition count to 12 in the console, which is sufficient for most workloads. If you need higher throughput, increase the count gradually within the recommended range of 12 to 100:
| Partition count | Impact |
|---|---|
| Fewer than 12 | May degrade production and consumption performance. |
| 12 to 100 | Recommended range for most workloads. |
| More than 100 | May trigger frequent consumer rebalances. |
You cannot reduce the partition count after an increase. Increase partitions in small increments.
Prevent frequent rebalances
Rebalances pause consumption across the entire consumer group while partitions are reassigned. Frequent rebalances are typically caused by heartbeat timeouts: the broker considers a consumer dead when it stops receiving heartbeats within the session timeout window.
To prevent rebalances, you can adjust the related parameters or increase the consumption rate. For detailed troubleshooting, see Why do rebalances frequently occur on my consumer client?
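As a sketch, the standard Kafka client parameters most often involved in heartbeat-related rebalances are shown below. The specific values are illustrative choices, not ApsaraMQ-prescribed defaults; tune them against your actual processing time per batch.

```java
import java.util.Properties;

public class RebalanceTuning {
    // Standard Kafka consumer parameters; the values below are illustrative.
    public static Properties tuned() {
        Properties props = new Properties();
        props.put("session.timeout.ms", "25000");    // time without heartbeats before eviction
        props.put("max.poll.interval.ms", "300000"); // max allowed gap between poll() calls
        props.put("max.poll.records", "100");        // smaller batches finish within the interval
        return props;
    }

    public static void main(String[] args) {
        System.out.println(tuned().getProperty("session.timeout.ms")); // 25000
    }
}
```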
Choose a subscription mode
ApsaraMQ for Kafka supports two subscription patterns.
One group subscribes to multiple topics
A single consumer group can subscribe to multiple topics. Messages from all subscribed topics are evenly distributed across the consumers in the group. For example, if Group A subscribes to Topic A, Topic B, and Topic C, messages from all three topics are evenly consumed by consumers in Group A.
// Example: topic=topicA,topicB,topicC in kafka.properties
String topicStr = kafkaProperties.getProperty("topic");
String[] topics = topicStr.split(",");
List<String> subscribedTopics = new ArrayList<>();
for (String topic : topics) {
    subscribedTopics.add(topic.trim());
}
consumer.subscribe(subscribedTopics);

Multiple groups subscribe to one topic
Multiple consumer groups can independently subscribe to the same topic. Each group receives all messages from that topic, and the delivery processes are independent of each other. For example, if Group A and Group B both subscribe to Topic A, each message that Topic A receives is delivered to consumers in both Group A and Group B independently, without mutual impact. This pattern is useful when different applications need to process the same data stream separately.
Use one group per application
Use a separate consumer group for each application. Each application typically has distinct consumption logic, so sharing a group across applications leads to unpredictable behavior.
If you need different consumption logic within the same application, create separate kafka.properties files (for example, kafka1.properties and kafka2.properties) and initialize each consumer with its own configuration.
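A minimal sketch of the per-logic configuration, using in-memory content in place of the kafka1.properties and kafka2.properties files; the group names and topic are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class PerLogicConfig {
    // Parse properties-file content; in practice you would load
    // kafka1.properties and kafka2.properties from disk instead.
    public static Properties parse(String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Each consumption logic gets its own configuration and its own group.
        Properties logic1 = parse("group.id=order-processing\ntopic=orders");
        Properties logic2 = parse("group.id=audit-logging\ntopic=orders");
        System.out.println(logic1.getProperty("group.id")); // order-processing
        System.out.println(logic2.getProperty("group.id")); // audit-logging
    }
}
```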
Keep subscriptions consistent
Within a consumer group, we recommend that all consumers subscribe to the same set of topics. Inconsistent subscriptions across consumers in the same group can cause unexpected partition assignment and make troubleshooting difficult.
Consumer offsets
Every partition tracks a maximum offset -- the total number of messages produced to that partition. Each consumer tracks a consumer offset -- the position of the last consumed message. The difference between these two values represents the message backlog (accumulated unconsumed messages).
Choose an offset commit strategy
ApsaraMQ for Kafka supports three approaches for committing consumer offsets. The strategy you choose directly affects delivery semantics and data correctness.
Automatic commit
Automatic commit is enabled by default and controlled by two parameters:
| Parameter | Default | Description |
|---|---|---|
| enable.auto.commit | true | Enables automatic offset commits. |
| auto.commit.interval.ms | 1000 | Interval in milliseconds between automatic commits. |
Before each poll() call, the client checks whether the time since the last commit exceeds auto.commit.interval.ms. If so, it commits the current offset.
With auto-commit enabled, make sure all messages from the previous poll() are fully processed before calling poll() again. Otherwise, unprocessed messages are marked as consumed and skipped.
Trade-off: Auto-commit is simple but offers no guarantee that processing completed before the offset was committed. After a restart, some messages may be reprocessed (at-least-once delivery) or skipped (if processing failed before commit).
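A simplified model of the interval check described above (not the actual client code): a commit fires only when the elapsed time since the last commit reaches auto.commit.interval.ms.

```java
public class AutoCommitTimer {
    private final long intervalMs; // models auto.commit.interval.ms
    private long lastCommitMs;

    public AutoCommitTimer(long intervalMs, long startMs) {
        this.intervalMs = intervalMs;
        this.lastCommitMs = startMs;
    }

    // Returns true when the elapsed time since the last commit reaches
    // the configured interval, and records the new commit time.
    public boolean maybeCommit(long nowMs) {
        if (nowMs - lastCommitMs >= intervalMs) {
            lastCommitMs = nowMs;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        AutoCommitTimer timer = new AutoCommitTimer(1000, 0);
        System.out.println(timer.maybeCommit(500));  // false: interval not yet reached
        System.out.println(timer.maybeCommit(1200)); // true: 1200 ms >= 1000 ms
    }
}
```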
Synchronous manual commit
For precise control, disable auto-commit and call commitSync() after processing each batch:
// Set enable.auto.commit=false in your consumer properties
while (true) {
ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
for (ConsumerRecord<String, String> record : records) {
// Process the record
}
consumer.commitSync(); // Commit only after successful processing
}

Trade-off: Synchronous commits guarantee that offsets are committed only after processing completes. The consumer blocks until the commit succeeds, which adds latency. commitSync() also retries automatically until it either succeeds or encounters an unrecoverable error, so no messages are lost.
Asynchronous manual commit
Use commitAsync() for lower latency when you can tolerate occasional commit failures:
// Set enable.auto.commit=false in your consumer properties
while (true) {
ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
for (ConsumerRecord<String, String> record : records) {
// Process the record
}
consumer.commitAsync((offsets, exception) -> {
if (exception != null) {
log.error("Commit failed for offsets: {}", offsets, exception);
}
});
}

Trade-off: Asynchronous commits do not block the consumer, which improves throughput. However, failed async commits are not retried automatically. If a commit fails and the consumer restarts, some messages may be reprocessed.
Summary: offset commit strategies
| Strategy | Delivery semantics | Latency | Complexity |
|---|---|---|---|
| Auto-commit | At-least-once (possible message skip on failure) | Lowest | Lowest |
| Synchronous manual | At-least-once (no skip) | Higher | Medium |
| Asynchronous manual | At-least-once (possible reprocessing on commit failure) | Low | Medium |
Configure offset reset behavior
Consumer offsets are reset in two situations:
No offset has been committed to the broker -- for example, when a consumer group connects for the first time.
The consumer tries to fetch from an invalid offset -- for example, the stored offset exceeds the current maximum.
Control reset behavior with auto.offset.reset on Java clients:
| Value | Behavior |
|---|---|
latest | Start consuming from the most recent message. |
earliest | Start consuming from the oldest available message. |
none | Throw an exception instead of resetting. |
Set this parameter to latest to avoid consuming the entire message history when an offset becomes invalid. If you use manual commits with commit(offsets), you can safely set this parameter to none.
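As a sketch, the reset policy is an ordinary consumer property; the helper below just validates the value before setting it.

```java
import java.util.Properties;

public class OffsetResetConfig {
    // auto.offset.reset governs what happens when no committed offset
    // exists or the stored offset is invalid.
    public static Properties withReset(String policy) {
        if (!policy.equals("latest") && !policy.equals("earliest") && !policy.equals("none")) {
            throw new IllegalArgumentException("unknown auto.offset.reset value: " + policy);
        }
        Properties props = new Properties();
        props.put("auto.offset.reset", policy);
        return props;
    }

    public static void main(String[] args) {
        // "latest" avoids replaying the full history after an offset reset.
        Properties props = withReset("latest");
        System.out.println(props.getProperty("auto.offset.reset")); // latest
    }
}
```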
Handle large messages
Consumers pull messages from the broker. For messages larger than 1 MB, tune these parameters to control the pull rate:
| Parameter | Recommendation |
|---|---|
max.poll.records | Set to 1 if each message exceeds 1 MB. |
fetch.max.bytes | Set slightly larger than the maximum expected message size. |
max.partition.fetch.bytes | Set slightly larger than the maximum expected message size. |
With these settings, the consumer pulls large messages one at a time, preventing memory issues.
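A sketch of these settings, assuming a hypothetical 10 MB maximum message size; the 1 MB headroom figure is an illustrative choice, not a prescribed value.

```java
import java.util.Properties;

public class LargeMessageConfig {
    // maxMessageBytes is the largest message you expect (an assumption about
    // your workload); the headroom leaves room for protocol overhead.
    public static Properties forLargeMessages(int maxMessageBytes) {
        int withHeadroom = maxMessageBytes + 1024 * 1024; // slightly larger than the max message
        Properties props = new Properties();
        props.put("max.poll.records", "1"); // pull one large message at a time
        props.put("fetch.max.bytes", String.valueOf(withHeadroom));
        props.put("max.partition.fetch.bytes", String.valueOf(withHeadroom));
        return props;
    }

    public static void main(String[] args) {
        Properties props = forLargeMessages(10 * 1024 * 1024); // assume 10 MB messages
        System.out.println(props.getProperty("max.poll.records")); // 1
    }
}
```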
Deduplicate messages
ApsaraMQ for Kafka uses at-least-once delivery semantics: every message is delivered at least once, guaranteeing no data loss. However, network errors or client restarts can cause a small number of messages to be delivered more than once.
For applications sensitive to duplicates (such as transaction processing), implement idempotent consumption:
On the producer side: Set a unique key for each message that serves as a transaction ID.
On the consumer side: Before processing a message, check whether its key has already been processed. Skip the message if it has; otherwise, process it exactly once.
If your application tolerates occasional duplicates, no deduplication logic is needed.
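A minimal sketch of the consumer-side check, using an in-memory set; a production system would keep the processed-key record in a durable store such as a database, keyed by the producer-assigned message key.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class IdempotentConsumer {
    // Stand-in for a durable processed-key store; in-memory for illustration only.
    private final Set<String> processedKeys = ConcurrentHashMap.newKeySet();

    // Returns true if the message was processed, false if it was a duplicate.
    public boolean processOnce(String messageKey, Runnable handler) {
        if (!processedKeys.add(messageKey)) {
            return false; // key already seen: skip the duplicate
        }
        handler.run(); // run the consumption logic exactly once per key
        return true;
    }

    public static void main(String[] args) {
        IdempotentConsumer consumer = new IdempotentConsumer();
        Runnable noop = () -> {};
        System.out.println(consumer.processOnce("txn-1001", noop)); // true: first delivery
        System.out.println(consumer.processOnce("txn-1001", noop)); // false: duplicate skipped
    }
}
```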
Handle consumption failures
Messages within a partition are consumed sequentially. If processing fails for a specific message (for example, due to corrupt data), use one of these strategies:
| Strategy | How it works | Trade-off |
|---|---|---|
| Retry with backoff | Retry the failed message using exponential backoff. | Blocks the consumer on the current message. If the failure persists, messages accumulate. |
| Dead-letter topic | Forward failed messages to a dedicated topic for later analysis. The consumer continues with subsequent messages. | Requires a separate topic and periodic review process. |
Topic A (source) --> Consumer --> Processing
|
(on failure)
|
v
Dead-letter topic --> Manual review

ApsaraMQ for Kafka does not provide built-in dead-letter queue semantics. Implement this pattern by producing failed messages to a separate topic and reviewing them periodically.
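The routing step can be sketched as follows, with a simple callback standing in for a producer that writes to the dead-letter topic; process() stands in for your real consumption logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class DeadLetterRouting {
    // process stands in for the consumption logic; deadLetterSink stands in
    // for a producer that writes to a dedicated dead-letter topic.
    public static void consumeWithDeadLetter(List<String> messages,
                                             Consumer<String> process,
                                             Consumer<String> deadLetterSink) {
        for (String message : messages) {
            try {
                process.accept(message);
            } catch (RuntimeException e) {
                // Forward the failed message and continue with the next one.
                deadLetterSink.accept(message);
            }
        }
    }

    public static void main(String[] args) {
        List<String> deadLetters = new ArrayList<>();
        consumeWithDeadLetter(
                List.of("ok-1", "corrupt", "ok-2"),
                m -> { if (m.startsWith("corrupt")) throw new RuntimeException("bad payload"); },
                deadLetters::add);
        System.out.println(deadLetters); // [corrupt]
    }
}
```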
Implement exception handling within your consumer's message processing code. Use patterns like circuit breaker or exponential backoff to handle transient errors. Unhandled exceptions can crash the consumer and cause excessive rebalancing.
Prevent message accumulation
Message accumulation is the most common consumer-side issue. Two root causes:
Consumption rate is lower than production rate
Add more consumers or consumption threads to increase throughput. See Increase consumption rate.
Consumer thread is blocked
When a consumer waits indefinitely for a remote call during message processing, the consumption thread blocks and messages accumulate.
Solution: Set a timeout for all remote calls. If no response arrives within the timeout window, treat the message as a consumption failure and move on to the next message.
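One way to bound a remote call is a standard ExecutorService with Future.get(timeout); the timeout values below are illustrative. A production consumer would share one pool rather than creating one per call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedRemoteCall {
    // Runs the remote call with a hard timeout; on timeout the message is
    // treated as a consumption failure instead of blocking the poll loop.
    public static String callWithTimeout(Callable<String> remoteCall, long timeoutMs) {
        ExecutorService pool = Executors.newSingleThreadExecutor(); // sketch: share in production
        try {
            Future<String> future = pool.submit(remoteCall);
            try {
                return future.get(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                future.cancel(true); // interrupt the stuck call
                return null;         // caller records a failure and moves on
            } catch (InterruptedException | ExecutionException e) {
                return null;
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        String fast = callWithTimeout(() -> "response", 500);
        String slow = callWithTimeout(() -> { Thread.sleep(5_000); return "late"; }, 100);
        System.out.println(fast); // response
        System.out.println(slow); // null: timed out, treated as a failure
    }
}
```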
Increase consumption rate
Add consumers
Start additional consumers within the same group. Each consumer runs in its own thread or process, and ApsaraMQ for Kafka automatically rebalances partitions across them.
Adding consumers beyond the number of partitions does not increase throughput. Excess consumers remain idle.
Add consumption threads
Use a thread pool to parallelize message processing within a single consumer:
Define a thread pool.
Poll a batch of messages.
Submit messages to the thread pool for concurrent processing.
After all messages in the batch are processed, poll again.
This approach increases throughput without requiring additional partitions.
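The batch steps above can be sketched with a standard thread pool; plain strings stand in for polled records, and invokeAll provides the "wait until the whole batch is done" step before the next poll.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelBatchProcessor {
    // Processes one polled batch on a pool, then returns once every message
    // is done so the caller can safely poll (and commit) again.
    public static Set<String> processBatch(List<String> batch, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Set<String> processed = ConcurrentHashMap.newKeySet();
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (String message : batch) {
                tasks.add(() -> { processed.add("handled:" + message); return null; });
            }
            pool.invokeAll(tasks); // blocks until the whole batch is processed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return processed;
    }

    public static void main(String[] args) {
        Set<String> done = processBatch(List.of("m1", "m2", "m3"), 2);
        System.out.println(done.size()); // 3
    }
}
```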
Filter messages
ApsaraMQ for Kafka does not provide built-in message filtering. Two alternatives:
| Approach | When to use |
|---|---|
| Multiple topics | Route different message types to separate topics. Works best for a small number of distinct categories. |
| Client-side filtering | Filter messages in your consumer code based on content or headers. Works best for many categories. |
Combine both approaches as needed.
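Client-side filtering is ordinary predicate logic in the consumer; the "type=" prefix format below is made up for illustration, where real code would inspect a record header or a field of the deserialized payload.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class ClientSideFilter {
    // Keep only the messages this consumer cares about; the rest are dropped
    // after being polled (the broker still delivers everything).
    public static List<String> keepMatching(List<String> messages, Predicate<String> wanted) {
        return messages.stream().filter(wanted).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> batch = List.of("type=order;id=1", "type=audit;id=2", "type=order;id=3");
        List<String> orders = keepMatching(batch, m -> m.startsWith("type=order"));
        System.out.println(orders.size()); // 2
    }
}
```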
Simulate message broadcasting
ApsaraMQ for Kafka does not support native message broadcasting. To deliver every message to multiple independent consumers, create a separate consumer group for each consumer. Each group independently receives all messages from the subscribed topic.
Reduce consumption latency
Latency stays low as long as consumers keep up with the production rate. If latency increases, check for message accumulation and increase the consumption rate.