Symptoms
When a consumer client uses the StickyAssignor partition assignment strategy, multiple consumer threads consume the same partition. This causes duplicate or out-of-order message processing.
Cause
This is a known bug (KAFKA-7026 / KIP-341) in Apache Kafka client versions earlier than 2.3. The StickyAssignor does not deduplicate partition assignments when a consumer rejoins a group with stale assignment data.
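To decide whether an application is in the affected range, the client version string can be compared against 2.3. The following is a minimal sketch, not part of the Kafka API: the class and method names are hypothetical, and it assumes the version string starts with a numeric major.minor prefix. At runtime the string can usually be read with AppInfoParser.getVersion() from org.apache.kafka.common.utils, but verify this against your client version.

```java
// Hypothetical helper for illustration; not part of the Kafka client API.
final class StickyAssignorVersionCheck {

    /** Returns true if the given kafka-clients version is earlier than 2.3. */
    static boolean isAffected(String version) {
        String[] parts = version.trim().split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Unrecognized version: " + version);
        }
        int major = Integer.parseInt(parts[0]);
        // Drop any non-numeric suffix such as "3-SNAPSHOT" before parsing.
        int minor = Integer.parseInt(parts[1].replaceAll("[^0-9].*$", ""));
        return major < 2 || (major == 2 && minor < 3);
    }

    public static void main(String[] args) {
        System.out.println(isAffected("2.2.1")); // true
        System.out.println(isAffected("2.3.0")); // false
    }
}
```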
The following scenario reproduces the issue:
1. Consumer C1 joins a consumer group as the leader and is assigned partition test-0.
2. Consumer C2 joins the same group. C1 retains test-0; C2 receives no partitions.
3. C1 becomes unresponsive (for example, due to a long GC pause). C2 becomes the new leader and takes over test-0.
4. C1 recovers and rejoins the group with its stale assignment (test-0). Both C1 and C2 report test-0 as their existing assignment during the rebalance.
5. The StickyAssignor does not check for duplicates, so it assigns test-0 to both consumers.
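The missing check in this scenario amounts to finding partitions that more than one consumer claims. The sketch below illustrates that idea only; it is not the actual fix in Kafka. The class name and the plain string representation of consumers and partitions are assumptions made for the example. In a real application, the input could be collected from what each consumer reports as its current assignment (for example, by logging the result of KafkaConsumer.assignment()).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical diagnostic helper; not part of the Kafka client API.
final class DuplicateAssignmentCheck {

    /** Maps each partition claimed by two or more consumers to its claimants. */
    static Map<String, List<String>> findDuplicates(Map<String, List<String>> assignments) {
        Map<String, List<String>> owners = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : assignments.entrySet()) {
            for (String partition : entry.getValue()) {
                owners.computeIfAbsent(partition, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        // Keep only partitions with more than one owner.
        owners.values().removeIf(list -> list.size() < 2);
        return owners;
    }

    public static void main(String[] args) {
        // Step 4 of the scenario: both consumers report test-0.
        Map<String, List<String>> reported = new TreeMap<>();
        reported.put("C1", Arrays.asList("test-0"));
        reported.put("C2", Arrays.asList("test-0"));
        System.out.println(findDuplicates(reported)); // {test-0=[C1, C2]}
    }
}
```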
Solution
Option 1: Upgrade the Kafka client to 2.3 or later (recommended)
The bug is fixed in Apache Kafka 2.3. Upgrade the Kafka client dependency in your application:

    <!-- Maven example -->
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-clients</artifactId>
        <version>2.3.0</version> <!-- or later -->
    </dependency>

Option 2: Switch to a different partition assignment strategy
If you cannot upgrade immediately, switch to a different partition assignment strategy. The following table describes the available strategies:
| Strategy | Class name | Description | Trade-offs |
|---|---|---|---|
| Range (default) | RangeAssignor | Assigns a contiguous range of each topic's partitions to each consumer, one topic at a time. | Simple and predictable. Can result in uneven distribution when the partition count is not a multiple of the consumer count, because the same consumers receive the extra partition of every topic. |
| Round-robin | RoundRobinAssignor | Assigns partitions one by one in a round-robin fashion across all consumers. | More balanced than Range. May cause more partition movements during rebalances. |
| Cooperative sticky | CooperativeStickyAssignor | Same balancing logic as StickyAssignor, but uses the cooperative rebalance protocol to avoid stop-the-world rebalances. | Minimizes partition movements and avoids the duplicate-assignment bug. Available only in Kafka client 2.4 and later, so it also requires a client upgrade. Requires a two-step migration from eager-protocol assignors. |
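The difference between the Range and Round-robin rows can be seen in a small model. The following sketch is a simplified illustration, not Kafka's implementation: the class name is hypothetical, and it assumes every consumer subscribes to every topic and every topic has the same number of partitions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of the two strategies; not the Kafka assignor classes.
final class AssignmentSketch {

    /** Per topic: split the partitions into contiguous ranges, one per consumer. */
    static Map<String, List<String>> range(List<String> consumers, List<String> topics, int partitionsPerTopic) {
        Map<String, List<String>> result = emptyAssignment(consumers);
        int n = consumers.size();
        for (String topic : topics) {
            int base = partitionsPerTopic / n;
            int extra = partitionsPerTopic % n; // the first `extra` consumers get one more
            int next = 0;
            for (int i = 0; i < n; i++) {
                int count = base + (i < extra ? 1 : 0);
                for (int j = 0; j < count; j++) {
                    result.get(consumers.get(i)).add(topic + "-" + next++);
                }
            }
        }
        return result;
    }

    /** Across all topics: hand out partitions one at a time in consumer order. */
    static Map<String, List<String>> roundRobin(List<String> consumers, List<String> topics, int partitionsPerTopic) {
        Map<String, List<String>> result = emptyAssignment(consumers);
        int i = 0;
        for (String topic : topics) {
            for (int p = 0; p < partitionsPerTopic; p++) {
                result.get(consumers.get(i++ % consumers.size())).add(topic + "-" + p);
            }
        }
        return result;
    }

    private static Map<String, List<String>> emptyAssignment(List<String> consumers) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String consumer : consumers) {
            result.put(consumer, new ArrayList<>());
        }
        return result;
    }
}
```

With consumers C1 and C2 and two topics t0 and t1 of three partitions each, the range model gives C1 four partitions (t0-0, t0-1, t1-0, t1-1) and C2 two (t0-2, t1-2), while the round-robin model gives each consumer three.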
To change the strategy, set the partition.assignment.strategy consumer configuration property:

    props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
        "org.apache.kafka.clients.consumer.RoundRobinAssignor");

Migrate to CooperativeStickyAssignor
To migrate from StickyAssignor to CooperativeStickyAssignor in a running consumer group without downtime, perform a two-step rolling restart:
1. Add CooperativeStickyAssignor as a secondary strategy alongside the current one. Then perform a rolling restart of all consumers.

       props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
           "org.apache.kafka.clients.consumer.StickyAssignor,"
           + "org.apache.kafka.clients.consumer.CooperativeStickyAssignor");

2. After all consumers pick up the new configuration, switch to CooperativeStickyAssignor only. Then perform another rolling restart.

       props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
           "org.apache.kafka.clients.consumer.CooperativeStickyAssignor");

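If the same application build is deployed for both rolling restarts, the strategy value can be derived from a deployment setting instead of being edited by hand. The sketch below is one possible way to do that; the class name and the idea of a numeric migration step are assumptions for illustration, not part of Kafka.

```java
// Hypothetical helper; the "step" parameter is an illustrative deployment setting.
final class StrategyMigration {

    static final String STICKY = "org.apache.kafka.clients.consumer.StickyAssignor";
    static final String COOPERATIVE = "org.apache.kafka.clients.consumer.CooperativeStickyAssignor";

    /** Returns the partition.assignment.strategy value for rolling restart 1 or 2. */
    static String strategyFor(int step) {
        switch (step) {
            case 1:
                // First restart: keep the current assignor and add the cooperative one.
                return String.join(",", STICKY, COOPERATIVE);
            case 2:
                // Second restart: cooperative assignor only.
                return COOPERATIVE;
            default:
                throw new IllegalArgumentException("step must be 1 or 2, got " + step);
        }
    }

    public static void main(String[] args) {
        System.out.println(strategyFor(1));
        System.out.println(strategyFor(2));
    }
}
```

The returned string would then be passed as the value for ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG when building the consumer properties.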
Do not use StickyAssignor on client versions earlier than 2.3. Even after the fix, CooperativeStickyAssignor is generally the better choice because it supports incremental rebalancing without pausing all consumers.