ApsaraMQ for RocketMQ delivers messages at least once. This means a consumer may receive the same message more than once. If your business logic is sensitive to duplicates -- such as payment deductions, inventory adjustments, or order creation -- implement idempotent consumption. Idempotent consumption ensures that processing the same message multiple times produces the same result as processing it once.
For example, a consumer processes a payment deduction message for an order of USD 100. Due to a network issue, the message is delivered twice. With idempotent consumption, the payment is deducted only once, and only one deduction record of USD 100 is generated for the order.
Why duplicate messages occur
Duplicate messages arise from three scenarios.
Producer retry
A producer sends a message, and the ApsaraMQ for RocketMQ broker persists it. However, the broker may fail to acknowledge the producer due to a transient network issue or a producer crash. The producer treats this as a failed send and retries. The consumer then receives two messages with the same content but different message IDs.
Broker redelivery
A consumer receives and processes a message, but the acknowledgment back to the broker fails due to a transient network issue. Because the broker cannot confirm whether the message was consumed, it redelivers the message after the network recovers to honor the at-least-once guarantee. The consumer then receives two messages with the same content and the same message ID.
Load balancing
Events such as network jitter, broker restarts, or consumer application restarts trigger load balancing, which redistributes queues across consumer instances. During rebalancing, a consumer may receive messages that were already delivered but not yet acknowledged.
Use business keys, not message IDs
A natural first instinct is to deduplicate based on message ID, but this is unreliable. As described in the producer retry scenario, the same logical message can arrive with two different message IDs when the producer retries a send. Deduplicating on message ID would miss these duplicates entirely.
Instead, assign a unique business identifier as the message key. For example, use an order ID, a payment transaction ID, or any value that uniquely identifies the business operation. This key remains consistent regardless of how many times the message is sent or redelivered. It is derived from your business logic rather than generated by the messaging system.
This approach provides predictable repeatability. If a failure occurs and the message is resent, the same business context always produces the same key, making deduplication reliable.
Implement idempotent consumption
Step 1: Set the message key on the producer
Attach a unique business identifier as the message key when sending a message.
```java
Message message = new Message();
message.setKey("ORDERID_100");
SendResult sendResult = producer.send(message);
```

Replace ORDERID_100 with the actual unique business identifier, such as an order ID or transaction ID.
Step 2: Retrieve the message key on the consumer
Retrieve the message key in the consumer callback and use it to enforce idempotent processing.
```java
consumer.subscribe("ons_test", "*", new MessageListener() {
    public Action consume(Message message, ConsumeContext context) {
        String key = message.getKey();
        // Perform idempotent processing based on the message key that uniquely identifies your business.
        return Action.CommitMessage;
    }
});
```

Step 3: Enforce idempotence with a deduplication store
The message key alone does not prevent duplicate processing -- your application must check whether the key has already been processed. A common pattern uses a relational database with a unique constraint:
Before processing, insert the message key into a deduplication table with a unique constraint on the key column.
If the insert succeeds, process the message.
If the insert fails due to a primary key or unique constraint violation, the message was already processed. Skip it.
Use the insert-then-check pattern rather than check-then-insert. With check-then-insert, two threads could both pass the check before either inserts, resulting in duplicate processing. Relying on the database's unique constraint violation is atomic and race-condition-safe.
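The race-safety argument above can be demonstrated with a minimal in-memory sketch. Here `ConcurrentMap.putIfAbsent` plays the role of the database's unique constraint: it is a single atomic operation, so two threads racing on the same key cannot both claim it. The class and method names are illustrative, not part of any SDK.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// In-memory stand-in for the deduplication table. putIfAbsent is atomic,
// just like an INSERT that hits a unique constraint, so the check and the
// insert cannot be interleaved by another thread.
public class InsertThenCheckDemo {
    private final ConcurrentMap<String, Boolean> processedKeys = new ConcurrentHashMap<>();

    // Returns true if this call claimed the key (process the message),
    // false if the key was already present (duplicate, skip it).
    public boolean tryClaim(String messageKey) {
        return processedKeys.putIfAbsent(messageKey, Boolean.TRUE) == null;
    }
}
```

A check-then-insert version (`if (!map.containsKey(key)) map.put(key, ...)`) would let two threads both pass the `containsKey` test before either `put` runs, which is exactly the race the atomic variant avoids.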
Deduplication table example (MySQL):
```sql
CREATE TABLE message_dedup (
    message_key VARCHAR(255) NOT NULL,
    created_at  TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (message_key)
);
```

Idempotent consumer logic example:
```java
consumer.subscribe("ons_test", "*", new MessageListener() {
    public Action consume(Message message, ConsumeContext context) {
        String key = message.getKey();
        try {
            // Attempt to insert the message key. Fails if already processed.
            insertMessageKey(key);
        } catch (DuplicateKeyException e) {
            // Already processed. Skip.
            return Action.CommitMessage;
        }
        // Process the business logic.
        processOrder(key);
        return Action.CommitMessage;
    }
});
```

Replace insertMessageKey and processOrder with your actual database and business logic methods.
For high-throughput scenarios where a relational database may become a bottleneck, use a Redis-based approach with SETNX (SET if Not eXists) instead.
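With Redis, the claim would be a single command such as `SET dedup:ORDERID_100 1 NX EX 604800` (with a client like Jedis, something along the lines of `jedis.set(key, "1", SetParams.setParams().nx().ex(ttlSeconds))`). The sketch below models those claim-with-expiry semantics in plain Java purely to illustrate the pattern; the class is hypothetical and not a substitute for an actual Redis call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrates the semantics of SET <key> 1 NX EX <ttl>: a key can be
// claimed only if it is absent or its previous claim has expired.
public class RedisStyleDedup {
    private final Map<String, Long> expiryByKey = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public RedisStyleDedup(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Returns true if the key was absent (or expired) and is now claimed,
    // false if a live entry already exists (duplicate, skip).
    public synchronized boolean setIfAbsent(String key, long nowMillis) {
        Long expiresAt = expiryByKey.get(key);
        if (expiresAt != null && expiresAt > nowMillis) {
            return false;
        }
        expiryByKey.put(key, nowMillis + ttlMillis);
        return true;
    }
}
```

Setting the TTL to your deduplication retention period means Redis expires old keys for you, covering the cleanup concern discussed under best practices.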
Best practices
| Practice | Details |
|---|---|
| Wrap deduplication and business logic in a single transaction | If the business logic fails after the deduplication record is inserted, the deduplication record is rolled back, allowing the message to be retried on the next delivery. Without a transaction, a failed business operation leaves the deduplication record in place, and the message is never reprocessed. |
| Clean up the deduplication store periodically | The deduplication table grows over time. Set a retention period (for example, 7 days) and delete older records in batches to prevent unbounded storage growth. |
| Choose the right deduplication store for your throughput | Use a relational database with unique constraints for moderate throughput. For high-throughput scenarios, use Redis SETNX with a TTL that matches your retention period, which also handles automatic cleanup. |
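The first practice above, tying the deduplication record and the business logic together so they succeed or fail as a unit, can be sketched as follows. With a relational database this would be one transaction (the dedup INSERT plus the business writes, then COMMIT); here the rollback is modeled in memory by removing the claim when the business logic throws. The class name and structure are illustrative only.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Sketch of "dedup claim + business logic in one transaction": if the
// business logic fails, the claim is rolled back so a redelivery of the
// same message can be processed on a later attempt.
public class TransactionalDedup {
    private final Set<String> processed = ConcurrentHashMap.newKeySet();

    // Returns true if the message was processed in this call,
    // false if it was a duplicate and skipped.
    public boolean consume(String messageKey, Consumer<String> businessLogic) {
        if (!processed.add(messageKey)) {
            return false; // already processed: skip
        }
        try {
            businessLogic.accept(messageKey);
            return true; // "commit": claim and business result persist together
        } catch (RuntimeException e) {
            processed.remove(messageKey); // "rollback": let redelivery retry
            throw e;
        }
    }
}
```

Without the rollback step, a message whose business logic failed would be permanently marked as processed and silently dropped on redelivery.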