
Apache RocketMQ 5.5.0 Open Source LiteTopic: Dedicated Channel for Millions of AI Sessions

This article introduces Apache RocketMQ 5.5.0's LiteTopic, a new message model designed for millions of lightweight AI agent sessions with event-driven distribution and session persistence.

Apache RocketMQ 5.5.0 has been officially released. A key feature of this release is LiteTopic, the new message model defined by community proposal RIP-83, which is now part of the open source version. LiteTopic targets AI agents, asynchronous tasks, and massive numbers of lightweight sessions: it supports millions of coexisting lightweight session channels and is built around lightweight channel management, consumption state persistence, and event-driven distribution. Alibaba Cloud Message Queue for Apache RocketMQ already provided these capabilities for AI communication scenarios. With LiteTopic entering the open source version, the message model is available to developers around the world for the first time.

1. Industry Is Converging: Agent Asynchronous Communication Becomes a Consensus

Around the release of RocketMQ 5.5.0, protocols and frameworks across the AI industry have been evolving in the same direction.

Anthropic's MCP 2026 roadmap focuses on transport scalability, agent communication, task lifecycle management, and session state externalization. The core tension is that stateful agent sessions naturally conflict with load balancing and horizontal scaling, while production-grade lifecycle semantics, such as retrying failed tasks and cleaning up expired results, also need more stable runtime support.

Google ADK has also released a Long Running Agent solution, noting that the standard dialogue loop does not suit every long-running task scenario and that the system needs more explicit state persistence, event-driven wake-up, and multi-agent delegation and collaboration.

The two paths converge on the same set of infrastructure requirements:

  1. Massive session channels: Each agent session requires an independent communication channel that is not lost when the service scales horizontally.
  2. State persistence and resumption: After a service restart or cold start, a session must resume exactly where it was interrupted.
  3. Asynchronous lifecycle management: Failed tasks are retried, expired results are reclaimed, and scheduling is event-driven rather than based on client polling.

These three requirements are not confined to the protocol layer or the framework layer. They also place higher demands on the messaging infrastructure.

2. LiteTopic: Open Source Answers to Messaging in the AI Era

To address the AI-era needs of massive lightweight sessions, state persistence, and event-driven scheduling, the Apache RocketMQ community launched the RIP-83 (Lite Topic: A New Message Model) proposal in 2025. With the release of 5.5.0, LiteTopic has entered the open source version.

Why can't a combination of traditional topics and consumer groups solve this problem? There are three core reasons. First, when each agent session uses its own group to subscribe to its own topic, the number of read requests grows rapidly with the number of topics: the broker polls every topic to check for new messages, so the CPU overhead of wasted scans increases linearly with scale. Second, traditional topics must be created in advance and do not adapt well to dynamic, on-demand allocation. Third, the group model is biased towards shared consumption and is not well suited to session-level exclusive subscription.

LiteTopic is designed to bypass these limitations. The three industry requirements map to three optimization directions:

Requirement 1 → Parent Topic + Dynamic Subtopic

LiteTopic adopts a two-layer structure: a parent topic acts as the namespace, and lightweight sub-topics (LiteTopics) act as session channels. Besides isolating namespaces, the parent topic provides centralized control over a business domain, which solves the service discovery problem when there are a large number of topics. LiteTopics are created on demand and are lightweight to manage. At the storage layer, RocksDB replaces the traditional ConsumeQueue files, which increases the number of LiteTopics a single broker can carry.

Requirement 2 → Broker-side consumer offset persistence

LiteTopic stores consumer offsets on the broker by using memory snapshots and incremental persistence. When an agent node resubscribes after an abnormal restart, it continues consuming from the saved state, which reduces the complexity of maintaining session state at the business layer. Google ADK's DatabaseSessionService solves the same kind of problem. The difference is that LiteTopic builds this capability into the message queue layer, so it does not need to be implemented separately at the business layer.

Requirement 3 → Event-driven scheduling + lifecycle self-management

With a large number of low-frequency, scattered lightweight topics, traditional full polling wastes resources on scans that find nothing. LiteTopic uses an event-driven Ready Set structure to wake up consumers precisely when a message is written or a readable event is triggered, instead of polling every topic. RocketMQ's consumption retry capability also provides a production-grade failure-handling mechanism. The subscription relationship is tracked at the client ID level rather than at the consumer group level, which reduces the overhead of cluster-level rebalancing when an agent node goes offline.
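To make the client-ID-level binding more concrete, here is a minimal in-memory sketch. It is not the broker's implementation: the class name `LiteBindingTable` and its methods are hypothetical, and the only point it illustrates is that a departing client touches its own binding records and nothing else.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of per-client LiteTopic bindings (not the RocketMQ API). */
class LiteBindingTable {
    // liteTopic -> clientId that currently owns the subscription
    private final Map<String, String> ownerByLiteTopic = new HashMap<>();

    /** Binds (or rebinds) one LiteTopic to one client. */
    void bind(String liteTopic, String clientId) {
        ownerByLiteTopic.put(liteTopic, clientId);
    }

    /** Removes only the records owned by the departing client and returns how many were removed. */
    int clientOffline(String clientId) {
        int before = ownerByLiteTopic.size();
        ownerByLiteTopic.values().removeIf(clientId::equals);
        return before - ownerByLiteTopic.size();
    }

    /** Returns the owning client, or null if the LiteTopic is unbound. */
    String ownerOf(String liteTopic) {
        return ownerByLiteTopic.get(liteTopic);
    }

    int size() {
        return ownerByLiteTopic.size();
    }
}
```

In a group-based model, the same event would typically force every member of the group to recompute its queue assignment. Here the bindings of the other clients are never read or rewritten.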

The core differences between LiteTopics and traditional topics are as follows:

| Dimension | Traditional Topic | LiteTopic |
| --- | --- | --- |
| Creation method | Requires pre-configuration | Automatically created when the first message is sent |
| Index storage | ConsumeQueue files (a separate file for each queue) | RocksDB (unified key-value management) |
| Supported scale | Tens of thousands (performance begins to decline) | Millions |
| Lifecycle | Manual creation and deletion | Dynamic creation and automatic reclamation |
| Consumer binding | Consumer group, cluster-level binding | Client ID, client-level binding |
| Rebalancing | Cluster-level global rebalancing | Only a single topic binding record is updated, which reduces scheduling overhead when there are many lightweight topics |
| Message distribution | Separate read requests for each topic | Merged reads, event-driven |

3. A Deep Dive into the Core Mechanisms

3.1 RocksDB: A Storage Foundation for Million-Level Coexistence

Each queue of a traditional topic corresponds to a ConsumeQueue file, so millions of topics mean millions of small files. The root cause of the performance degradation is fragmented file system reads and writes.

LiteTopic switches the index layer to RocksDB and manages consumer offsets as key-value pairs:

Write path unchanged: Messages are still appended sequentially to the CommitLog, so the high-performance write path is not affected.

Unified index management: A single RocksDB engine supports the coexistence of a large number of LiteTopics.

With the automatic TTL reclaim mechanism, developers do not need to manually delete the resources of historical sessions, and the resource usage caused by continuously accumulating sessions is reduced. This makes LiteTopic better suited to carrying a large number of lightweight, dynamic session channels and to long-running AI applications.
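The idea of one ordered key-value store holding the indexes of many LiteTopics, with idle ones reclaimed by TTL, can be sketched as follows. This is only an illustration under stated assumptions: a `TreeMap` stands in for RocksDB, the key layout (parent topic, LiteTopic, zero-padded offset) is invented for the example, and `LiteIndexSketch` is a hypothetical name rather than anything in the RocketMQ code base.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: a sorted map stands in for RocksDB; the key layout is an assumption. */
class LiteIndexSketch {
    private static final char SEP = 0; // NUL separator keeps "TASK_1" and "TASK_10" apart
    // key = parent + SEP + liteTopic + SEP + zero-padded queue offset; value = CommitLog position
    private final TreeMap<String, Long> index = new TreeMap<>();
    private final Map<String, Long> lastActiveMillis = new HashMap<>();

    private static String prefix(String parent, String lite) {
        return parent + SEP + lite + SEP;
    }

    void put(String parent, String lite, long queueOffset, long commitLogPos, long nowMillis) {
        String p = prefix(parent, lite);
        index.put(p + String.format("%020d", queueOffset), commitLogPos);
        lastActiveMillis.put(p, nowMillis);
    }

    /** Prefix scan: CommitLog positions of one LiteTopic, in offset order. */
    List<Long> scan(String parent, String lite) {
        String p = prefix(parent, lite);
        return new ArrayList<>(index.subMap(p, p + Character.MAX_VALUE).values());
    }

    /** Drops every LiteTopic idle for longer than ttlMillis; returns how many were reclaimed. */
    int reclaim(long nowMillis, long ttlMillis) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastActiveMillis.entrySet()) {
            if (nowMillis - e.getValue() > ttlMillis) {
                expired.add(e.getKey());
            }
        }
        for (String p : expired) {
            index.subMap(p, p + Character.MAX_VALUE).clear(); // range delete for one LiteTopic
            lastActiveMillis.remove(p);
        }
        return expired.size();
    }

    int entryCount() {
        return index.size();
    }
}
```

Because all entries live in one sorted keyspace, adding a LiteTopic adds keys rather than files, and reclaiming one is a range delete.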

3.2 Event-driven Ready Set: Reducing the Scheduling Overhead of Wasted Scans

The traditional Pop mode is driven by long polling: consumers periodically send requests to the broker, and the broker traverses all topics to check for new messages. Applied to millions of LiteTopics, this full scan is inefficient and consumes a large amount of CPU.

Version 5.5.0 introduces an event-driven read mechanism dedicated to LiteTopic:

● The broker maintains an internal "ready set".

● A LiteTopic is put into the ready set only when a new message is written or a readable event is triggered.

● When the event is triggered, the broker directly locates the subscribed client and wakes it up precisely, on demand.

This matches the design idea behind Google ADK's Long Running Agent, "wake up only when an external event arrives". The difference is that LiteTopic implements it as a native mechanism at the message queue layer.
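A minimal sketch of the ready-set idea follows. It is not the broker's actual data structure: `ReadySetSketch` and its method names are hypothetical, and the example only shows why the work per read depends on how many LiteTopics have new messages rather than on how many exist.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;

/** Hypothetical sketch of an event-driven ready set (not the broker's actual data structure). */
class ReadySetSketch {
    // Insertion-ordered and de-duplicated: a LiteTopic appears at most once,
    // however many messages arrive before it is drained.
    private final LinkedHashSet<String> ready = new LinkedHashSet<>();

    /** Called on the write path when a message lands in a LiteTopic. */
    synchronized void onMessageWritten(String liteTopic) {
        ready.add(liteTopic);
    }

    /** Removes and returns up to max ready LiteTopics, oldest event first. */
    synchronized List<String> drain(int max) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = ready.iterator();
        while (it.hasNext() && out.size() < max) {
            out.add(it.next());
            it.remove();
        }
        return out;
    }

    synchronized int size() {
        return ready.size();
    }
}
```

With a million idle LiteTopics and two active ones, `drain` visits two entries. A full scan would visit all of them on every poll.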

3.3 Consumer Offset Persistence and Session Continuation

Two key semantics of LiteTopic form the basis of stable agent communication:

Subscriptions follow single-session semantics: Unlike the traditional consumption mode, in which a consumer group shares consumption progress, a LiteTopic subscription is bound to a specific client connection, which reduces scheduling complexity when there are a large number of concurrent sessions. The broker delivers messages only to the bound connection. When an agent node goes online or offline, only its binding record is updated and no cluster-level rebalancing is triggered. This is the core guarantee of system stability with millions of concurrent sessions.

Broker-side consumer offset persistence: Memory snapshots and incremental data are stored on the broker. After an agent restarts abnormally, the broker automatically locates the breakpoint and continues delivery from there, so state management does not need to be implemented at the business layer.
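The "snapshot plus incremental data" recovery pattern can be illustrated with a small in-memory model. This is a sketch of the general technique, not of the broker's storage format: `OffsetStoreSketch` is a hypothetical class, and plain collections stand in for what would be on-disk data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of snapshot + incremental offset persistence; collections stand in for disk. */
class OffsetStoreSketch {
    private final Map<String, Long> live = new HashMap<>();            // in-memory offsets
    private Map<String, Long> snapshot = new HashMap<>();              // last full snapshot
    private final List<String[]> incrementalLog = new ArrayList<>();   // changes since the snapshot

    /** Records a committed offset and appends the change to the incremental log. */
    void commit(String liteTopic, long offset) {
        live.put(liteTopic, offset);
        incrementalLog.add(new String[] {liteTopic, Long.toString(offset)});
    }

    /** Writes a full snapshot and truncates the incremental log. */
    void takeSnapshot() {
        snapshot = new HashMap<>(live);
        incrementalLog.clear();
    }

    /** Simulates a restart: memory is lost, then rebuilt from the snapshot plus the replayed log. */
    void crashAndRecover() {
        live.clear();
        live.putAll(snapshot);
        for (String[] change : incrementalLog) {
            live.put(change[0], Long.parseLong(change[1]));
        }
    }

    /** Offset to resume from; 0 if the LiteTopic has never been consumed. */
    long resumeOffset(String liteTopic) {
        return live.getOrDefault(liteTopic, 0L);
    }
}
```

Snapshots keep recovery time bounded, while the incremental log keeps each commit cheap. The consumer itself stores nothing, which is the property the section describes.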

4. Run Multi-Agent Asynchronous Communication in Five Minutes

Take the "master agent distributes tasks → executor agent processes them → results are returned asynchronously" scenario as an example.

Note: For more information about configuration items, startup methods, and management commands, see the official Apache RocketMQ 5.5.0 documentation and examples. For complete examples, see LiteProducerExample and LitePushConsumerExample in the rocketmq-clients repository.

Step 1: Start RocketMQ 5.5.0

#1. Download and Unzip
wget https://mirrors.aliyun.com/apache/rocketmq/5.5.0/rocketmq-all-5.5.0-bin-release.zip
unzip rocketmq-all-5.5.0-bin-release.zip && cd rocketmq-all-5.5.0-bin-release

#2. Append the LiteTopic configuration to the broker.conf
cat >> conf/broker.conf << EOF

enableLmq=true
enableMultiDispatch=true
storeType=defaultRocksDB
EOF

#3. Start the NameServer
nohup sh bin/mqnamesrv &

#4. Start the Broker
nohup sh bin/mqbroker -n localhost:9876 -c conf/broker.conf &

#5. Start the Proxy
nohup sh bin/mqproxy -n localhost:9876 &

#6. Create the parent topic with the LITE message type
sh bin/mqadmin updateTopic -b localhost:10911 -t AGENT_TASK_NS -a +message.type=LITE

#7. Create a LiteGroup
sh bin/mqadmin updateSubGroup -b localhost:10911 -g executor-group -o true --attributes +lite.bind.topic=AGENT_TASK_NS

Step 2: The Master Agent Dispatches Tasks

// provider and clientConfig are assumed to be initialized elsewhere,
// for example with ClientServiceProvider.loadService().

// The parent topic is used as the namespace (shared by all LiteTopics).
static final String PARENT_TOPIC = "AGENT_TASK_NS";

Producer producer = provider.newProducerBuilder()
    .setClientConfiguration(clientConfig)
    .setTopics(PARENT_TOPIC)
    .build();

// setLiteTopic() specifies the dedicated channel of the target agent; no pre-creation is needed.
Message task = provider.newMessageBuilder()
    .setTopic(PARENT_TOPIC)
    .setLiteTopic("TASK_" + executorAgentId)
    .setBody(taskPayload.getBytes(StandardCharsets.UTF_8))
    .build();

producer.send(task);

Step 3: The Executor Agent Subscribes and Processes Tasks

LitePushConsumer worker = provider.newLitePushConsumerBuilder()
    .setClientConfiguration(clientConfig)
    .setConsumerGroup("executor-group")
    .bindTopic(PARENT_TOPIC)
    .setMessageListener(msg -> {
        // In rocketmq-clients, MessageView.getBody() returns a ByteBuffer, so decode it as UTF-8.
        String taskBody = StandardCharsets.UTF_8.decode(msg.getBody()).toString();
        String result = llmService.invoke(taskBody);
        replyProducer.send(buildReply(PARENT_TOPIC, "RESULT_" + sessionId, result));
        return ConsumeResult.SUCCESS;
    })
    .build();

// Dynamically subscribe to an exclusive LiteTopic; it is created automatically on the first subscription.
worker.subscribeLite("TASK_" + executorAgentId);
// After a crash and restart, the broker restores the consumer offset and resumes delivery from the breakpoint.

The parent topic AGENT_TASK_NS serves as a namespace. Each executor agent exclusively uses one LiteTopic to receive tasks, and results are returned asynchronously through another LiteTopic. Agents going online or offline dynamically does not trigger cluster rebalancing. After downtime, consumption resumes automatically from the saved consumer offset, and expired LiteTopics are cleaned up automatically based on their TTL.
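For readers who want to trace the task and result flow without a running broker, the following in-memory stand-in mimics the pattern. It is deliberately not the RocketMQ client API: `InMemoryLiteBus`, `send`, and `subscribe` are hypothetical names, delivery is synchronous, and nothing is persisted. It only shows that tasks and results travel over two separate, on-demand channels.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.function.Consumer;

/** Hypothetical in-memory stand-in for a parent topic; it is not the RocketMQ client API. */
class InMemoryLiteBus {
    private final Map<String, Queue<String>> channels = new HashMap<>();
    private final Map<String, Consumer<String>> listeners = new HashMap<>();

    /** Sending creates the channel on demand, mirroring "created when the first message is sent". */
    void send(String liteTopic, String body) {
        channels.computeIfAbsent(liteTopic, k -> new ArrayDeque<>()).add(body);
        deliver(liteTopic);
    }

    /** Subscribing also creates the channel, then flushes anything already queued. */
    void subscribe(String liteTopic, Consumer<String> listener) {
        channels.computeIfAbsent(liteTopic, k -> new ArrayDeque<>());
        listeners.put(liteTopic, listener);
        deliver(liteTopic);
    }

    /** Hands queued messages to the channel's listener, if one is registered. */
    private void deliver(String liteTopic) {
        Consumer<String> listener = listeners.get(liteTopic);
        Queue<String> queue = channels.get(liteTopic);
        while (listener != null && !queue.isEmpty()) {
            listener.accept(queue.poll());
        }
    }
}
```

A usage sketch: the master calls `bus.send("TASK_exec1", "summarize")` before the executor is even online, the master subscribes to `RESULT_s1`, and the executor later subscribes to `TASK_exec1` with a listener that calls `bus.send("RESULT_s1", ...)`. The queued task is delivered as soon as the executor subscribes.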

const consumer = await new LitePushConsumerBuilder()
  .setClientConfiguration({ endpoints, namespace, sessionCredentials })
  .setConsumerGroup(consumerGroup)
  .bindTopic(inboundTopic)
  .setMessageListener({
    async consume(messageView: MessageView) {
      await messageHandler({ accountId, messageView, cfg });
      return ConsumeResult.SUCCESS;
    }
  })
  .startup();

5. Alibaba Cloud Commercial Edition: Enterprise-level Enhancement on Open Source

Open source 5.5.0 provides the core models of LiteTopic: lightweight storage, event-driven distribution, and session resume on the broker side. Alibaba Cloud Message Queue for Apache RocketMQ provides the following enhancements to meet the requirements of enterprise-level AI production:

| Capability | Open Source 5.5.0 | Alibaba Cloud Commercial Edition |
| --- | --- | --- |
| Millions of LiteTopics | | |
| Automatically created when the first message is sent | | |
| Session Continuation | | |
| Event-Driven Ready Set | | |
| Wildcard subscription | | |
| Consume Suspend | | |
| Serverless-based Scalability | None | ✅ Millions of throughput auto scaling per minute, pay-as-you-go |
| EventBridge multi-source data ecosystem integration | None | ✅ AI Data Integration, real-time data insights, and real-time agent contexts |

LiteTopic also provides the Consume Suspend capability. With GPU resources in short supply, an AI service platform can use this capability to implement fine-grained, per-user traffic control ("a thousand users, a thousand faces") and achieve tiered services and priority-based resource scheduling. The dilemma of traditional throttling is that one user's slow task blocks message processing for other users, and a throttling wait blocks the consumption thread directly, which hurts overall throughput. In addition, pre-created High, Mid, and Low queues are too coarse-grained. Once each user or session has its own LiteTopic channel, a throttling policy can be applied to each one independently. The AI service platform implements smooth throttling with the Consume Suspend mechanism: instead of bluntly rejecting requests that exceed the limit, it postpones them to the next time window.

| Throttling level | Suspend duration | Usage notes |
| --- | --- | --- |
| Mild throttling | 50 ~ 200 ms | Burst throttling; smooths traffic into the next window |
| Moderate throttling | 200 ~ 1000 ms | Significantly reduces the consumption rate |
| Severe throttling | 1 ~ 5 s | Close to suspending consumption |
| Idle-time scheduling | 5 min ~ 8 hrs | Batch tasks processed during idle periods |
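One way a platform might turn the four rows of the table above into code is sketched below. The enum, the `suspendFor` method, and the 0.0 to 1.0 `pressure` signal are all assumptions made for illustration. How the chosen duration is actually handed to the Consume Suspend mechanism is not shown, because the source does not describe that API.

```java
import java.time.Duration;

/** Hypothetical mapping from throttling level to suspend duration, using the ranges in the table. */
enum ThrottleLevel {
    MILD(Duration.ofMillis(50), Duration.ofMillis(200)),
    MODERATE(Duration.ofMillis(200), Duration.ofMillis(1000)),
    SEVERE(Duration.ofSeconds(1), Duration.ofSeconds(5)),
    IDLE_TIME(Duration.ofMinutes(5), Duration.ofHours(8)); // the 5 min ~ 8 hrs row

    final Duration min;
    final Duration max;

    ThrottleLevel(Duration min, Duration max) {
        this.min = min;
        this.max = max;
    }

    /**
     * Picks a suspend duration inside this level's range by linear interpolation.
     * pressure is an assumed load signal in 0.0..1.0; values outside that range are clamped.
     */
    Duration suspendFor(double pressure) {
        double p = Math.max(0.0, Math.min(1.0, pressure));
        long spanMillis = max.toMillis() - min.toMillis();
        return min.plusMillis(Math.round(spanMillis * p));
    }
}
```

For example, `ThrottleLevel.MODERATE.suspendFor(0.5)` yields 600 ms, the midpoint of the 200 to 1000 ms range. Because each user or session has its own channel, a different level can be chosen per channel without delaying anyone else.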

Bailian, the Alibaba Cloud model service platform, has already built asynchronous event workflows on LiteTopic in its gateway, and uses per-user flow control to manage peak traffic and computing power scheduling for AI inference requests in production.

6. Building AI-native Messaging Infrastructure with the Community

Version 5.5.0 is an important step for Apache RocketMQ towards becoming AI-native. Anthropic and Google define asynchronous agent communication standards at the protocol and framework layers, while message queues provide runtime guarantees at the infrastructure layer. The two layers complement each other.

It is worth noting that MCP has set up a Triggers & Events Working Group to add a server-initiated push capability, which is exactly what the message queue publish/subscribe model provides natively. At the same time, MCP is promoting a Server Registry (similar to the A2A Agent Card), which suggests that MCP is evolving towards a general-purpose agent communication protocol.

The community is exploring the following directions for the next phase:

LiteTopic as an MCP session externalization layer: Persistent, asynchronously deliverable session state management for MCP Streamable HTTP.

Interconnection between LiteTopic and MCP Tasks: Exploring a RocketMQ transport for MCP that covers transmission, task scheduling, and status maintenance.

Integration with open source general-purpose agent frameworks such as Openclaw, Hermes Agent, and QwenPaw: LiteTopic extends the agent channel and upgrades conversational agents to event-driven agents that are more deeply embedded in enterprise business processes.

If you are interested in RocketMQ for AI, you are welcome to participate in the following ways:

● Participate in RIP-83 and subsequent RIP discussions and contribute design ideas for AI agent communication scenarios.

● Submit an issue or PR, or share a use case on GitHub.

● Join the DingTalk group for technical communication.

References

● Apache RocketMQ 5.5.0 Release Notes: https://github.com/apache/rocketmq/releases/tag/rocketmq-all-5.5.0
● RIP-83 proposal (original): https://github.com/apache/rocketmq/wiki/RIP%E2%80%9083-Lite-Topic:-A-New-Message-Model
● Alibaba Cloud RocketMQ for AI Solution: https://www.aliyun.com/solution/tech-solution/rocketmq-for-multi-agent-communication
● Alibaba Cloud RocketMQ for AI Content Collection: https://www.aliyun.com/activity/middleware/ai-mq
● Apache RocketMQ Chinese community: https://rocketmq-learning.com/
