Apache RocketMQ EventBridge: Build the Next Generation of Event-driven Engines

By Lin Shen, Apache RocketMQ PMC and Alibaba Cloud Intelligent Technology Expert

What Is an Event?

What is an event? When it comes to event-driven architecture, we often focus on the word architecture. However, the true charm of event-driven architecture lies in the word event. Today, let's explore the concept of events. Previously, RocketMQ was known as a messaging engine. So why did we introduce events in the 5.0 version? And what is the difference between a message and an event?

According to the dictionary, an event is something that has happened, especially something significant. For example, my phone just rang. That is an event: it has already occurred.

But what sets events apart from messages? This definition of events may seem less clear. Can events be understood as messages? For instance, if someone sends me a text message, is it an event or a message?

To understand the relationship between messages and events, let's refer to the diagram below. There are two types of messages: command messages and event messages.

What is a command message? See the picture on the left below. A command message is an operational command that an external system sends to this system.
What is an event message? See the picture on the right below. An event message is generated by the system after an external command has caused a change in it.

Events differ slightly from messages. An event can be considered a special type of message. So, what makes an event special? Four features set it apart:

Feature 1: Occurred and Immutable

An event must have already occurred. What does occurred mean? It means the event is immutable. We cannot change the past unless we have superpowers. This feature is crucial because it allows us to trust and analyze events with confidence. Received events have definitely happened and cannot be modified.

In contrast, a command message represents an expectation. We know that expectations may or may not be fulfilled. For example:

• Turn on the kitchen lights.
• Press the doorbell.

These are commands that are expected to happen, but whether they actually occur is unknown.
Events, on the other hand, are things that have happened, such as:

• The kitchen lights have been turned on.
• Someone has rung the doorbell.
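The difference between an expectation and a fact can be sketched in code. The following Python snippet is a minimal illustration, not part of RocketMQ; the names `TurnOnLight`, `LightTurnedOn`, and `handle` are hypothetical. The event is modeled as a frozen dataclass, so any attempt to modify it after creation fails.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone


@dataclass
class TurnOnLight:
    """A command: a request that may or may not be fulfilled (hypothetical name)."""
    room: str


@dataclass(frozen=True)
class LightTurnedOn:
    """An event: a fact that has already happened, so it is read-only (hypothetical name)."""
    room: str
    occurred_at: datetime


def handle(command: TurnOnLight, bulb_works: bool):
    """Return an event only if the command actually took effect."""
    if not bulb_works:
        return None  # the expectation was not fulfilled, so no event exists
    return LightTurnedOn(command.room, datetime.now(timezone.utc))


event = handle(TurnOnLight("kitchen"), bulb_works=True)
try:
    event.room = "garage"  # attempting to rewrite the past
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Here `mutated` ends up `False`: the command can fail to produce an event, but once an event exists it cannot be changed.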

Feature 2: Unexpected

The second feature of events is that they carry no expectations. An event is an objective description of a change in the state or property value of something, and it says nothing about how it should be handled. A command, in contrast, expects the system to make a change. Let's take an example. When a traffic light changes from green to red, that is an event. The event itself does not instruct pedestrians or cars to stop crossing the road. It is the traffic laws that give the light its meaning and set the rules for responding to it.

Therefore, a system generally does not send events directly to one designated system. Instead, it reports them to an event center, which holds the events reported by many different systems. Each system tells the event center what events it generates, what format those events take, and which of them other systems can subscribe to. The true value of an event lies with its consumers: a consumer that wants to see what has changed in a system subscribes to the relevant events. In this sense, events are consumer-driven.

What is the difference between consumer-driven events and messages? For command messages, the sender and the subscriber agree on the exchange directly, without a third party, and usually capture that agreement in documents or code. This process is often driven by the producer.

To make an analogy, events are like a market economy. The specific value and significance of goods largely depend on their consumers. We can see all kinds of events in the system as goods on the shelf. In contrast, command messages are like a planned economy. They are born with a strong purpose and determine who can consume them.

Feature 3: Naturally Ordered and Unique

The third feature of events is that they are naturally ordered and unique. Two events A and B cannot happen to the same entity at exactly the same moment, so one must come before the other; if two events do appear to occur together, they must belong to different event types. Astute readers may have noticed an additional attribute here. Because events are naturally ordered and strongly bound to a specific moment on the timeline, no two of them coincide, which makes each event unique.

If we encounter two events with the same content, it means that the same thing happened twice, once earlier and once later. This is very valuable for handling eventual data consistency and for analyzing system behavior. What we see is not just the final result of the system, but the series of intermediate steps that led to it.
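As a minimal sketch of this idea, the Python snippet below rebuilds a timeline from received events. The `Event` fields and the `timeline` helper are hypothetical, and the code assumes every timestamp uses the same UTC ISO 8601 format, so that string order matches chronological order. Copies that share an `id` are treated as redeliveries of one occurrence, while events with equal content but different `id` values are kept as separate occurrences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    id: str       # unique identifier of this occurrence
    time: str     # UTC ISO 8601 timestamp; same-format strings sort chronologically
    content: str


def timeline(received):
    """Drop redelivered copies (same id) and order the rest by occurrence time."""
    unique = {e.id: e for e in received}
    return sorted(unique.values(), key=lambda e: e.time)


received = [
    Event("e2", "2023-01-01T10:05:00Z", "doorbell rang"),
    Event("e1", "2023-01-01T10:00:00Z", "doorbell rang"),
    Event("e2", "2023-01-01T10:05:00Z", "doorbell rang"),  # redelivery of e2
]
history = timeline(received)
```

`history` contains two events, `e1` followed by `e2`: the doorbell rang twice, and the duplicate delivery of `e2` was discarded.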

Feature 4: Figurative

The fourth feature of events is that they are figurative, that is, concrete and detailed. Events record the scene as comprehensively as possible because they do not know how consumers will use them. Therefore, they provide as much detail as possible: when the event happened, who generated it, what type of event it is, whom or what it concerns, what uniquely identifies it, and what its content is.
Compare this with ordinary messages. Because their upstream and downstream are usually known in advance, such messages are often trimmed down for performance and transmission efficiency and carry only what their known consumers need.

What Is Event-driven Architecture?

Let's discuss what event-driven architecture is. To understand it quickly, let's use a simple example. We all know that after the trading system completes an order transaction, there are many systems that "need" to be aware of this order information.

• Logistics system: It needs this order information to arrange delivery.

• Points system: It needs this order information to recalculate the user's points.

• Marketing system: It needs this order information to calculate the real-time transaction volume of the day.

Here, we have three ways to integrate the upstream trading system with the downstream logistics, points, and marketing systems.

Upstream and Downstream Integration

Method 1: Actively Call

One of the simplest implementations is for the trading system to call each downstream system in turn and pass it the order information. This method is shown in the following figure.

But we all know that this design is not good. As more systems are added, development costs rise, and if one downstream system fails and the failure is not handled properly, it can easily disrupt delivery to the other systems.

Method 2: Asynchronous Message Decoupling

A natural solution is to send the order information to a message broker service. Then, the logistics, points, and marketing systems only need to subscribe to these transaction order messages from the broker. This method is very simple and clear.
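The contrast between Method 1 and Method 2 can be sketched with a toy example. Everything below is hypothetical and in-memory: `Broker` is not a RocketMQ client, and the `logistics` function is written to fail on purpose so that the effect on the `points` system is visible.

```python
from collections import defaultdict, deque


def logistics(order):
    raise RuntimeError("logistics is down")  # simulated downstream failure


points_seen = []


def points(order):
    points_seen.append(order["id"])


def direct_call(order):
    """Method 1: the trading system calls each downstream system in turn."""
    for downstream in (logistics, points):
        downstream(order)  # a failure here stops the calls that follow


class Broker:
    """Method 2: a toy in-memory broker with one queue per subscriber."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def subscribe(self, name):
        return self.queues[name]  # creates the queue on first access

    def publish(self, message):
        for queue in self.queues.values():
            queue.append(message)


order = {"id": "order-1", "amount": 30}

try:
    direct_call(order)
except RuntimeError:
    pass
reached_by_direct_call = list(points_seen)  # points() was never reached

broker = Broker()
logistics_queue = broker.subscribe("logistics")
points_queue = broker.subscribe("points")
broker.publish(order)           # the trading system is done at this point
points(points_queue.popleft())  # the points system consumes at its own pace
```

With the direct call, the logistics failure prevents the points system from ever seeing the order. With the broker, the order waits in the logistics queue while the points system processes its own copy independently.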

Method 3: Event-driven Architecture

How does this look in an event-driven architecture? The trading system still sends transaction orders to the broker service, but the downstream services no longer need to subscribe to the transaction orders in the broker. Instead, the broker actively pushes the orders to the downstream systems. At this point, you may have some questions. This seems similar to Method 2. Is event-driven architecture simply a change from pull mode to push mode?

Here, let's focus on the upstream and downstream aspects to see what changes event-driven architecture brings.

Downstream-oriented

1. Expansion of Coupling

In many cases, the downstream marketing system relies on more than the order data generated by a single trading system. For example, it may also need trading orders from Company A, Company B, and Company C in order to calculate a current summary value in real time. In the asynchronous message decoupling architecture, the customer's marketing system needs to do two things:

First, subscribe to three broker services to obtain transaction order data from Company A, Company B, and Company C.

Second, since the transaction order data formats of Company A, Company B, and Company C are different, the customer's marketing system needs to adapt to these three formats. The formats need to be converted into the expected data format within the marketing system before processing.

Furthermore, if the order data format of a certain platform changes, the customer's marketing system must be updated accordingly.

What does the customer need to do with the marketing system in an event-driven architecture? Nothing except specify the order event format it requires and provide an API. The EventBroker converts the upstream events into the data format required by the customer's marketing system and sends them to that API. No matter how many upstream systems the trading orders come from or how those external systems change, the marketing system remains unchanged.
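A minimal sketch of this idea follows. The payload shapes, the `TRANSFORMS` table, and the `MarketingApi` class are all hypothetical; the point is only that the per-company conversion logic lives on the broker side, while the marketing system exposes a single API in its own format.

```python
# Hypothetical upstream payloads: each company uses its own field names and units.
UPSTREAM_EVENTS = [
    {"source": "company-a", "data": {"orderId": "A-1", "amountCents": 1250}},
    {"source": "company-b", "data": {"order_no": "B-7", "total": "8.00"}},
    {"source": "company-c", "data": {"id": "C-3", "price": 4.5}},
]

# Conversion rules kept in the broker, one per upstream source.
TRANSFORMS = {
    "company-a": lambda d: {"order_id": d["orderId"], "amount": d["amountCents"] / 100},
    "company-b": lambda d: {"order_id": d["order_no"], "amount": float(d["total"])},
    "company-c": lambda d: {"order_id": d["id"], "amount": float(d["price"])},
}


class MarketingApi:
    """The downstream system: one API, one format, no knowledge of upstreams."""

    def __init__(self):
        self.total = 0.0

    def record_order(self, order_id: str, amount: float):
        self.total += amount


def deliver(event, api):
    """Broker side: convert to the target format, then call the target API."""
    api.record_order(**TRANSFORMS[event["source"]](event["data"]))


api = MarketingApi()
for event in UPSTREAM_EVENTS:
    deliver(event, api)
```

If Company B changes its order format, only its entry in `TRANSFORMS` changes; `MarketingApi` stays as it is.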

2. Coupling Direction

Let's analyze the coupling relationship of these three methods. It's important to note that coupling is directional.

• Method 1: Direct call. Upstream A depends on downstream B. (If downstream B changes, upstream A needs to change synchronously.)
• Method 2: Asynchronous decoupling of messages. B depends on A. (If the data format of upstream A changes, downstream B needs to change synchronously.)
• Method 3: Event-driven. A does not depend on B, and B does not depend on A. (All couplings are handled and maintained by the intermediate Event Broker.)

3. What Are the Factors that Affect the Stability of the System?

In addition to reducing dependence, what is the most important concern when developing downstream systems? For most business scenarios, the most important factors are low development and maintenance costs, stability, and reliability. However, have you noticed that in the asynchronous message decoupling architecture there are two entry points through which changes reach the downstream system (the brown part in the image)? One is the API and the other is message subscription.

Having two entry points means that changes can reach the system from two directions, which is very troublesome. When we develop this system and maintain its stability, we have to handle authentication, auditing, security, throttling, testing, and maintenance for both entry points. This is costly and error-prone.

4. Testability and Maintainability

In the event-driven architecture pattern, the downstream system only needs to provide an API entry.

• External: This API is not only used to receive upstream events but also to communicate with other systems.
• Internal: This API is designed based on the current domain model of the downstream system and does not need to be adapted to any other system.

Therefore, the entire system is kept simple. The advantage of simplicity is that when we need to make changes to the system, we only need to ensure that the provided API is reliable. This greatly improves testability and maintainability.

5. Serverless

Another great advantage of event-driven services is the ability to use events to drive serverless services and consume data. In the scenario of trading orders:

• For small businesses with few orders, it is not efficient to deploy a standalone points system service that runs continuously. In event-driven mode, the downstream serverless service is triggered only when a transaction order event occurs, so costs can be significantly reduced through pay-as-you-go billing.

• Merchants with large order volumes, especially during peak traffic periods like holidays, can greatly improve the system's peak processing capacity by using event-driven mode to trigger serverless computing.

• Furthermore, in the event of stability issues with the downstream system due to abnormal events, the event-driven serverless mode can provide good isolation and ensure fast recovery.

Serverless has become an unstoppable trend in the cloud-native era, and event-driven and serverless are a perfect combination.

Upstream-oriented

SaaS Integration

The previous discussion focused on the downstream, but what is the significance of event-driven for the upstream system? Let's think about what the upstream system is most concerned about. It is not primarily concerned with system stability and decoupling. That's not to say that these things are not important, but for the upstream system, there is little difference between sending data to a message broker or an event broker. So what is most important for the upstream system? Essentially, the upstream system wants to integrate with more systems to create its own niche.

How can we understand this? Let's take the access control system as an example.

The access control system is sold to different companies and needs to synchronize employee clock-in records with the ERP systems of these companies. If the system has to integrate and adapt to each company's ERP system one by one, it would be very costly and almost unrealistic. On the other hand, if it doesn't integrate, many companies may not purchase the system.

Therefore, for the upstream system, the most important thing is to integrate easily with products within the ecosystem. In the event-driven architecture mode, the access control system only needs to publish employee clock-in records as events and hand them over to the event center. The event center is responsible for integrating and connecting with the downstream ecosystem.

Additionally, the access control system also needs to know about the onboarding events of new employees in order to recognize them in a timely manner. By adopting the event-driven mode, the access control system can easily establish its own niche from scratch in the SaaS ecosystem.

How to Build a Good Event-driven Engine?

Having discussed events and event-driven architecture, you have seen the charm of event-driven architecture and why more and more companies prefer it.

Finally, let's talk about the capabilities required to build a good event-driven engine. How does RocketMQ EventBridge accomplish this?

What Capabilities Are Needed?

First, we need to establish an event standard. Since an event is not intended for one person but for everyone, it is important to standardize the definition of events so that everyone can understand them at a glance.

Second, we need an event center where all kinds of events registered by systems are stored. Similar to a marketplace filled with various products, the event center contains different types of events. Anyone can browse through them, even if they don't buy anything, and anyone who finds an event they need can "purchase" it by subscribing.

Third, we need an event format to describe the specific content of events. This is equivalent to a sales contract in a market economy. The format of events sent by producers must be determined and should not frequently change. The format in which consumers receive events must also be determined. Otherwise, confusion will arise in the entire market.

Fourth, we need to provide consumers with the ability to deliver events to the target. Before delivery, events can be filtered and transformed to adapt to the parameter format required by the target API. We refer to this process as a subscription rule.
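A subscription rule of this kind can be sketched as three parts: a filter, a transformation, and a target. The `SubscriptionRule` class below is a hypothetical illustration in plain Python and does not reflect the actual RocketMQ EventBridge rule syntax; the event types and field names are likewise made up for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SubscriptionRule:
    """Hypothetical rule: which events to pick, how to reshape them, where to send them."""
    matches: Callable[[dict], bool]
    transform: Callable[[dict], dict]
    target: Callable[[dict], None]

    def apply(self, event: dict) -> bool:
        if not self.matches(event):
            return False  # filtered out, nothing is delivered
        self.target(self.transform(event))
        return True


delivered = []  # stands in for the consumer's API

rule = SubscriptionRule(
    matches=lambda e: e["type"] == "order.paid" and e["data"]["amount"] >= 100,
    transform=lambda e: {"orderId": e["subject"], "points": e["data"]["amount"] // 10},
    target=delivered.append,
)

events = [
    {"type": "order.paid", "subject": "o-1", "data": {"amount": 250}},
    {"type": "order.paid", "subject": "o-2", "data": {"amount": 40}},
    {"type": "order.cancelled", "subject": "o-3", "data": {"amount": 900}},
]
results = [rule.apply(e) for e in events]
```

Only the first event passes the filter, and the target receives it already reshaped into the parameters it expects.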

Fifth, we also need a place to store events, which is the event bus in the middle.

How to Describe an Event

The previously mentioned event standard is crucial. The event standard is equivalent to the language used for communication between different systems. If the language is not universal, communication problems may arise. We recommend using CloudEvents, the open-source specification from CNCF, which has been widely adopted by many companies and has become a de facto standard. CloudEvents is also very simple. Below is a simple example. For more information, refer to the official website [1]:

{
  "specversion":"1.0",
  "type":"com.github.pull_request.opened",
  "source":"https://github.com/cloudevents",
  "subject":"123",
  "id":"A234-1234-1234",
  "time":"2018-04-05T17:31:00Z",
  "comexampleextension1":"value",
  "comexampleothervalue":5,
  "datacontenttype":"text/xml",
  "data":"<much wow=\"xml\"/>"
}
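In CloudEvents 1.0, four context attributes are required: `specversion`, `id`, `source`, and `type`. The others in the example, such as `subject`, `time`, and `datacontenttype`, are optional, and `comexampleextension1` and `comexampleothervalue` are extension attributes. As a minimal sketch, the Python snippet below parses the example above and checks for the required attributes; the helper name `missing_attributes` is hypothetical, and a real project would more likely rely on a CloudEvents SDK.

```python
import json

# The four context attributes that CloudEvents 1.0 marks as REQUIRED.
REQUIRED_ATTRIBUTES = ("specversion", "id", "source", "type")


def missing_attributes(event: dict) -> list:
    """Return the required context attributes that are absent or empty."""
    return [name for name in REQUIRED_ATTRIBUTES if not event.get(name)]


raw = r'''
{
  "specversion": "1.0",
  "type": "com.github.pull_request.opened",
  "source": "https://github.com/cloudevents",
  "subject": "123",
  "id": "A234-1234-1234",
  "time": "2018-04-05T17:31:00Z",
  "comexampleextension1": "value",
  "comexampleothervalue": 5,
  "datacontenttype": "text/xml",
  "data": "<much wow=\"xml\"/>"
}
'''

event = json.loads(raw)
complete = missing_attributes(event)  # nothing is missing in the full example

without_id = {k: v for k, v in event.items() if k != "id"}
incomplete = missing_attributes(without_id)  # reports the removed attribute
```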

Event Center

Furthermore, it is essential to have an event center. The event center plays a crucial role in the event-driven architecture. It can be compared to a large hypermarket in a market economy, where all events have detailed instructions for use. Anyone can come in and have a look. Once they decide, they can immediately purchase the product.

When it comes to managing the event center, there is much to learn from API management. This includes registration, schema description, sampling, documentation, SDK, testing, and monitoring. Similarly, an event in the event center requires registration, schema description, sampling, documentation, CodeBinding, testing, and monitoring.

By following this approach, consumers can understand what the event is, how to use it, and feel confident when utilizing it.

Schema

An event's schema describes its attributes and their meanings. Why do we need to introduce a schema? On one hand, we want downstream users to understand the event's format better so they can use it efficiently. On the other hand, we want to constrain the formats sent by the upstream: what is sent must conform to the schema, any change to the format must remain compatible, and once a contract is established, it cannot be easily modified. We recommend using JSON Schema and OpenAPI 3.0 for this purpose.
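To illustrate how a schema acts as a contract, the sketch below describes the data of a hypothetical "order paid" event with a small JSON Schema and checks two payloads against it. The `violations` function is a deliberately tiny stand-in that understands only the `required` keyword and per-property `type`; it is an assumption made for illustration, and a full JSON Schema validator library would be used in practice.

```python
# A JSON Schema for the data of a hypothetical "order paid" event.
ORDER_PAID_SCHEMA = {
    "type": "object",
    "required": ["orderId", "amount"],
    "properties": {
        "orderId": {"type": "string"},
        "amount": {"type": "number"},
        "coupon": {"type": "string"},
    },
}

# Mapping from JSON Schema type names to Python types (only the subset used here).
PYTHON_TYPES = {"string": str, "number": (int, float), "object": dict}


def violations(data: dict, schema: dict) -> list:
    """Check only the `required` keyword and the `type` of each listed property."""
    problems = [f"missing: {name}" for name in schema.get("required", []) if name not in data]
    for name, rule in schema.get("properties", {}).items():
        if name in data and not isinstance(data[name], PYTHON_TYPES[rule["type"]]):
            problems.append(f"wrong type: {name}")
    return problems


ok = violations({"orderId": "o-1", "amount": 99.5}, ORDER_PAID_SCHEMA)
bad = violations({"amount": "99.5"}, ORDER_PAID_SCHEMA)
```

The first payload satisfies the contract. The second is rejected twice over: `orderId` is missing, and `amount` arrives as a string rather than a number.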

Event Filtering and Transformation

The RocketMQ event-driven engine provides various methods for event filtering and transformation. Instead of going into detailed explanations here, you can refer to the figure above for more information.

Technical Architecture of RocketMQ EventBridge

Finally, we have introduced a new event-driven product called EventBridge in RocketMQ. The architecture can be divided into two parts: the control plane at the top and the data plane at the bottom.

Control plane: The control plane focuses on upstream event management. Through EventSource, we manage events generated by the upstream, enabling easy discovery and understanding of necessary events. Regarding downstream users, we utilize EventRule to facilitate event conversion into the required formats and push them accordingly.

The EventBus in the middle stores events. At the bottom, we have our own RocketMQ broker.

Data plane: The data plane serves as an event channel. In addition to sending events to the EventBus through the API, we can also use Source Connector to pull events into the EventBus. Once an EventRule is created, consumers can use the Sink Connector to push the event to the desired destination.
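The flow through the data plane can be sketched with a toy in-memory pipeline. The class and function names below (`EventBus`, `put_event`, `source_connector`, `run_rule`) are hypothetical and are not the RocketMQ EventBridge API. The sketch is also simplified: it drains a single queue for one rule, whereas the real EventBus stores events in RocketMQ.

```python
from collections import deque


class EventBus:
    """Toy stand-in for the EventBus: an in-memory queue of events."""

    def __init__(self):
        self.events = deque()

    def put_event(self, event):
        self.events.append(event)


def source_connector(external_records, bus):
    """Pull records from an external system and put them on the bus as events."""
    for record in external_records:
        bus.put_event({"type": "clock-in", "data": record})


def run_rule(bus, matches, sink):
    """Drain the bus, pushing every matching event to the sink."""
    while bus.events:
        event = bus.events.popleft()
        if matches(event):
            sink(event)


bus = EventBus()
# Path 1: a producer sends an event to the bus through the API.
bus.put_event({"type": "order.paid", "data": {"orderId": "o-1"}})
# Path 2: a source connector pulls records in from an external system.
source_connector([{"employee": "e-42"}], bus)

pushed = []  # stands in for the destination behind a sink connector
run_rule(bus, matches=lambda e: e["type"] == "clock-in", sink=pushed.append)
```

Both entry paths end up on the same bus, and only the events selected by the rule reach the destination.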

Furthermore, we will provide event tracking, event playback, event analysis, and event archiving.

You are welcome to join us.

RocketMQ EventBridge: https://github.com/apache/rocketmq-eventbridge

Community