AI-native Applications Based on Event-driven Architectures

By Hanxie

AI applications face many challenges when they are commercialized. For example, AI applications are required to deliver services in a more efficient manner, generate more real-time and accurate results, and provide more user-friendly experience. Traditional architectures support only synchronous interaction and cannot meet the preceding requirements. This article describes how to handle the preceding challenges based on event-driven architectures.

AI Application Scenarios

Before we delve into the combination of event-driven architectures and AI, let's review AI applications in the current situation.

Based on the application architectures, AI applications can be roughly divided into the following three categories:

(1) Extension applications developed based on foundation models, such as ChatGPT (text generation), Stable Diffusion (image generation), and CosyVoice (speech generation). In most cases, such applications provide relatively atomic services with model capabilities as the core.

(2) Intelligent applications based on knowledge bases, such as Langchain ChatChat. Such applications use large language models (LLMs) as the core and are developed based on the retrieval-augmented generation (RAG) technology, which are applicable to a wide range of business scenarios.

(3) Agent applications that use LLMs as the interaction center. Such applications can connect to external applications or services by calling tools and work in complex forms, such as collaboration among multiple agents. Agent applications are the most creative applications for enterprises.

What is AI-native?

The word "native" represents a broad understanding of a concept. For example, when we talk about native mobile applications, we can immediately associate the word with the applications on mobiles. When we talk about cloud-native applications, a large number of developers can immediately think of containerized applications. However, when we talk about AI-native, except for several leading applications such as ChatGPT and Midjourney, we cannot define AI-native applications as we define native mobile applications. We still cannot draw a conclusion today. We just deduce the direction of AI-native based on facts, hoping to gradually concrete the image of AI-native in our mind.

Changes Brought by AI to Application Architectures

After AI capabilities are applied to application architectures, the application architectures are greatly changed. Programming paradigms such as RAG and agent are used and traditional workflows become different after AI nodes are added to the workflows.

An AI Application Architecture That Uses RAG

An AI Application Architecture That Uses an Agent

A Workflow with AI Nodes

Change Trends in AI Applications

Based on the evolution of the forms of products developed by well-known AI enterprises, AI applications of the preceding three categories are integrated and relatively separated. AI applications gradually evolve into the applications in which an agent controls and manages the other applications in a centralized manner,

such as OpenAI Canvas, Claude Artifacts, and Vercel v0. Such applications share the common features, including intelligent kernel, multi-modal capability, and language user interface (LUI).

From another perspective, users will pay for AI-native applications only if the AI applications can provide better user experience. Distributed capabilities of foundation models and multi-modal capabilities can improve the user experience only in specific scenarios. However, in some aspects, the user experience provided by such capabilities is even inferior to that provided by traditional applications. Therefore, we need to integrate conversational interaction, AI models, and multi-modal capabilities to create applications that provide better user experience than that provided by traditional applications.

Create an AI-native Application by Using an Event-driven Architecture

We use the event-driven architecture because the sequential architecture may fail to meet business requirements in practices, and the event-driven architecture is more advanced than the sequential architecture.

Challenges in Creating an AI-native Application by Using a Traditional Sequential Architecture

Service calls in sequence cannot guarantee inference experience

The amount of time consumed by a model service for inference is much greater than that consumed for calling a traditional network service. For example, if you use Stable Diffusion to generate an image from text, the image can be generated within seconds even if the service is optimized by using algorithms. If the number of concurrent requests is large, server downtime may occur. In addition, the synthesis of speeches or virtual avatars may take minutes. In this case, the sequential architecture is obviously not suitable. The event-driven architecture can quickly respond to requests of users and call inference services based on the business requirements. This ensures user experience and reduces the risk of system downtime.

Service calls in sequence cannot meet the requirements for real-time data construction

In an intelligent conversational search system, the quality of the generated results depends on the data in the system. The timeliness and accuracy of data retrieved for conversational search greatly affect the user experience of the intelligent conversational search system. In the system architecture, conversational search and data updates are separated. It is impractical to manually update large amounts of data. We can update data in an efficient manner by configuring scheduled tasks and building workflows that are used to update data in knowledge bases. The event-driven architecture has obvious advantages in this scenario.

Service calls in sequence do not support two-way communications

In conversational search scenarios, services that behave like a human can gain favorable impression of users and expand business opportunities. The traditional Q&A application architecture is relatively rigid. AI applications use message queues to transmit information and notify users in an efficient manner. The AI chatbot can identify the intent of users and proactively communicate with users to keep the loyalty of users to services.

Practices for Creating AI-native Applications by Using Event-driven Architectures

I will share some practices for creating AI-native applications by using event-driven architectures.

Asynchronous inference architecture for Stable Diffusion

We talked about the issues that may occur when users use the text-to-image model of Stable Diffusion. To resolve these issues, we adopt an event-driven architecture. In the event-driven architecture, we use Function Compute and Simple Message Queue (SMQ, formerly MNS) to create an asynchronous inference architecture for Stable Diffusion. When a user sends a request to the text-to-image model of Stable Diffusion, the request is sent to the API proxy function through the Function Compute gateway. The API proxy function tags and authenticates the request, sends the request to the SMQ queue, and then records the metadata of the request and inference information to Tablestore. The inference function consumes the data based on the task queue, schedules the GPU instance to start Stable Diffusion, returns the image generation result, and updates the request status after the image is generated. The client notifies the user of querying the inference process by sending polling requests on the page.

Voice Agent application that supports real-time conversation

_10

This is a relatively complex application. Users can communicate with the intelligent conversational search service in real time based on speeches and receive questions that are proactively asked by the intelligent conversational search service.

_11

The application uses an event-driven architecture. An ApsaraMQ for RocketMQ client is installed on the Real-Time Communication (RTC) server and the RTC server subscribes to a topic that serves as the center topic in ApsaraMQ for RocketMQ. A scheduled task is triggered to analyze the intent of users,generate messages, and sends the messages to the topic. Then, the ApsaraMQ for RocketMQ client consumes the messages and proactively ask questions.

Real-time data streams in a knowledge base of a Voice Agent application

_12

The knowledge base is automatically updated based on the change data capture (CDC) policy. The automatic update process is triggered by a condition such as importing data from an external system data source or uploading documents to Object Storage Service (OSS). The data is split into chunks, vectorized, and stored in a vector database and a full-text search database.

Event-driven Architecture for AI-native Applications

_13

Finally, I want to share a set of solutions with AI application developers. Select foundation model services such as Ollama, ComfyUI, CosyVoice, and Embedding in Cloud Application Platform (CAP) of Alibaba Cloud and host the model services in CAP. Use services such as ApsaraMQ for RcoketMQ, ApsaraMQ for Kafka, SMQ, and EventBridge to build data stream pipelines and a message center. Use a framework such as Spring AI Alibaba to develop backend services in your local computer to implement RAG and agent capabilities. Use the Next.js framework to build an interface of the frontend service. Then, deploy the developed frontend and backend services to CAP by using Serverless Devs and tune and access the services. Finally, use cloud-native gateways to protect the production of the services. To perform long-term O&M operations on the knowledge base or agent, use Application Real-Time Monitoring Service (ARMS) to monitor metrics of the services.

This article is derived from the topic shared at the Salon for Cloud-native and Open Source Developers during the AI Application Engineering Session held in Hangzhou, China.

5113120674444587 December 11, 2024 at 9:27 am

I want to share a set of solutions with AI application developers.

: 5158455706477346 December 12, 2024 at 7:28 am

Name、Email

0

: 5158455706477346 December 12, 2024 at 8:53 am

Great write-up! Kristhian. You’ve demonstrated excellent clarity and insight, and that too in such a concise and impactful way.

1 0

: 5158455706477346 December 13, 2024 at 2:14 am

Great write-up! Kristhian. You’ve demonstrated excellent clarity and insight, and that too in such a concise and impactful way.Sure! The translation of that sentence is: "I really like this article."

0 0

Community

AI-native Applications Based on Event-driven Architectures

AI Application Scenarios

What is AI-native?

Changes Brought by AI to Application Architectures

An AI Application Architecture That Uses RAG

An AI Application Architecture That Uses an Agent

A Workflow with AI Nodes

Change Trends in AI Applications

Create an AI-native Application by Using an Event-driven Architecture

Challenges in Creating an AI-native Application by Using a Traditional Sequential Architecture

Service calls in sequence cannot guarantee inference experience

Service calls in sequence cannot meet the requirements for real-time data construction

Service calls in sequence do not support two-way communications

Practices for Creating AI-native Applications by Using Event-driven Architectures

Asynchronous inference architecture for Stable Diffusion

Voice Agent application that supports real-time conversation

Real-time data streams in a knowledge base of a Voice Agent application

Event-driven Architecture for AI-native Applications

Read previous post:

Read next post:

Alibaba Cloud Native

You may also like

Comments

5113120674444587 December 11, 2024 at 9:27 am

5158455706477346 December 12, 2024 at 7:28 am

5158455706477346 December 12, 2024 at 8:53 am

5158455706477346 December 13, 2024 at 2:14 am

5158455706477346 December 13, 2024 at 2:14 am

Alibaba Cloud Native

Related Products

E-Commerce Solution

Architecture and Structure Design

Cloud-Native Applications Management Solution

AI Acceleration Solution