DeepSeek Brings New Excitement to API Gateways

This article aims to offer a more comprehensive understanding of API gateways by discussing related concepts and how DeepSeek integrates with API gateways.

API gateways are not a new concept, but DeepSeek's trend of popularization has brought fresh excitement to the API gateway. This article aims to offer a more comprehensive understanding of API gateways by discussing related concepts, the evolution and classification of API gateways, core capabilities, and how DeepSeek integrates with API gateways.

Introduction
Related Concepts of API Gateways
Evolution and Classification of API Gateways
Core Capabilities and Application Scenarios of API Gateways
How to Integrate DeepSeek with Alibaba Cloud Native API Gateway

1. Introduction

API gateways serve as core components for managing APIs and play a crucial role in the entire architecture system. They act like an intelligent transportation hub, responsible for coordinating and managing various API requests to ensure safety and stability, enabling efficient and smooth responses. Many rigid demands from large model applications are being met through API gateways, such as:

● Supporting multiple large models in the backend as both a product experience consideration and a stability concern; this has become a standard for large model applications, whether for conversational or code-related applications.

● Whether to have networked search capabilities, as the generation quality of large models varies significantly, the frontend must expose options for networked search.

● Ensuring content output safety and compliance by implementing control before content generation.

● Semantic caching, temporarily storing API response results on a caching server so that when identical requests arrive, the responses can be fetched directly from the cache, reducing the official API call costs.

● Quota and rate limiting for callers, a mechanism for restricting the number of API calls, traffic size, or resource usage for each caller (e.g. users, applications, IP addresses) within a certain time frame.

● Backend protective rate limiting to manage traffic flows to the API, ensuring stable and efficient operations including load balancing, rate limiting, degradation, and circuit breaking capabilities.

2. Related Concepts of API Gateways

2.1 API

API (Application Programming Interface) is a set of specifications and protocols that define how different software applications or components communicate and interact with one another. APIs can be seen as middleware, allowing developers to access and use certain functionalities or data without having to understand the underlying implementation details. For example, Alibaba Cloud APIs provide developers with a series of application interfaces to manage cloud resources, data, and services. API classifications include:

The entry interface for creating APIs in the Alibaba Cloud Native API Gateway Console

HTTP API: Interfaces based on the HTTP protocol, centered around routing. Suitable for scenarios where interfaces do not have a unified standard, such as K8s Ingress, microservice architecture, and AI (SSE) situations, allowing for quick external service exposure.
REST API: RESTful style HTTP interfaces that are resource-oriented, operating through standard HTTP methods (like GET, POST, etc.), with all interfaces adhering to a unified OpenAPI specification suitable for API-first, cross-team collaboration, and refined API governance.
WebSocket Protocol Interfaces: Suitable for bi-directional real-time communication, such as AI, IoT, and instant messaging. WebSockets offer real-time data transfer capabilities and built-in long connection default configurations compared to HTTP APIs.
AI API: A type of API optimized for AI gateways, providing more user-friendly AI gateway configurations and debugging capabilities, and preset functionalities for AI proxies, AI monitoring, consumer authentication, content safety protection, and more.

2.2 API Gateway

An API gateway (APIG) is a piece of middleware that provides API hosting services. It sits between the client and backend services, serving as the sole entry point for client access to backend services. All requests from clients first pass through the API gateway, which then routes them to the backend services. It acts like a gatekeeper, responsible for identity verification, permission checks, flow control, and other actions to ensure the security and stability of API requests.

2.3 Other Concepts

Domain: The address in the browser, the starting point of the client request, such as www.xxx.com
DNS: The Domain Name System, which resolves domain names into corresponding IP addresses, facilitating mutual recognition and communication among computers on the network.
IP Address: Provides a logical address for each device on the Internet (like computers, mobile phones, routers, etc.), ensuring that data packets are accurately sent from source to destination devices.
Routing: Refers to the path selection process for data packets as they move from a source address to a destination address on the network. During routing creation, request paths, methods, parameters, and other rules can be defined to distribute requests to corresponding backend services.
Source and Services: After requests reach the API gateway, dynamic retrieval of backend service lists is needed to continue accessing services. This should support various forms like container services, Nacos, fixed addresses, DNS domains, and serverless computing to add services.
Environment: Defines different states during the API lifecycle management process, allowing APIs to be published to different custom environments (like development, testing, production) for testing and managing APIs at different stages.
API Grouping: A collection of APIs for the same business, which can be seen as a service, like a credit query service. API developers manage all APIs within the group, facilitating unified management and maintenance of similar business APIs.
Consumer: The credentials for clients accessing the API. To enable consumers, consumer authentication must be activated in the corresponding interface/route, creating an authorization relationship between the consumer and the interface/route. Once consumer authentication is turned on, only interfaces/routes authorized by consumers can be accessed with corresponding credentials.

3. Evolution and Classification of API Gateways

API gateways are not independent entities but have evolved alongside the evolution of software architecture. Software architectures have transitioned from monolithic, vertical, SOA, microservices, to cloud-native architectures. With the popularization of large models, the evolution has continued towards AI-native architectures, during which the forms of API gateways have also iterated, exhibiting different forms in various stages of software architecture.

3.1 Traffic Gateway

Responsible for managing and optimizing data traffic to enhance business scalability and high availability. Nginx, as a representative software for traffic gateways, is popular for its efficient performance and flexible configuration. The core purpose of traffic gateways is to resolve load balancing issues among multiple business nodes by intelligently distributing customer requests to different servers, thereby evenly spreading the load, avoiding single points of failure, and ensuring the stability and continuity of services.

3.2 Enterprise Service Bus (ESB) Gateway

A critical integration solution designed for enterprises aimed at standardizing and simplifying communication and message transmission between different systems and services. Following service-oriented architecture (SOA) principles, the ESB achieves rapid deployment and efficient operations of services through centralized management of message routing, transformation, and security.

3.3 Microservices Gateway

Responsible for centrally managing routing rules for microservices, enhancing system security, providing performance monitoring, and simplifying access processes to improve overall system reliability. Microservices gateways can implement load balancing, rate limiting, circuit breaking, and authentication functions, managing and optimizing interactions among different microservices through a unified entry. This not only simplifies the communication complexity between clients and microservices but also provides additional protection for system security. Spring Cloud Gateway is a widely used microservices gateway based on the Spring ecosystem, easy to integrate with Spring Boot projects, and favored by developers for its flexibility, efficiency, and scalability.

3.4 Cloud Native Gateway

An innovative gateway born from the widespread application of K8s, which requires a gateway to forward external requests to internal cluster services due to the natural isolation of networks within K8s clusters. K8s adopts Ingress/Gateway APIs for unified gateway configuration, and it provides elastic scaling to help users solve application capacity scheduling issues. Consequently, users have new demands for gateways, expecting them to possess characteristics of traffic gateways to handle massive requests, while also having features of microservices gateways for service discovery and governance, and to provide elastic scaling capabilities for capacity scheduling issues. Envoy and Higress are examples of typical open-source cloud-native gateways.

3.5 AI Gateway

We believe that the AI gateway is not a new form independent of the cloud-native gateway; it can essentially be regarded as a cloud-native gateway, with the distinction being that it has been specifically expanded to address new needs in AI scenarios. For example, it provides capabilities such as flexible switching between multiple models and retries, content safety and compliance for large models, semantic caching, multi-API Key balancing, token quota management and rate limiting, large model traffic grayscale handling, and cost auditing for calls. In the industry, Higress and Kong have evolved capabilities specifically targeted at AI scenarios on the foundation of cloud-native gateways, while others like Traefik and Cloudflare have also designed products and services for AI gateways. For the core capabilities of AI gateways, please refer to our previous article on the ten essential capabilities that an AI gateway should possess.

4. Core Capabilities of API Gateways

Due to the numerous capabilities provided by API gateways and the various roles involved, we will categorize all capabilities based on the users, including three scenarios: development, supply, and consumption. These correspond to the development teams of API interfaces, the development and operations teams of the API platform, and the external callers of the API platform.

4.1 API Development Scenarios

API First means defining the API specifications first before coding. Unlike not defining the API and coding directly, API First emphasizes designing and developing API interfaces before building applications, treating APIs as core architectural components of the system, and achieving modularity through well-defined interface specifications.

For example, public cloud products offer API calling methods, and WeChat Mini Programs and DingTalk Open Platform provide API interfaces for developers, similar to a modular LEGO system, enabling flexible combinations between services through standard interfaces, enhancing system scalability and maintainability, thereby improving ecological efficiency.

In development scenarios, API gateways can cover the entire lifecycle around APIs, including design, development, testing, publishing, selling, operations monitoring, security management, and sunset of APIs.

4.2 API Supply Scenarios

API supply scenarios refer to the process by which API providers (such as enterprises, platforms, or services) expose data or functionalities to the outside world through standardized interfaces. Its core involves creating, managing, and maintaining APIs to ensure their availability, security, and efficiency.Core capabilities include:

API Security: Protect APIs from various security threats, ensuring that only authorized users and applications can access APIs, while ensuring confidentiality, integrity, and availability of data during transmission and storage.For example, authentication, authorization management, data encryption and decryption, and anti-attack mechanisms.

Gray Release: A strategy for gradually introducing new API versions or features in a production environment, allowing a portion of users or request traffic to be directed to the new version of the API while keeping the rest on the old version, thereby enabling testing and validation of the new API without impacting overall system stability and user experience.

Caching: Refers to temporarily storing API response results in a caching server so that when identical requests arrive again, the response results can be retrieved directly from the cache without re-accessing the backend server, thus improving API response speed and system performance.

4.3 API Consumption Scenarios

API consumption scenarios refer to the process where callers (such as applications or developers) quickly implement functionalities or obtain data by integrating external APIs. The core is to use the capabilities or data provided by the platform to meet business needs.

Call Auditing: The process of comprehensively recording, monitoring, and analyzing API call activities. It records detailed information for each API call, including call time, caller identity, the API interface called, request parameters, response results, response time, and more.
Quota and Rate Limiting: Refers to a mechanism where the API gateway limits the number of API calls, traffic size, or resource usage for each caller (such as users, applications, IP addresses) within a certain time period based on preset rules.
Backend Protective Rate Limiting: Manages and controls the traffic to the API to ensure stable and efficient operation of the API, avoiding system crashes and performance degradation caused by excessive or anomalous traffic. This includes load balancing, rate limiting, degradation, circuit breaking, and other capabilities.

5. How to Integrate DeepSeek with Alibaba Cloud Native API Gateway

5.1 Prerequisites

A Virtual Private Cloud (VPC) has been created with a public NAT gateway bound to a public elastic IP. For specific operations, see Creating and Managing a Virtual Private Cloud [1], Using Public NAT Gateway SNAT Function to Access the Internet [2].
Based on the above VPC, create a cloud-native API gateway instance. For specific operations, refer to Creating Gateway Instance [3].

The following demonstrations provide three scenarios for reference:

Integrated Model Accessing AI Gateway
General Model Accessing AI Gateway
AI Gateway Achieving Multi-Model Proxy

5.2 Scenario 1: Integrated Model Accessing AI Gateway

Some large model providers have already been integrated into the Alibaba Cloud Native API Gateway, allowing these models to be accessed directly by selecting a provider and configuring the API-KEY. These include: Alibaba Cloud Bailian, DeepSeek, OpenAI, Azure, Claude, Dark Side of the Moon, Baichuan Intelligence, Zero One Everything, Zhipu AI, Hunyuan, Jueyue Star, Spark, Doubao (Volcano Engine), MiniMax, and Gemini.

Configuring AI Services

The gateway sends requests through services to create AI services using the following methods:

Log in to the Cloud Native API Gateway Console.
In the left navigation bar, select Instances and choose a region from the top menu bar.
On the instance page, click the name of the target gateway instance.
In the left navigation bar, select Services and click on the Services tab.
Click Create Service, and in the Create Service panel, configure the AI service according to the following information：

Service Source: AI Service.
Large Model Provider: Enter the corresponding model provider.
Service Address: Use the default configuration.
API-KEY: Enter the request credential API-KEY obtained from the model provider.
Example Configuration for Alibaba Cloud Bailian:
- Large Model Provider: Alibaba Cloud Bailian.
- Service Address:https://dashscope.aliyuncs.com/compatible-model/v1
- API-KEY:Enter the API-KEY obtained from Alibaba Cloud Bailian.

Configuring AI API

Return to the Cloud Native API Gateway Console homepage, and in the left navigation bar, select API.
Select the AI API tab and click Create AI API.
In the Create AI API control panel, configure the basic information for the AI API, including:

Domain Name: It is recommended to configure a domain name (there will be rate limiting under the default environment domain).
Associated Instance: Select the created instance.
AI Request Monitoring: Enable.
Service Model: Single Model Service.
Service List: Click Add to add the following services. Select the DeepSeek service configured in the previous step with Alibaba Cloud Bailian, and configure the model name to pass-through.

Debugging AI API

In the completed AI API interface, click Debug.

Specify the model as deepseek-r1 to interact with Alibaba Cloud Bailian's DeepSeek.

5.3 Scenario 2: Accessing AI Gateway via General Model

In this scenario, custom service addresses support the following cases:

For large model providers not integrated into the cloud-native API gateway, and the model supports OpenAI protocols.
For DeepSeek services deployed through Alibaba Cloud PAI or FC, etc.

In this scenario, direct integration can refer to the methods in PAI deployment model access to AI gateway [4].

5.4 Scenario 3: Multi-Model Proxy of AI Gateway

Configuring AI API

The current cloud-native API gateway simultaneously supports integration based on both integrated models and general models, providing multi-model proxy services, and supports Fallback in cases of call exceptions; in such scenarios, users use a unified calling method to invoke different third-party model services concurrently.

Based on scenarios 1 and 2, complete the configurations for three gateway AI services: Alibaba Cloud Bailian, Volcano Engine, and PAI. The service configuration for Volcano Engine can be referenced below.
When creating an AI API (or entering the editing state), configure the large model service as follows:

Service Model: Multi-Model Service (by Model Name)
Service List: Click Add to add the following multiple services.
- Select the PAI DeepSeek service configured in the previous step, with the model name matching rule configured as DeepSeek-*
- Select the Volcano Engine DeepSeek service configured in the previous step, with the model name matching rule configured as ep-*
Fallback: Enable
Fallback List: Click Add to add the following services.
- Select the DeepSeek service configured in the previous step with Alibaba Cloud Bailian, with the model name configured as deepseek-r1.

The configuration shown below will execute according to the following rules:

Call PAI DeepSeek when the model is DeepSeek-*
Call Volcano Engine DeepSeek when the model is ep-*
Call Alibaba Cloud DeepSeek in cases of error or rate limiting (if multiple Fallbacks are configured, calls will be made in order).

Debugging AI API

In the completed AI API interface, click Debug.

When specifying the model name as ep-20250219155230-28l6f or DeepSeek-R1-Distill-Qwen-1.5B, responses will be given from Volcano Engine or PAI as per the rules.

When an incorrect name is entered, and there is no corresponding DeepSeek model, Fallback will be triggered, calling Alibaba Cloud DeepSeek-R1:

In the future, we will compile experiences on the practices of various industry clients using DeepSeek + API gateways to build internal and external corporate services, and organize them into articles to be published on this public account. Everyone is welcome to subscribe and follow.

Reference Links:

[1] https://www.alibabacloud.com/help/en/vpc/user-guide/create-and-manage-a-vpc
[2] https://www.alibabacloud.com/help/en/vpc/user-guide/use-the-snat-feature-of-an-internet-nat-gateway-to-access-the-internet
[3] https://www.alibabacloud.com/help/en/api-gateway/cloud-native-api-gateway/user-guide/create-gateway
[4] https://www.alibabacloud.com/help/en/api-gateway/cloud-native-api-gateway/use-cases/pai-deployment-model-access-ai-gateway

Community

DeepSeek Brings New Excitement to API Gateways

Table of Contents

1. Introduction

2. Related Concepts of API Gateways

2.1 API

2.2 API Gateway

2.3 Other Concepts

3. Evolution and Classification of API Gateways

3.1 Traffic Gateway

3.2 Enterprise Service Bus (ESB) Gateway

3.3 Microservices Gateway

3.4 Cloud Native Gateway

3.5 AI Gateway

4. Core Capabilities of API Gateways

4.1 API Development Scenarios

4.2 API Supply Scenarios

4.3 API Consumption Scenarios

5. How to Integrate DeepSeek with Alibaba Cloud Native API Gateway

5.1 Prerequisites

5.2 Scenario 1: Integrated Model Accessing AI Gateway

Configuring AI Services

Configuring AI API

Debugging AI API

5.3 Scenario 2: Accessing AI Gateway via General Model

5.4 Scenario 3: Multi-Model Proxy of AI Gateway

Configuring AI API

Debugging AI API

Reference Links:

Read previous post:

Read next post:

Alibaba Cloud Native Community

You may also like

Comments

Alibaba Cloud Native Community

Related Products

API Gateway

AgentBay

Microservices Engine (MSE)

AI Acceleration Solution