A deep discussion of the relationship between Service Mesh and API Gateway - Alibaba Cloud Developer Community


The relationship between Service Mesh and API Gateway has been asked about frequently over the past two years, and many articles and materials in the community have offered answers. Well-known practitioners such as Christian Posta have written in-depth introductions. Here I summarize these materials and add some opinions based on my own understanding. At the end of the article, I also introduce some of Ant Financial's latest practices and explorations in integrating Service Mesh and API Gateway, hoping to give everyone a more concrete understanding.

Note 1: To save space, we go straight to the topic and assume that readers have a basic understanding of Service Mesh and API Gateway.

Note 2: This article focuses on tracing the overall context and does not go into great detail, especially on points already elaborated in other articles. If you want more detail after reading, please continue with the references and recommended reading at the end of the article.

Originally clear boundaries: positioning and responsibilities

First of all, there is a very clear boundary between Service Mesh and API Gateway in functional positioning and responsibilities:

As shown in the preceding figure:

In terms of functions and responsibilities:

  • At the bottom layer, atomic microservices, produced by splitting the system, provide various capabilities in the form of services;
  • Above the atomic microservices there is an optional layer of composite services: in some scenarios, the capabilities of several microservices need to be combined to form a new service;
  • Atomic microservices and composite services are deployed inside the system; when Service Mesh is adopted, the Service Mesh provides the inter-service communication capability;
  • API Gateway exposes the services inside the system to external systems in the form of APIs and accepts external requests.

In terms of deployment:

  • Service Mesh is deployed inside the system, because atomic microservices and composite services are usually not exposed directly to external systems;
  • API Gateway is deployed at the edge of the system: on one side it faces outward and provides APIs for external systems to access; on the other side it faces inward and accesses the various internal services.

Here, two widely used terms are introduced:

  • East-west communication: mutual access between services. The traffic flows between services and stays inside the system;
  • North-south communication: services providing access to the outside, usually through APIs exposed by the API Gateway. The traffic flows from outside the system to inside it.

To explain the origin of "east, west, north, south": as shown in the figure above, diagrams habitually follow the map convention of "north at the top, south at the bottom, west on the left, and east on the right".

Summary: Service Mesh and API Gateway have a clear division of labor and clear boundaries in terms of functions and responsibilities. If that were the whole story, however, there would be no discussion about their relationship, and naturally no need for this article.

What is the root cause of the problem?

Highly recommended reading: Christian Posta's article "Do I Need an API Gateway if I Use a Service Mesh?", listed in the appendix, analyzes and explains this in depth.

Philosophical question: does a gateway access internal services in the east-west direction or the north-south direction?

As shown in the following figure, yellow lines indicate the API Gateway's access to internal services:

Here comes the problem. From the perspective of traffic flow, this is external traffic that has entered the system and is now accessing the externally exposed services, so it should count as "north-south" communication, as shown in the preceding figure. From another perspective, however, if we logically split the API Gateway into two parts, set aside the externally exposed part, and look only at the part through which the API Gateway accesses internal services, then the API Gateway can be regarded as an ordinary client service, and its communication with internal services looks more like "east-west" communication:

Therefore, whether the API Gateway's access to internal services as a client is north-south or east-west becomes a philosophical problem: it depends entirely on whether we view the API Gateway as a single whole or as logically split into an external part and an internal part.
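To make the two viewpoints concrete, here is a minimal sketch. The `Endpoint` type, the `direction` function, and the service names are hypothetical and exist only for illustration; the rule "both ends internal means east-west" is my simplification of the definitions given earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    name: str
    inside_system: bool  # True if we count this endpoint as internal


def direction(src: Endpoint, dst: Endpoint) -> str:
    """Label a call east-west if both ends are internal, else north-south."""
    if src.inside_system and dst.inside_system:
        return "east-west"
    return "north-south"


order_service = Endpoint("order-service", inside_system=True)

# View 1: the gateway is one unit sitting on the system edge.
gateway_as_whole = Endpoint("api-gateway", inside_system=False)

# View 2: only the half that calls internal services, seen as an ordinary client.
gateway_client_half = Endpoint("api-gateway (client half)", inside_system=True)

print(direction(gateway_as_whole, order_service))     # north-south
print(direction(gateway_client_half, order_service))  # east-west
```

The call itself is identical in both cases; only the label we attach to the gateway changes the answer.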

This philosophical problem is not meaningless. Among API Gateway products, there are usually two schools of thought on how to implement "API Gateway accessing internal services as a client":

  1. Distinct: the API Gateway and internal services are regarded as two independent things. The API Gateway implements its own mechanism for accessing internal services, independent of the communication mechanism between services;
  2. Compatible: the API Gateway is regarded as an ordinary client of internal services and reuses the inter-service communication mechanism.

The final decision is usually tied to product positioning. If the API Gateway is meant to remain an independent product that works with different inter-service communication schemes, the former is usually chosen; a typical example is Kong. If it is closely tied to a particular inter-service communication scheme, the latter is usually chosen; typical examples are Zuul and Spring Cloud Gateway in the Spring Cloud ecosystem.

However, whichever school is chosen, it cannot change one fact: when the API Gateway accesses internal services as a client, it is essentially no different from an ordinary internal service accessing other services as a client. Service discovery, load balancing, traffic routing, circuit breaking, rate limiting, service degradation, fault injection, logging, monitoring, distributed tracing, access control, encryption, and identity authentication: when we list the functions the gateway needs in order to access internal services, we find that almost all of them are the same as those used for calls between services.

This leads to a common phenomenon: if a mature inter-service communication framework already exists, it is natural to reuse these overlapping capabilities when implementing an API Gateway. For example, Zuul and the later Spring Cloud Gateway, both in the Spring Cloud ecosystem mentioned above, reuse these capabilities by reusing class libraries.
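The class library style of reuse can be sketched roughly as follows. This is a toy illustration, not how Zuul or Spring Cloud Gateway are actually implemented: `ServiceClient`, `OrderService`, `ApiGateway`, the registry contents, and the route table are all hypothetical, and only two of the shared capabilities (service discovery and round-robin load balancing) are modeled.

```python
import itertools


class ServiceClient:
    """Hypothetical shared client library: discovery plus round-robin balancing."""

    def __init__(self, registry):
        self._registry = registry  # service name -> list of instance addresses
        self._cursors = {}

    def pick_instance(self, service):
        if service not in self._cursors:
            # Service discovery: look up the instances registered for the service.
            self._cursors[service] = itertools.cycle(self._registry[service])
        # Load balancing: rotate through the instances.
        return next(self._cursors[service])

    def call(self, service, path):
        # A real client would send a request; here we only return the target.
        return f"{self.pick_instance(service)}{path}"


class OrderService:
    """An ordinary internal service that calls another service."""

    def __init__(self, client):
        self._client = client

    def check_stock(self):
        return self._client.call("inventory", "/stock")


class ApiGateway:
    """A gateway that reuses the very same client library for its internal calls."""

    def __init__(self, client, routes):
        self._client = client
        self._routes = routes  # external path -> (service, internal path)

    def handle(self, external_path):
        service, internal_path = self._routes[external_path]
        return self._client.call(service, internal_path)


registry = {"inventory": ["10.0.0.1:8080", "10.0.0.2:8080"]}
gateway = ApiGateway(ServiceClient(registry), {"/api/stock": ("inventory", "/stock")})
service = OrderService(ServiceClient(registry))
```

Both `OrderService` and `ApiGateway` reach the inventory service through the same `ServiceClient.call` path; the gateway only adds the mapping from external API paths to internal services.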

Here a similar philosophical problem arises: when the API Gateway, accessing internal services as a client, achieves code-level capability reuse by reusing class libraries, it is effectively embedding a client identical to the one used for ordinary inter-service communication. Is the traffic sent by this client east-west or north-south?

The answer is not important.

Sidecar: The real overlap point

At the beginning of the Service Mesh era, the relationship between Service Mesh and API Gateway looked like this:

  1. The division of functions and responsibilities is clear;
  2. The functions for a client accessing services overlap heavily.

At this stage the relationship between the two was clear, and because Service Mesh and API Gateway were still different products, the overlap between them was only in functionality.

Over time, as Service Mesh products and API Gateway products began to permeate each other, the relationship between the two became ambiguous.

After Service Mesh emerged, the question of how to choose a suitable API Gateway solution for Service Mesh-based services was gradually put on the agenda. Reusing the capabilities of Service Mesh naturally became a direction of exploration, and new API Gateway products gradually emerged around a very direct idea:

How can the east-west and north-south communication solutions be unified?

One approach is to implement the API Gateway on top of the Service Mesh Sidecar, thereby bringing the Service Mesh's east-west communication solution into north-south communication. I will not go into details here; instead I quote a picture (thanks to Zhao Huabing) to explain the idea of this solution:

At this point, the relationship between Service Mesh and API Gateway becomes interesting. Because the Service Mesh Sidecar is introduced, a new answer to the earlier "philosophical problem" appears: this time the API Gateway can be split into two independently deployed physical entities rather than two logical parts:

  1. API Gateway body: implements the API Gateway functions other than accessing internal services;
  2. Sidecar: following standard Service Mesh practice, the API Gateway is treated as an ordinary service deployed in the Service Mesh, and a Sidecar is deployed 1:1 alongside it.

In this solution, the Service Mesh Sidecar is used by the API Gateway and replaces the client-side functions that the API Gateway previously implemented for accessing internal services. This greatly simplifies the implementation of the API Gateway and unifies east-west and north-south communication capabilities, while the API Gateway can focus on its core "API management" functions.
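The split into a gateway body and a Sidecar can be sketched as below. This is a simplified model under my own assumptions, not the design of SOFA Gateway or any other product: the class names, the API key check, and the route table are hypothetical, and the two "processes" are plain objects that record which component the request passes through.

```python
class Sidecar:
    """Toy stand-in for a mesh Sidecar: owns discovery and load balancing."""

    def __init__(self, registry):
        self._registry = registry  # service name -> list of instance addresses
        self._next_index = {}

    def forward(self, service, path, trace):
        trace.append("gateway-sidecar")
        instances = self._registry[service]
        index = self._next_index.get(service, 0)
        self._next_index[service] = (index + 1) % len(instances)
        trace.append(f"{instances[index]}{path}")
        return trace


class GatewayBody:
    """Toy gateway body: keeps only API-management concerns."""

    def __init__(self, sidecar, api_keys, routes):
        self._sidecar = sidecar
        self._api_keys = api_keys
        self._routes = routes  # external path -> (service, internal path)

    def handle(self, api_key, external_path):
        trace = ["gateway-body"]
        if api_key not in self._api_keys:
            trace.append("401 Unauthorized")
            return trace
        route = self._routes.get(external_path)
        if route is None:
            trace.append("404 Not Found")
            return trace
        service, internal_path = route
        # Everything about reaching the internal service is delegated.
        return self._sidecar.forward(service, internal_path, trace)


split_gateway = GatewayBody(
    Sidecar({"inventory": ["10.0.0.1:8080"]}),
    api_keys={"demo-key"},
    routes={"/api/stock": ("inventory", "/stock")},
)
print(split_gateway.handle("demo-key", "/api/stock"))
```

The gateway body contains no discovery or load-balancing code at all, and the returned trace shows the request crossing two components before it reaches the service.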

At this point, the relationship between Service Mesh and API Gateway has changed from "distinct" to "compatible".

Companies adopting this solution usually have a Service Mesh product first and then plan (or re-plan) their API Gateway product on top of it. For example, Ant Financial's SOFA Gateway is based on MOSN, while the community's open-source products Ambassador and Gloo are based on Envoy.

The advantages of this solution are that the API Gateway and the Sidecar are deployed independently, with clear responsibilities and a clear architecture. However, just as Service Mesh is questioned for the performance overhead of the extra hop introduced by the Sidecar, an API Gateway that uses a Sidecar faces the same question: one more hop...

The way to solve the "one more hop" problem is simple and blunt: add API Gateway functions to the Sidecar, so that the API Gateway body and the Sidecar are merged into one again:

As for the relationship between Service Mesh and API Gateway after this step: is this a Service Mesh Sidecar with an API Gateway integrated into it, or an API Gateway with a Service Mesh Sidecar integrated into it? The question is like asking whether a zebra is white with black stripes or black with white stripes. It is a matter of opinion.

BFF: integrate to the end

The introduction of BFF (Backend For Frontend) brings Service Mesh and API Gateway even closer.

Let's take a look at the conventional BFF:

Here, an additional BFF layer is added between the API Gateway and internal services (including composite services and atomic microservices). Note that BFF works in a way similar to composite services: both combine multiple services. The differences are:

  1. Composite services still belong to the category of services. Their implementation combines multiple services, but what they expose is still a complete, well-defined service;
  2. BFF is different. As its name, Backend For Frontend, indicates, it exists entirely for the frontend, and one of its core goals is to simplify frontend access;
  3. For today's topic, the most important point is that BFF funnels all traffic entering from outside, whereas composite services do not: with composite services alone, the API Gateway can still access atomic microservices directly.

The fact that BFF funnels all external traffic opens up interesting possibilities once the API Gateway and the Sidecar are integrated. Let's first look at the scenario that results when the integration approach described earlier, merging the API Gateway and the Sidecar, is applied with BFF present:

Zoom in a bit and look at the API Gateway and BFF on their own:

Note that there are two Sidecars on the request path from the point where traffic is received by the API Gateway to the point where it enters BFF:

  1. The Sidecar deployed with BFF: an ordinary Sidecar without API Gateway functions;
  2. The merged API Gateway and Sidecar: a "big API Gateway with Sidecar functions" (or a "special Sidecar with API Gateway functions"). Although it plays the API Gateway role, it essentially still contains a full-featured Sidecar, equivalent to the Sidecar that comes with BFF.

So the question arises: why keep two Sidecars on this path instead of reducing them to one? We can try to combine the two: remove BFF's own ordinary Sidecar and let the API Gateway that doubles as a Sidecar serve BFF directly:

The scenario is as follows:

  1. Traffic goes directly to BFF (other network components may sit in front of BFF to provide load balancing);
  2. BFF's Sidecar receives the traffic, performs the API Gateway functions, and then passes the traffic to BFF;
  3. BFF calls internal services through the Sidecar (the same as before the merge).
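The three steps above can be sketched as a single Sidecar object with an inbound and an outbound side. This is only an illustration under my own assumptions: `BffSidecar`, `home_page_bff`, the API key check standing in for "API Gateway functions", and the service names are all hypothetical, and the outbound side simply picks the first registered instance.

```python
class BffSidecar:
    """Toy Sidecar deployed next to one BFF instance, with gateway functions merged in."""

    def __init__(self, api_keys, registry, bff_handler):
        self._api_keys = api_keys
        self._registry = registry      # service name -> list of instance addresses
        self._bff_handler = bff_handler

    def inbound(self, api_key, path):
        # Step 2: perform the API Gateway function, then pass the traffic to BFF.
        if api_key not in self._api_keys:
            return "401 Unauthorized"
        return self._bff_handler(path, self.outbound)

    def outbound(self, service, path):
        # Step 3: BFF reaches internal services through the same Sidecar.
        instance = self._registry[service][0]
        return f"{instance}{path}"


def home_page_bff(path, call):
    """Toy BFF: aggregates two internal services into one frontend response."""
    if path == "/home":
        return {
            "user": call("user", "/profile"),
            "orders": call("order", "/recent"),
        }
    return "404 Not Found"


bff_sidecar = BffSidecar(
    api_keys={"demo-key"},
    registry={"user": ["10.0.0.1:8080"], "order": ["10.0.0.2:8080"]},
    bff_handler=home_page_bff,
)
# Step 1: external traffic arrives directly at the BFF instance's Sidecar.
print(bff_sidecar.inbound("demo-key", "/home"))
```

There is no separate gateway object anywhere in this sketch: the gateway check lives in `inbound`, and each BFF instance would carry its own copy of it.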

Note the key point emphasized earlier: BFF must receive all external traffic. This is a prerequisite, because the original API Gateway cluster no longer exists: if some traffic does not go through BFF, that traffic gets no API Gateway processing at all. Of course, if you are willing to accept some extra trouble, it is also feasible to identify explicitly at deployment time which services need to be exposed externally and deploy the Sidecar with API Gateway functions directly on those services, but this is more complicated to manage than the BFF mode.

In addition, from a deployment perspective, the API Gateway "disappears" in this solution: there is no longer an explicitly deployed API Gateway cluster. The conventional centralized gateway is folded into each BFF instance, which yields another important property: decentralization.

The integration of Service Mesh and API Gateway described above does not exist only on paper.

Within Ant Financial, we have carried out pioneering practices and explorations based on the integration and decentralization of Service Mesh and API Gateway. Take the Alipay mobile gateway as an example. Over the past ten years, the gateway has evolved from a monolith to microservices, from centralized to decentralized, and from a shared gateway .jar package to a Mesh/Sidecar form of the gateway implemented with MOSN:

I strongly recommend reading "Thinking and practice of Ant Financial API Gateway Mesh" by my colleague Jia Dao, listed in the appendix.


This article summarizes the relationship between Service Mesh and API Gateway. On the whole, the positioning and responsibilities of the two are "distinct". In terms of implementation, however, a trend toward integration has begun to appear: the earlier, traditional method was code reuse at the class library level, and the latest trend is to merge the API Gateway and the Sidecar into one.

The latter direction has only just started, and everyone, Ant Financial included, has only begun to explore it. However, we believe that more products of this kind may appear in the community in the next one or two years.

Finally, a few words about MOSN, which has been mentioned many times in this article:

MOSN is short for Modular Open Smart Network. It is a network proxy written in Go, open-sourced by Ant Financial, and verified in production across hundreds of thousands of containers. As a cloud-native network data plane, MOSN aims to provide services with multi-protocol, modular, intelligent, and secure proxy capabilities. MOSN can be integrated with any Service Mesh that supports the xDS API, and can also be used as a standalone Layer 4 or Layer 7 load balancer, API Gateway, cloud-native Ingress, and so on.

  • GitHub:
  • official website:

Appendix: references and recommended reading

If you still want more, please continue with the following.

Sorted by publication time:

  • The Difference Between API Gateways and Service Mesh: 2020-02, by Marco Palladino of Kong. Guides architects in deciding when to use an API Gateway and when to use a Service Mesh.
  • Do I Need an API Gateway if I Use a Service Mesh?: 2020-01, by Christian Posta; Chinese translation by Ma Ruofei. Compares Service Mesh technology with API Gateway, with an emphasis on where their functions overlap and diverge, and provides guidance for technology selection and implementation.
  • Thinking and practice of Ant Financial API Gateway Mesh: 2019-12, by Jia Dao. Introduces the evolution of the Ant Financial Alipay gateway and API Gateway Mesh. Strongly recommended: it clearly describes Ant Financial's practice in integrating Service Mesh and API Gateway.
  • The identity crisis of API Gateway: 2019-05, by Christian Posta, translated by Zhou Yuqing. Covers the basic concepts of API Gateway, such as the definition of an API, the meaning of API management, API Gateway patterns, and the relationship between Service Mesh and API Gateway.
  • Long journey: Ant Financial Service Mesh practice exploration: 2018-10, my talk at QCon. I shared Ant Financial's exploration of the scope of inter-service communication at the time and proposed bringing the Service Mesh's east-west communication capability into north-south communication. The Sidecar-based SOFA Gateway product had just begun development then.
  • API Gateway vs Service Mesh: 2018-09, by Richard Li, CEO of Datawire, which develops the Ambassador API Gateway, an open-source API Gateway based on Envoy. Describes the author's views on Service Mesh and API Gateway, their differences, and their integration.
  • DreamMesh valuable (9)-API Gateway: 2018-03, also written by me. After in-depth discussions with some friends in the Service Mesh community in early 2018, I recorded the ideas from that time in the DreamMesh series of blog posts, including a detailed discussion of the API Gateway and the Sidecar. The ideas were not yet mature, but the general direction had already taken shape. Thanks to everyone who took part in the discussions!
  • Service Mesh vs API Gateway: 2017-10, by Kasun Indrasiri; Chinese version translated by Zhao Huabing. A short article that mainly compares the product functions of Service Mesh and API Gateway and proposes a way to combine the two: calling downstream services through the Service Mesh from the API Gateway.
  • Application Network Functions With ESBs, API Management, and Now.. Service Mesh?: 2017-08, by Christian Posta. Describes the relationship between Service Mesh and ESBs, message brokers, and API management. The content is very good and strongly recommended (one complaint: the figures are hard on the eyes).