
Cloud Network Well-architected Design Guidelines: Application delivery network design for ACK

Last Updated: Dec 18, 2025

Overview

Summary

This topic describes how to design a network ingress for Kubernetes clusters, in particular how to design the Service and Ingress components. In network configurations, the efficiency and security of access between services within a Kubernetes cluster and of external access to services in the cluster are highly important. A proper understanding and implementation of the design principles of the Service and Ingress components not only increases application availability, but also greatly improves the overall performance and stability of the system.

Terms

Kubernetes: Kubernetes is an open source container orchestration engine that automates the deployment, scaling, and management of containerized applications. The open source project is hosted by the Cloud Native Computing Foundation (CNCF).

Container Service for Kubernetes (ACK): ACK is one of the first services to participate in the Certified Kubernetes Conformance Program in the world. ACK provides high-performance containerized application management services to allow enterprises to manage the lifecycle of containerized applications and efficiently deploy containerized applications in the cloud.


Service: In Kubernetes, a Service is an abstraction that helps you expose a pod or a group of pods over a network. A Service identifies a group of pods based on label selectors, and manages traffic distribution among pods based on the Service type, such as ClusterIP, NodePort, LoadBalancer, or ExternalName. Services effectively fix the issues caused by direct access to pods and ensure high availability and efficiency of applications. By default, Services use the TCP protocol. You can also use other supported protocols. Services are mainly used to process east-west traffic.
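For illustration, the following is a minimal sketch of a ClusterIP Service that selects pods by label. The name, namespace, labels, and ports are placeholders rather than values from this guide.

  apiVersion: v1
  kind: Service
  metadata:
    name: my-app               # Hypothetical Service name.
    namespace: default
  spec:
    type: ClusterIP            # Default Service type; exposes the Service on a cluster-internal IP address.
    selector:
      app: my-app              # Select pods labeled app=my-app.
    ports:
    - name: http
      port: 80                 # Port exposed by the Service.
      targetPort: 8080         # Port on which the selected pods listen.
      protocol: TCP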

Ingress: Ingresses make HTTP or HTTPS services available by using a protocol-aware configuration mechanism that understands web concepts such as URIs, hostnames, paths, and more. The Ingress concept allows you to map traffic to different backends based on rules that you define by using the Kubernetes API. Ingresses can provide load balancing, SSL termination, and name-based virtual hosting. An Ingress controller is responsible for fulfilling the Ingress, usually with a load balancer, although Ingress controllers may also configure your edge router or additional frontends to help handle the traffic. Ingresses are mainly used to process north-south traffic.
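As a sketch, the following Ingress maps requests for a hostname and path prefix to a backend Service. The hostname, Service name, and ingress class are placeholder assumptions; the class must match the Ingress controller deployed in your cluster.

  apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    name: my-ingress
  spec:
    ingressClassName: nginx          # Assumes an NGINX Ingress controller is installed.
    rules:
    - host: demo.example.com         # Placeholder hostname.
      http:
        paths:
        - path: /api
          pathType: Prefix
          backend:
            service:
              name: my-app           # Placeholder backend Service.
              port:
                number: 80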


Container Network Interface (CNI) plug-ins: CNI plug-ins are responsible for the implementation of the container network. The CNI plug-in that you use determines how IP addresses are allocated to pods, whether an overlay network is used, how traffic is forwarded within the cluster, and how access to pods is managed. Well-known open source CNI plug-ins include Calico, Flannel, and Cilium. ACK provides two CNI plug-ins: Terway and Flannel, which provide different features, as described in the following sections. You can refer to the Comparison between Terway and Flannel topic to learn how to select a plug-in when you create an ACK cluster.

Terway: Terway is a CNI plug-in developed by Alibaba Cloud. Elastic Compute Service (ECS) instances in ACK clusters use elastic network interfaces (ENIs) to enable network communication. Terway assigns ENIs on nodes to pods to establish network connections between pods. Therefore, pods in a cluster that uses the Terway plug-in are connected to the virtual private cloud (VPC) in which the cluster resides. The Terway mode improves communication efficiency because no tunneling technologies, such as Virtual Extensible Local Area Network (VXLAN), are required to encapsulate packets. Terway is suitable for large clusters that have high requirements on network performance and access control.

Flannel: Flannel is an open source CNI plug-in. Flannel uses network virtualization technologies such as VXLAN to build an overlay network for pods. Flannel is easy to configure and use. Compared with Terway, Flannel provides weaker network performance and access control capabilities because packets must be encapsulated and decapsulated in the overlay network. In addition, Flannel does not support large clusters. Flannel is suitable for clusters that contain no more than 1,000 nodes. If you have low requirements on network performance and access control, or you want to quickly create and use clusters, we recommend the Flannel plug-in.

Network Load Balancer (NLB): NLB is a Layer 4 load balancing service intended for the Internet of Everything (IoE) era. NLB offers ultra-high performance and can automatically scale on demand. An NLB instance supports up to 100 million concurrent connections, which is ideal for services that require high concurrency.

Application Load Balancer (ALB): ALB is an Alibaba Cloud service that runs at the application layer and is optimized to balance traffic over HTTP, HTTPS, and Quick UDP Internet Connections (QUIC). ALB is highly elastic and can process large volumes of Layer 7 traffic on demand. ALB supports complex routing. ALB is deeply integrated with other cloud-native services and is designed to serve as a cloud-native Ingress gateway of Alibaba Cloud.

Elastic IP Address (EIP): An EIP is a public IP address that you can purchase and hold as an independent resource. You can associate EIPs with ECS instances in a virtual private cloud (VPC), secondary ENIs, Server Load Balancer (SLB) instances, NAT gateways, and high-availability virtual IP addresses (HAVIPs).

Design principles

When you design Kubernetes Ingresses and Services, follow these design principles to ensure system high availability, scalability, performance, observability, and security: First, choose multi-region or multi-zone deployment to implement high availability, and enable health checks and automatic failover to maintain business continuity. Second, enable dynamic scaling and configure routing policies to optimize scaling and traffic management. Third, enable intelligent traffic management and peak-hour scaling policies to optimize performance, and use a full-dimensional monitoring system and a real-time alerting mechanism to improve system observability. Finally, use Kubernetes YAML files and automated deployment tools to configure custom capabilities, and enable access control to enhance network protection. These design principles help you build a stable, efficient, and secure application delivery network for Kubernetes.

High reliability

  • Multi-region/Multi-zone deployment:

    • Service: Deploy a Service across multiple zones to ensure that the failure of a single zone does not affect the overall availability.

    • Ingress: Configure the Ingress controller to support multi-region or multi-zone deployment so that service continuity and high availability are maintained even if a zone fails.

  • Health checks and failover:

    • Service: Use the built-in health check feature of Kubernetes (liveness and readiness probes) to monitor pod status and automatically remove unhealthy pods, as shown in the sketch after this list.

    • Ingress: Use the health check feature of Ingress controllers to periodically check the health status of backend services and automatically switch traffic from unhealthy nodes to healthy nodes.
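The following is a minimal sketch of the built-in Kubernetes health checks that the list above refers to; the image, health endpoint, and timing values are illustrative placeholders.

  apiVersion: v1
  kind: Pod
  metadata:
    name: web
    labels:
      app: web
  spec:
    containers:
    - name: web
      image: nginx:1.25              # Placeholder image.
      ports:
      - containerPort: 80
      readinessProbe:                # Pods that fail this probe are removed from Service endpoints.
        httpGet:
          path: /healthz             # Hypothetical health endpoint.
          port: 80
        periodSeconds: 10
        failureThreshold: 3
      livenessProbe:                 # Pods that keep failing this probe are restarted.
        httpGet:
          path: /healthz
          port: 80
        initialDelaySeconds: 5
        periodSeconds: 10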

Scalability

  • Fine-grained horizontal scaling:

    • Service: Services support fine-grained horizontal scaling by zone.

    • Ingress: Configure complex routing policies for Ingresses to distribute traffic based on business requirements, such as paths and hostnames.

Performance and elasticity

  • Dynamic traffic management and automatic scaling:

    • Services and Ingresses: Integrate Services and Ingresses with intelligent traffic management policies, such as session management and weighted round-robin, to optimize system performance and improve user experience. In addition, configure automatic scaling policies for Ingress controllers and services to efficiently cope with traffic fluctuations in different time ranges.

  • Performance during peak hours:

    • Services and Ingresses: Integrate Services and Ingresses with the automatic scaling feature to dynamically adjust the capacity based on real-time traffic volume. This ensures service performance during peak hours and reduces costs during off-peak hours. A minimal autoscaling sketch follows this list.
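As an example of such an automatic scaling policy, the following sketch uses the standard Kubernetes Horizontal Pod Autoscaler to scale a backend Deployment on CPU utilization; the Deployment name, replica bounds, and CPU threshold are placeholders.

  apiVersion: autoscaling/v2
  kind: HorizontalPodAutoscaler
  metadata:
    name: my-app-hpa
  spec:
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: my-app                 # Hypothetical Deployment behind the Service.
    minReplicas: 2                 # Keep at least two replicas for availability.
    maxReplicas: 10
    metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # Scale out when average CPU utilization exceeds 70%.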

Observability

  • Metric collection and display:

    • Services and Ingresses: Collect and display performance metrics to provide a comprehensive view of the service status.

  • Anomaly detection and alerting:

    • Services and Ingresses: Configure automatic anomaly detection and alerting so that you can quickly respond to potential issues and minimize service downtime.

Self-service capabilities

  • Native compatibility and automated deployment:

    • Services and Ingresses: Manage Services and Ingresses by using Kubernetes-native YAML definitions. You can use infrastructure as code (IaC) tools such as Terraform and Resource Orchestration Service (ROS) to automate deployment and increase O&M efficiency.

Security

  • Access control and encryption:

    • Services and Ingresses: Use security groups to control access to Services and Ingresses, and enable TLS termination to protect data transmission and prevent man-in-the-middle (MITM) attacks.

Key design

Use NLB as the LoadBalancer Service and ALB as the Ingress.

High reliability

  • Multi-region/Multi-zone deployment:

    • Service: NLB supports disaster recovery at multiple levels. Network traffic is distributed across groups of backend servers to enable disaster recovery. NLB also supports session persistence and multi-zone deployment to ensure service availability.

    • Ingress: An ALB instance can be deployed in at least two zones. Fault isolation can be enabled between zones. If one zone is down, other zones in the region are not affected.

  • Health checks and failover:

    • Service: NLB uses health checks to test the availability of backend servers. After you enable health checks, if a backend server fails health checks, NLB automatically forwards requests that are destined for the backend server to other healthy backend servers. When the backend server is declared healthy again, NLB automatically resumes forwarding requests to the backend server. Health checks are a key measure to ensure high service availability. Health checks improve the overall availability of your business and eliminate single points of failure (SPOFs) caused by an unhealthy server.

    • Ingress: To monitor the availability of ALB backend servers, you can configure health checks for the server groups of the ALB instance. Health checks ensure service availability by detecting unhealthy backend servers at the earliest opportunity. When health checks are enabled, ALB automatically routes requests to healthy backend servers and probes the availability of all backend servers at a specified interval. A backend server must pass health checks a specific number of consecutive times (N times) before the backend server is declared healthy. You can specify N based on your business requirements. This prevents health check errors caused by network jitter. A configuration sketch follows this list.
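As a sketch, health checks for an ALB Ingress can be configured through annotations on the Ingress resource. The annotation names below follow the alb.ingress.kubernetes.io/* convention used by the ACK ALB Ingress controller, but verify them against the current documentation; the probe path, interval, and thresholds are placeholder values.

  apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    name: demo-ingress
    annotations:
      alb.ingress.kubernetes.io/healthcheck-enabled: "true"
      alb.ingress.kubernetes.io/healthcheck-path: "/healthz"   # Hypothetical probe path.
      alb.ingress.kubernetes.io/healthcheck-interval-seconds: "5"
      alb.ingress.kubernetes.io/healthy-threshold-count: "3"   # N consecutive successes before a server is declared healthy.
      alb.ingress.kubernetes.io/unhealthy-threshold-count: "3"
  spec:
    ingressClassName: alb
    rules:
    - http:
        paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: demo-service                               # Placeholder backend Service.
              port:
                number: 80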

Scalability

  • Fine-grained horizontal scaling:

    • Services and Ingresses: NLB and ALB can be manually scaled out. You can enable NLB and ALB to distribute traffic to different zones to improve system reliability and performance.

Performance and elasticity

  • Dynamic traffic management and automatic scaling:

    • Services and Ingresses: NLB and ALB provide domain names and virtual IP addresses (VIPs) and distribute traffic across multiple backend servers. ALB and NLB distribute network traffic across server groups to improve the availability of applications and prevent service interruptions caused by SPOFs. ALB and NLB also support customized multi-zone deployment and elastic scaling across zones to remove resource bottlenecks in individual zones.

Observability

  • Metric collection and display:

    • Services and Ingresses: You can use CloudMonitor to monitor and view the status and metrics of ALB and NLB, which helps you quickly identify errors. You can view the monitoring information about ALB and NLB resources by using the console, API, or SDKs. Both ALB and NLB support Prometheus monitoring.

  • Anomaly detection and alerting:

    • Services and Ingresses: After you activate CloudMonitor, you can configure alert rules for ALB and NLB instances by using the CloudMonitor console, calling API operations, or using SDKs.

Self-service capabilities

  • Native compatibility and automated deployment:

    • Service: You can use IaC tools such as Terraform and ROS to automate NLB deployment. You can also use annotations to configure NLB as a Service.

      apiVersion: v1
      kind: Service
      metadata:
        annotations:
          service.beta.kubernetes.io/alibaba-cloud-loadbalancer-zone-maps: "${zone-A}:${vsw-A},${zone-B}:${vsw-B}" # Zones and vSwitches for the NLB instance. Example: cn-hangzhou-k:vsw-i123456,cn-hangzhou-j:vsw-j654321.
        name: nginx
        namespace: default
      spec:
        externalTrafficPolicy: Local # Preserve client source IP addresses and route traffic only to pods on the receiving node.
        ports:
        - name: tcp
          port: 80
          protocol: TCP
          targetPort: 80
        - name: https
          port: 443
          protocol: TCP
          targetPort: 443
        selector:
          app: nginx # Select the backend pods labeled app=nginx.
        loadBalancerClass: "alibabacloud.com/nlb" # Instruct the cloud controller manager to create an NLB instance instead of a CLB instance.
        type: LoadBalancer
    • Ingress: You can use IaC tools such as Terraform and ROS to automate ALB deployment. The ALB Ingress controller deployed in a Kubernetes cluster monitors changes in AlbConfigs, Ingresses, and Services on the API server and dynamically updates the ALB instance. ALB Ingresses are also compatible with NGINX Ingresses. A configuration sketch follows.
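      The following is a minimal sketch of the resources involved, assuming the ALB Ingress controller is installed in the cluster. The AlbConfig and Service names, vSwitch IDs, and hostname are placeholders; the controller string follows the ACK convention for ALB IngressClasses.

      apiVersion: alibabacloud.com/v1
      kind: AlbConfig
      metadata:
        name: albconfig-demo                       # Hypothetical name; one AlbConfig maps to one ALB instance.
      spec:
        config:
          name: alb-demo
          addressType: Internet                    # Internet-facing ALB instance.
          zoneMappings:                            # Deploy across at least two zones for high availability.
          - vSwitchId: "${vsw-A}"                  # Placeholder vSwitch IDs.
          - vSwitchId: "${vsw-B}"
      ---
      apiVersion: networking.k8s.io/v1
      kind: IngressClass
      metadata:
        name: alb
      spec:
        controller: ingress.k8s.alibabacloud/alb   # ALB Ingress controller.
        parameters:
          apiGroup: alibabacloud.com
          kind: AlbConfig
          name: albconfig-demo                     # Associate this class with the AlbConfig above.
      ---
      apiVersion: networking.k8s.io/v1
      kind: Ingress
      metadata:
        name: demo-ingress
      spec:
        ingressClassName: alb                      # Route this Ingress through the ALB instance.
        rules:
        - host: demo.example.com                   # Placeholder hostname.
          http:
            paths:
            - path: /
              pathType: Prefix
              backend:
                service:
                  name: demo-service               # Placeholder backend Service.
                  port:
                    number: 80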


Security

  • Access control and encryption:

    • Services and Ingresses: ALB and NLB can be added to security groups and support access control based on protocols, ports, and IP addresses.

    • Ingress: ALB supports various protocols, including HTTP, HTTPS, WebSocket, and TLS.

Best practices

Service design


Scenario

Use pods in ACK clusters as web servers: Deploy applications in the pods of ACK clusters and expose Services by using IP addresses and ports.

Select Service types

A LoadBalancer Service is similar to a NodePort Service configured with a load balancer. This type of Service can evenly distribute traffic to multiple pods. A LoadBalancer Service automatically provides a public IP address to expose the backend pods to external access. LoadBalancer Services can process TCP and UDP requests at Layer 4 and manage HTTP and HTTPS requests at Layer 7.

Create or associate with NLB

  • You can select an existing NLB instance or create an NLB instance, and use the NLB instance as a LoadBalancer Service.

  • You can create NLB instances or associate existing ones by using the ACK console or kubectl (YAML manifests), and configure Service association (similar to adding server groups to the NLB instance) and port mappings (similar to adding listeners to the NLB instance).

  • You can add annotations to the YAML file of a Service to configure NLB, as shown in the sketch after this list.
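For example, the following sketch reuses an existing NLB instance by specifying its ID in an annotation. The annotation name follows the ACK cloud controller manager convention but should be verified against the current documentation; the instance ID and selector are placeholders.

  apiVersion: v1
  kind: Service
  metadata:
    name: nginx
    namespace: default
    annotations:
      service.beta.kubernetes.io/alibaba-cloud-loadbalancer-id: "nlb-xxxxxxxx" # Placeholder ID of an existing NLB instance.
  spec:
    type: LoadBalancer
    loadBalancerClass: "alibabacloud.com/nlb"
    externalTrafficPolicy: Local
    selector:
      app: nginx
    ports:
    - name: tcp
      port: 80
      targetPort: 80
      protocol: TCP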

NLB quotas

For more information, see Limits.

Ingress design

ALB Ingresses

Scenario

The ALB Ingress serves as a unified ingress for Services in the ACK cluster. Compared with NGINX Ingresses, ALB Ingresses are fully hosted. You do not need to manually maintain ALB Ingresses. ALB Ingresses can automatically detect changes in Ingress resources in Kubernetes clusters and then distribute traffic to backend Services based on the predefined routing rules. In addition, ALB Ingresses adopt a powerful auto scaling mechanism to automatically adapt to fluctuating traffic to ensure system stability.

How it works

Terms related to ALB Ingresses:

  • ALB Ingress controller: a component that manages Ingress resources. The ALB Ingress controller uses the API server to dynamically obtain changes in Ingress and AlbConfig resources and then updates the ALB instance based on the Ingress routing rules. Unlike the NGINX Ingress controller, the ALB Ingress controller serves as the control plane of the ALB instance: it manages the ALB instance but does not forward traffic. Traffic is distributed by the ALB instance itself.

  • AlbConfig: An AlbConfig is a CustomResourceDefinition (CRD) created by the ALB Ingress controller. The parameters in the AlbConfig define the configuration of the ALB instance. Each AlbConfig corresponds to one ALB instance. The ALB instance serves as an ingress to distribute traffic to backend Services. ALB Ingresses are fully hosted by ALB. Compared with the NGINX Ingress controller, ALB Ingresses are O&M-free and extremely elastic.

  • IngressClass: An IngressClass defines the association between an Ingress and an AlbConfig.

  • Ingress: Ingresses are resource objects that define external traffic routing rules and access control rules. The ALB Ingress controller monitors changes in Ingress resources and updates the ALB instance to distribute traffic.

  • Service: In Kubernetes, pods are ephemeral resources that can be created and destroyed at any time. A Service provides a unified ingress to a group of pods that provide the same functionality. Other applications or Services can communicate with the pods through the virtual IP address and port of the Service without being affected by changes to the pods. For more information about Services, see Service management.

ALB quotas

For more information, see Methods to calculate ALB quotas.

NLB + NGINX Ingress

Scenario

Use the open source NGINX Ingress controller and deploy it in pods of the Kubernetes cluster. NLB serves as the frontend and distributes traffic to the controller pods. In this architecture, NLB is combined with NGINX to build an efficient and reliable ingress system for Layer 7 traffic.
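A minimal sketch of this pattern exposes the NGINX Ingress controller through an NLB-backed LoadBalancer Service. The namespace and selector labels below assume the community ingress-nginx deployment, and the zone map values are placeholders; adjust both to your installation.

  apiVersion: v1
  kind: Service
  metadata:
    name: ingress-nginx-controller
    namespace: ingress-nginx                  # Assumes the community ingress-nginx namespace.
    annotations:
      service.beta.kubernetes.io/alibaba-cloud-loadbalancer-zone-maps: "${zone-A}:${vsw-A},${zone-B}:${vsw-B}"
  spec:
    type: LoadBalancer
    loadBalancerClass: "alibabacloud.com/nlb" # Front the controller with an NLB instance.
    externalTrafficPolicy: Local              # Preserve client source IP addresses.
    selector:                                 # Labels used by the community ingress-nginx chart.
      app.kubernetes.io/name: ingress-nginx
      app.kubernetes.io/component: controller
    ports:
    - name: http
      port: 80
      targetPort: http
      protocol: TCP
    - name: https
      port: 443
      targetPort: https
      protocol: TCP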

NLB quotas

For more information, see Limits.

Comparison between ALB Ingresses and NGINX Ingresses

Positioning

  • NGINX Ingress:

    • A self-managed component that requires self-managed O&M.

    • You can customize NGINX Ingresses based on your business requirements.

  • ALB Ingress:

    • A fully managed component on Alibaba Cloud that supports ultra-high capacity, automatic scaling, high reliability, and automatic O&M.

    • Supports various features and deep integration with multiple cloud services.

    • Can also be used as a LoadBalancer Service, which reduces load balancing and cluster costs because ALB Ingresses do not occupy pods in the cluster.

Performance

  • NGINX Ingress:

    • Manual configuration is required to fine-tune the system and NGINX parameters.

    • The number of replicated pods and resource limits must be configured properly.

  • ALB Ingress:

    • Supports one million queries per second (QPS) per instance.

    • Supports tens of millions of connections per instance.

    • SSL hardware acceleration is enabled by default.

Configuration

  • NGINX Ingress:

    • Process reloading is required when you update certificates. This affects persistent connections.

    • Non-certificate updates are applied as Lua hot updates; updates that are not supported by the Lua plug-in require process reloading.

  • ALB Ingress:

    • Supports dynamic configuration by calling API operations, which greatly improves timeliness.

    • Supports rolling updates without reloading, which ensures lossless data forwarding over persistent connections.

Features

  • NGINX Ingress:

    • Supports the HTTP and HTTPS protocols.

    • Supports routing based on domain names, URLs, or HTTP headers, canary releases, and blue-green deployment.

  • ALB Ingress:

    • Supports the HTTP, HTTPS, QUIC, WebSocket, WebSocket Secure (WSS), and gRPC protocols.

    • Supports various forwarding conditions and actions based on headers, cookies, and weights, as well as route priorities, bidirectional forwarding rules, canary releases, and blue-green deployment.

Security

  • NGINX Ingress:

    • Supports the HTTPS protocol.

    • Supports blacklists and whitelists.

  • ALB Ingress:

    • Supports end-to-end HTTPS, multiple Server Name Indication (SNI) certificates, Rivest-Shamir-Adleman (RSA) and Elliptic-curve cryptography (ECC) certificates, TLS 1.3, and TLS cipher suites.

    • Integrated with DDoS protection by default. WAF protection can be enabled with one click. Access control list (ACL) blacklists and whitelists are supported.

    • The control container is isolated from the forwarding container to prevent security risks caused by shared containers.

O&M

  • NGINX Ingress:

    • Managed and maintained by customers.

    • Manual configuration is required to select instance types and fine-tune parameters.

    • Allows you to configure Horizontal Pod Autoscaler (HPA) to scale resources.

  • ALB Ingress:

    • Fully managed and maintenance-free with a high service-level agreement (SLA).

    • Saves you the need to select instance types and provides an ultrahigh capacity.

    • Supports automatic scaling based on workload fluctuations.

Scenarios

ALB Ingress scenarios

  • Canary releases, blue-green deployment, and A/B testing: When you deploy a new version, you can perform a canary release, blue-green deployment, or A/B testing to reduce risks and ensure user experience. Such strategies accurately migrate traffic from the existing version to the new version based on HTTP headers, cookies, or weights. For example, you can migrate a specific percentage of requests to the new version and observe the performance in the production environment. This ensures smooth experience for most users.

  • IoT: With the development of IoT technologies, a drastically growing number of devices are connected to the IoT, from home automation devices to industrial automation systems. Each application may need to manage millions of HTTP, HTTPS, and HTTP/2 connections. In such cases, the backend architecture must be equipped with an ultra-high capability to process concurrent connections, efficiently manage and maintain a large number of clients, and ensure timely and accurate data transmission for each device.

  • Gaming and finance: For gaming and finance services, even a latency of one millisecond can cause great losses. Therefore, gaming and finance services highly value the response efficiency and stability of API requests. Alibaba Cloud provides hardware acceleration technologies and supports a high uptime guaranteed by the SLA to greatly reduce the response time and improve system reliability and security. Such optimization is business-critical for ensuring user experience and improving security and efficiency of financial transactions.

  • Telecommuting: With the increasing requirements for telecommuting, WebSocket and WSS are becoming more and more important. WebSocket and WSS allow servers to maintain persistent connections to clients, which supports real-time bidirectional communication. To adapt to the changing work environment, systems are designed to support rolling updates that do not require reloading during configuration updates. This ensures seamless switching between persistent connections and continuous data transmission, which provides a smoother user experience.