Introduction: With the rapid development and extensive application of container technologies, cloud-native technologies are becoming the future of IT development. As the first Chinese company that deployed container technologies, Alibaba Cloud has made great achievements in both technologies and products. Yili, a senior technical expert at Alibaba Cloud, shares the best practices of container technology implementation through Alibaba Cloud Container Service. This article is meant to help you better understand container technologies and cloud-native concepts, properly design cloud architecture, and make full use of the cloud.
As The Economist magazine put it, "Without the container, there would be no globalization."
Economic globalization is based on the modern transportation system and at its core are containers. The emergence of the shipping container enabled the standardization and automation of logistics, greatly reducing transportation costs and making it possible to integrate global supply chains. Therefore, without containers, there would be no globalization.
The standardization and modularization concepts of containers are also driving supply chain reform in the construction industry. After the outbreak of the coronavirus (COVID-19) pandemic, Huoshenshan Hospital, a specialized hospital that can accommodate thousands of beds, was built within 10 days in Wuhan, China. It played an important role in the fight against the pandemic, especially in the early days. The whole hospital was assembled from prefabricated container houses. The modular rooms were pre-equipped with air conditioners, disinfection stations, water supplies, and drainage, which greatly accelerated construction.
The software container technology is also reshaping the entire software supply chain. As a lightweight virtualization technology for operating systems, containers are different from traditional physical machines and virtualization technologies. Think about it like this:
A traditional physical machine is like a single-family detached home.
Virtual machines are like townhouses.
A container is like a container house.
In the past few years, container technologies have been widely used in the IT industry. Their most important values are:
Speed matters a lot. In the era of digital transformation, each enterprise is facing the impact of emerging business modes and numerous uncertainties. An enterprise's continuous innovation ability, rather than its current large scale or past successful strategies, determines its success in the future. Container technologies improve the IT architecture agility of an enterprise, thereby enhancing its business agility and accelerating its business innovation. For example, during the COVID-19 pandemic, online businesses in the education, video, and public health industries experienced explosive growth. Container technologies help seize opportunities for rapid business growth. According to industrial statistics, container technologies increase delivery efficiency 3 to 10 times over, which allows enterprises to carry out fast iteration and low-cost trial and error.
In the Internet age, enterprise IT systems often encounter both predictable and unexpected traffic growth, such as e-commerce promotions and emergencies. Container technologies can give full play to the elasticity of cloud computing and reduce the computing cost by increasing deployment density and elasticity. For example, after the exponential growth of online traffic during the COVID-19 pandemic, container technologies can be used to alleviate the expansion pressure for online education, supporting online teaching for hundreds of thousands of teachers and online learning for millions of students.
Container technologies have promoted the standardization of cloud computing. Containers have become the standard for application distribution and delivery and can decouple applications from the underlying runtime environment. Kubernetes has become the standard for resource scheduling and orchestration. It shields differences between underlying architectures and allows applications to run smoothly on different infrastructures. The Cloud Native Computing Foundation (CNCF) provides the Certified Kubernetes Conformance Program to ensure compatibility across different Kubernetes implementations. With container technologies, it becomes easier to build application infrastructures in the age of the cloud.
Kubernetes: Infrastructure in the Cloud-Native Era
Kubernetes has become a cloud application operating system. More and more applications are running on Kubernetes, such as stateless web applications, transactional applications (databases and message-oriented middleware), and data-based intelligent applications. The Alibaba economy has also carried out a comprehensive cloud-native migration to the cloud based on container technologies.
Alibaba Cloud Container Service products provide an enterprise container platform across Alibaba Cloud, edge computing, and Apsara Stack environments. The core of Alibaba Cloud Container Service products is Alibaba Cloud Container Service for Kubernetes (ACK) and Serverless Kubernetes (ASK). They are built on a series of Alibaba Cloud infrastructure capabilities, such as computing, storage, networking, and security. In addition, they provide standardized APIs, optimized capabilities, and an enhanced user experience. ACK is certified by the Certified Kubernetes Conformance Program and provides a series of core capabilities required by enterprises, such as security governance, end-to-end observability, multi-cloud, and hybrid cloud.
Alibaba Cloud Container Registry (ACR) is the core of asset management for enterprise cloud-native applications. It can manage application assets, such as Docker images and Helm charts, and can be integrated with continuous integration and continuous delivery (CI/CD) tools for a complete DevSecOps process.
Alibaba Cloud Service Mesh (ASM) is a platform for fully managing the traffic of microservice-oriented applications. It is compatible with Istio, supports unified traffic management of multiple Kubernetes clusters, and provides consistent communication, security, and observability for application services in containers and virtual machines.
This section describes the topology of a managed Kubernetes cluster. The Kubernetes cluster managed by ACK is based on the Kubernetes architecture. Master nodes of the Kubernetes cluster run on the control plane (a Kubernetes cluster) of the Virtual Private Cloud (VPC) network.
ACK adopts a high-availability architecture by default, in which three etcd replicas run in three different zones. Following scalability best practices, two etcd clusters are provided: one stores configuration information and the other stores system events. This improves the availability and scalability of etcd. Master components of the Kubernetes cluster, such as the API Server and Scheduler, are deployed with multiple replicas and run in two different zones. Master nodes can be elastically expanded based on the workload, and worker nodes access the API Server through the Server Load Balancer (SLB). This design ensures that the Kubernetes cluster runs properly even if a zone becomes faulty.
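To see why three etcd replicas across three zones can survive the loss of one zone, it helps to work through the majority arithmetic that etcd's consensus relies on. The helper names below are illustrative and are not part of ACK or etcd; this is a minimal sketch that assumes one replica per zone.

```python
def quorum(replicas: int) -> int:
    """Smallest majority of a replica set."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can be lost while a majority still remains."""
    return replicas - quorum(replicas)


if __name__ == "__main__":
    # Assumption: one etcd replica per zone, so losing a zone loses one replica.
    for n in (1, 3, 5):
        print(f"replicas={n} quorum={quorum(n)} tolerated={tolerated_failures(n)}")
```

With three replicas the majority is two, so the cluster keeps accepting writes after a single zone failure.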
Worker nodes run on the VPC network. You can run the nodes in different zones and use the zone-based anti-affinity feature of the application to ensure the high availability of the application.
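As a rough illustration of zone-based anti-affinity, the sketch below builds the `affinity` fragment of a pod spec as a plain Python dictionary. The function name and the `app` label are assumptions made for the example. The field names and the `topology.kubernetes.io/zone` key follow the standard Kubernetes API.

```python
import json


def zone_anti_affinity(app_label: str) -> dict:
    """Build a pod-spec `affinity` fragment that prefers to place
    replicas of the same app in different zones."""
    return {
        "podAntiAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": 100,  # highest preference weight allowed (1-100)
                    "podAffinityTerm": {
                        "labelSelector": {"matchLabels": {"app": app_label}},
                        # Standard well-known node label that carries the zone.
                        "topologyKey": "topology.kubernetes.io/zone",
                    },
                }
            ]
        }
    }


if __name__ == "__main__":
    # "web" is a hypothetical application label.
    print(json.dumps(zone_anti_affinity("web"), indent=2))
```

The "preferred" form is used here so that pods can still be scheduled when fewer zones than replicas are available. A hard requirement would use `requiredDuringSchedulingIgnoredDuringExecution` instead.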
Elasticity is a core capability of the cloud. Only the robust elastic computing power provided by the cloud can support the typical traffic pulse scenarios, such as the Double 11 Global Shopping Festival and the rapid growth of traffic for online education and collaborative office work after the COVID-19 pandemic. Kubernetes can maximize the elasticity of the cloud.
ACK provides various elasticity policies at the resource layer and application layer. The current mainstream solution at the resource layer is to scale nodes in or out by using cluster-autoscaler (CA). When a pod fails to be scheduled due to insufficient resources, CA automatically creates nodes in the node pool based on the application workload.
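The scale-out decision can be pictured as a packing problem: how many new nodes are needed so that every pending pod fits? The sketch below is a deliberately simplified model that considers CPU requests only and uses first-fit-decreasing packing. It is not how cluster-autoscaler is implemented, and the function name and units are assumptions made for illustration.

```python
def nodes_needed(pending_cpu_requests, node_cpu):
    """Estimate how many new nodes are needed to fit pending pods.

    pending_cpu_requests: CPU requests of unschedulable pods, in millicores.
    node_cpu: allocatable CPU of one node in the node pool, in millicores.
    """
    free = []  # remaining CPU on each node we plan to add
    for request in sorted(pending_cpu_requests, reverse=True):
        if request > node_cpu:
            raise ValueError("pod does not fit on this node type")
        for i, capacity in enumerate(free):
            if capacity >= request:
                free[i] = capacity - request  # place pod on an already planned node
                break
        else:
            free.append(node_cpu - request)  # no planned node fits: plan a new one
    return len(free)


if __name__ == "__main__":
    # Four pending pods on a hypothetical 2-core node type.
    print(nodes_needed([500, 500, 1500, 2000], node_cpu=2000))
```

A real autoscaler also has to account for memory, taints, affinity rules, and node pool limits, which this model ignores.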
Elastic Container Instance (ECI) provides a serverless container runtime environment based on lightweight virtual machines. You can schedule and run applications on instance groups in ACK. This is suitable for offline big data tasks, CI/CD jobs, and burst business scaling. On the Weibo app, 500 ECI pods can be scaled out in 30 seconds to easily respond to burst events.
At the application layer, Kubernetes provides Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA). Alibaba Cloud provides metrics-adapters to support more elasticity metrics. For example, you can adjust the number of pods for an application based on the queries per second (QPS) of the ingress. In addition, the resource profiles of many application workloads are periodic. For example, the business peak of the securities industry is the opening time of the stock market on weekdays. The resources required for the peak are 20 times those for the valley. To solve this problem, Alibaba Cloud Container Service provides a scheduled scaling component so developers can define a scheduled scaling policy to scale out resources in advance and reclaim resources regularly at the valley. This can balance the stability and resource costs of the system well.
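For a metric such as ingress QPS per pod, the core HPA calculation documented by Kubernetes is `desired = ceil(current_replicas * current_metric / target_metric)`, with no change when the ratio is within a small tolerance of 1. The sketch below reproduces that calculation. The function name and the default bounds are assumptions, and the real controller additionally handles missing metrics, pod readiness, and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=100, tolerance=0.1):
    """Replica count suggested by the documented HPA formula."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: do not scale
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 pods each serving 150 QPS against a target of 100 QPS per pod.
    print(desired_replicas(4, current_metric=150, target_metric=100))
```

Because HPA reacts only after load has risen, a workload with a known daily peak, such as the stock market opening, is a better match for the scheduled scaling described above, which adds capacity ahead of time.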
Kubernetes provides powerful functions and flexibility, but it is extremely challenging to operate and maintain a Kubernetes production cluster. Even if a managed Kubernetes service is used, you need to retain the worker node resource pool and perform routine maintenance on the worker nodes, such as upgrading the operating system and installing security patches. You also need to plan the capacity at the resource layer based on your resource usage.
To address the complex O&M of Kubernetes clusters, Alibaba Cloud launched ASK. ASK is compatible with Kubernetes applications and hands Kubernetes O&M over to the cloud infrastructure. This allows developers to focus on the applications.
For serverless containers, we provide two technical solutions: ACK on ECI and ASK.
ACK clusters are functional and flexible, which meets the demands of large Internet enterprises and traditional enterprises. You can run different applications and jobs in an ACK cluster. ACK clusters are intended for Site Reliability Engineering (SRE) teams in enterprises, allowing them to perform customized development and flexible control for Kubernetes.
ACK clusters support the following container runtime technologies.
ECI applies to Kubernetes clusters in the following scenarios:
ASK is a container product tailored for independent software vendors (ISVs), departments of large enterprises, and small- and medium-sized enterprises. You can create and deploy Kubernetes applications without Kubernetes management and O&M expertise, which greatly simplifies management and suits scenarios such as application hosting, CI/CD, AI, and data computing. For example, you can use ASK and Graphics Processing Unit (GPU) instance groups to build an O&M-free AI platform. You can also create a machine learning environment on demand. In either case, the overall architecture is very simple and efficient.
The cloud-native distributed application architecture has the following features: high availability, auto scaling, fault tolerance, easy management, high observability, standardization, and portability. We can build a cloud-native application reference architecture on Alibaba Cloud that includes:
End-to-End Elastic Application Architecture: You can containerize frontend applications and business logic, deploy them in a Kubernetes cluster, and configure HPA based on the application load.
At the backend data layer, you can use cloud-native databases such as Apsara PolarDB. Apsara PolarDB uses an architecture that separates storage from computing and supports scale-out. With the same specification, the performance of Apsara PolarDB is seven times that of the MySQL database, while the cost is half that of the MySQL database.
Systematic High-Availability Design:
This ensures the zone-based availability of the entire system and can tolerate one failed zone.
Application High Availability Service (AHAS) provides the architecture awareness capability and can visualize the system topology. Moreover, AHAS provides the application inspection capability to detect availability issues, for example, whether the number of application replicas meets the availability requirements, and whether multi-zone disaster recovery is enabled for ApsaraDB for RDS (RDS) instances.
In a large-scale distributed system, various stability or performance problems may occur in infrastructures (networks, computing nodes, and operating systems) or applications. Observability helps you understand the status of the distributed system and make decisions accordingly. It also serves as the basis for auto scaling and automated O&M.
In general, observability consists of several important aspects:
Logging (Event Streams)
We provide a complete log solution based on Log Service (SLS) to collect and process application logs and provide capabilities such as ActionTrail and Kubernetes event centers.
CloudMonitor provides comprehensive monitoring of infrastructure services, such as ECS, storage, and networking. For business application performance metrics, such as the heap memory usage of Java applications, Application Real-Time Monitoring Service (ARMS) provides comprehensive performance monitoring for Java and PHP applications without modifying business code. For Kubernetes applications and components, ARMS provides managed Prometheus services, various out-of-the-box monitoring dashboards, and APIs to facilitate third-party integration.
Tracing Analysis provides developers with comprehensive tools for distributed application trace statistics and topology analysis. It can help developers quickly locate and troubleshoot performance bottlenecks in distributed applications and improve the performance and stability of microservice-oriented applications.
From DevOps to DevSecOps
Security is enterprises' biggest concern about container technologies. To systematically improve the security of container platforms, we need to perform comprehensive security protection. First, we need to upgrade DevOps to DevSecOps, emphasizing the need to integrate security concepts into the entire software lifecycle and perform security protection in the development and delivery phases.
ACR Enterprise Edition provides a complete secure software delivery chain. After you upload images, ACR can automatically scan the images to detect common vulnerabilities and exposures (CVEs). You can then use the Key Management Service (KMS) to automatically add digital signatures to the images. You can configure automated security policies in ACK. For example, only the images that have been scanned and meet the launch requirements in the production environment can be released. This way, the entire software delivery process is observable, traceable, and policy-driven. This ensures security and improves delivery efficiency.
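The release gate described above boils down to a small decision rule: admit an image only if it has been scanned, carries a signature, and stays within the allowed vulnerability budget. The sketch below models that rule. The `ImageRecord` fields, the threshold, and the image names are hypothetical and do not reflect the actual ACR or ACK policy schema.

```python
from dataclasses import dataclass


@dataclass
class ImageRecord:
    """Hypothetical summary of what the registry knows about an image."""
    name: str
    scanned: bool
    high_severity_cves: int
    signed: bool


def may_deploy(image: ImageRecord, max_high_cves: int = 0) -> bool:
    """Allow release only for scanned, signed images within the CVE budget."""
    return (
        image.scanned
        and image.signed
        and image.high_severity_cves <= max_high_cves
    )


if __name__ == "__main__":
    candidates = [
        ImageRecord("shop/web:1.4", scanned=True, high_severity_cves=0, signed=True),
        ImageRecord("shop/web:1.5", scanned=True, high_severity_cves=2, signed=True),
        ImageRecord("shop/job:0.9", scanned=False, high_severity_cves=0, signed=False),
    ]
    for image in candidates:
        print(image.name, "allowed" if may_deploy(image) else "blocked")
```

In a real cluster this kind of rule would be enforced at admission time rather than in application code, so that an image that fails the policy never starts running.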
During runtime, applications face many risks, such as CVEs and virus attacks. Alibaba Cloud Security Center provides security monitoring and protection for applications during runtime.
Alibaba Cloud Security Center can monitor container application processes and networks, and detect application exceptions and vulnerabilities in real time. When Alibaba Cloud Security Center detects a problem, it notifies you by email or SMS and automatically isolates and rectifies the problem. For example, a mining worm can exploit configuration errors to attack container clusters. In this case, Alibaba Cloud Security Center can help you locate and remove it.
In February 2020, we released the first fully managed and Istio-compatible ASM in the industry. The control plane components of ASM are managed by Alibaba Cloud and independent of user clusters on the data plane. The hosting mode greatly simplifies the deployment and management of the Istio service mesh and decouples the lifecycle of the service mesh from the Kubernetes clusters. This makes the architecture simpler and more flexible and the system more stable and scalable. ASM integrates the Alibaba Cloud observability service and SLS based on Istio, which helps you manage applications in the service mesh more efficiently.
On the data plane, ASM supports various computing environments, including ACK Kubernetes clusters, ASK clusters, and ECS virtual machines. Cloud Enterprise Network (CEN) and ASM can implement service mesh between Kubernetes clusters across multiple regions and VPC networks. This enables ASM to implement traffic management and phased release for large-scale distributed applications in multiple regions. ASM will soon support multi-cloud and hybrid clouds.
Cloud migration has become inevitable. However, because of data sovereignty, security, and privacy requirements, some enterprises cannot migrate all of their businesses directly to the cloud and instead adopt a hybrid cloud architecture. Gartner predicts that 81% of enterprises will adopt multi-cloud or hybrid clouds. Hybrid cloud architecture has become the new norm for enterprise cloud migration.
Traditional hybrid cloud architecture is designed to abstract and manage cloud resources. However, differences in infrastructures and security architecture capabilities between different cloud environments can separate an enterprise's IT architecture from its O&M system. This makes hybrid cloud implementation more complex and increases O&M costs.
In the cloud-native era, technologies such as Kubernetes shield infrastructure differences, which enables better centralized resource scheduling and application lifecycle management in hybrid cloud environments. Application-centric hybrid cloud architecture 2.0 is now available.
The following lists several typical scenarios:
Based on ACK, the hybrid cloud network, storage gateway, and database replication capabilities of Alibaba Cloud, we can help enterprises build a new hybrid cloud IT architecture.
Hybrid Cloud Architecture 2.0
ACK provides a centralized cluster management capability. In addition to Alibaba Cloud Kubernetes clusters, ACK can also manage your Kubernetes clusters in the on-premises Internet data center (IDC) and on other clouds. The centralized control plane enables unified security governance, observability, application management, backup, and recovery for multiple clusters. For example, SLS and managed Prometheus services can provide you with a unified observability dashboard for off-premises and on-premises clusters without intrusive code changes. Security Center and AHAS help you detect and rectify security and stability risks in the hybrid cloud architecture.
ASM provides a unified service governance capability, which enables access to the nearest service, failover, and phased release with the multi-region and hybrid cloud network capabilities provided by CEN and Smart Access Gateway (SAG.) This compound solution can be used in scenarios, such as cloud disaster recovery and active geo-redundancy, to improve business continuity.
Cloud-Native Hybrid Cloud Solution
UniCareer is an e-learning career development platform that serves users in many regions around the world. Its applications are deployed in multiple Kubernetes clusters in four regions of Alibaba Cloud. In these clusters, CEN is used to connect multiple cross-region VPC networks. An ASM instance is used to manage the traffic of microservice-oriented applications in multiple Kubernetes clusters.
Service routing policies are centrally managed by the ASM control plane and delivered to multiple Kubernetes clusters. User requests are distributed to the ingress gateway in the nearest region through Domain Name System (DNS). Service endpoints in that region are then accessed first through ASM. If services in this region are unavailable, the requests are automatically routed to other regions.
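The "local region first, then fail over" behavior can be summarized as walking a preference list and picking the first region whose service is healthy. The sketch below shows that selection logic only. The region names and health map are made up for the example, and in practice the routing is carried out by the mesh data plane according to policies from the ASM control plane, not by application code.

```python
def pick_region(preferred_regions, healthy):
    """Return the first healthy region in preference order, or None.

    preferred_regions: regions ordered from nearest to farthest.
    healthy: mapping of region name to whether the service is available there.
    """
    for region in preferred_regions:
        if healthy.get(region, False):  # unknown regions count as unavailable
            return region
    return None


if __name__ == "__main__":
    # Hypothetical preference order for a user closest to region-a.
    order = ["region-a", "region-b", "region-c"]
    print(pick_region(order, {"region-a": True, "region-b": True}))
    print(pick_region(order, {"region-a": False, "region-b": True}))
```

The first call stays in the nearest region, while the second falls through to the next region because the nearest one is marked unavailable.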
Cloud-Native Hybrid Cloud Management
The hybrid cloud solution of Alibaba Cloud has the following features:
Hitless Migration of Windows Containers to the Cloud
Now, let's talk about support for Windows containers. As of 2020, the Windows operating system still dominates the market, with a market share of 60%. Enterprises use a large number of Windows apps, such as Enterprise Resource Planning (ERP), Customer Relationship Management (CRM), and ASP.NET applications. Windows containers and Kubernetes enable you to implement containerized delivery without rewriting the code of .NET applications. This maximizes the elasticity and agility of the cloud to achieve fast iteration and scaling of business applications.
ACK supports Windows Server 2019 in Kubernetes container clusters:
1) Provide a consistent user experience and unified capabilities for Linux and Windows applications.
2) Support hybrid deployment and interconnection of Linux and Windows applications in a cluster. For example, PHP applications running on Linux nodes can access the SQL Server database running on Windows nodes.
The following briefly introduces the cloud-native marketing strategy of Alibaba Cloud.
New cornerstone: Container technology allows users to use cloud resources. Cloud-native technology helps quickly deliver the value of the cloud.
New computing power: The innovation of the cloud-native-based software and hardware integration technology improves computing efficiency and accelerates intelligent business upgrades.
New ecosystem: We will provide the technology ecosystem and the Global Partner Program to enable more enterprises to enjoy the benefits of Alibaba's technologies in the age of the cloud.