What is cloud-native? Everyone has their own interpretation of this term. Drawing on extensive discussions and practical experience from various projects, this article presents the interpretation of cloud-native technologies by Alibaba's delivery experts. It discusses how to build cloud-native applications, key cloud-native technologies, and ideas about cloud-native implementation.
The Internet has changed the way people live, work, study, and entertain themselves. The rapid development of technologies has driven the evolution of the cloud computing market from the early physical machines (bare metal instances) to virtual machines and then to containers, while the Internet architecture evolved from centralized architectures to distributed architectures, and then to cloud-native architectures. Nowadays, the term "cloud-native" has been elevated by enterprises and developers to the status of an industry standard and the future of cloud computing. If I were to describe cloud-native technologies in one sentence, it would be "the future is here, but not evenly distributed."
Cloud-native technologies (architectures) have seen a sharp increase in popularity, but the concept is still interpreted differently by different people, despite the wide-ranging articles and discussions on this topic in the online community and inside Alibaba. In my opinion, we are exploring what it means to be cloud native and trying to understand and put cloud-native technologies into practice. Therefore, there is still no clear or overarching standard definition.
A cloud migration project I took part in recently involved many cloud-native technologies. I would like to take this opportunity to share my insights while drawing on the discussions and practical experience from the project.
Before getting into this topic, let's see how two industry influencers, Pivotal Software and CNCF, define "cloud native."
Pivotal Software is a leader in the field of agile development (previously contracted with Google) and has an impressive pedigree (it was founded by EMC and VMware). It launched Pivotal Cloud Foundry (a big hit in the field of PaaS between 2011 and 2013) and the Spring Framework and is a pioneer in cloud-native technologies. The following figure shows how Pivotal defines "cloud native":
Matt Stine at Pivotal Software first proposed the concept of cloud-native in 2013. In 2015, in his book "Migrating to Cloud-Native Application Architectures", Matt Stine defined the key characteristics of cloud-native application architectures, including twelve-factor applications, microservice architecture, self-service agile infrastructure, API-based collaboration, and antifragility. Matt Stine revised his definition in 2017 and indicated six characteristics of cloud-native architecture: modularity, observability, deployability, testability, disposability, and replaceability. The latest piece published on Pivotal Software's official website characterizes cloud-native applications and services as an integration of the concepts of DevOps, continuous delivery, microservices, and containers.
The Cloud Native Computing Foundation (CNCF) is a well-known organization in the industry. It is a foundation co-sponsored by leading open-source infrastructure companies such as Google and Red Hat. The mission of CNCF was to compete in the container market dominated by the then-prominent platform Docker. Through the Kubernetes project, CNCF has maintained undisputed leadership in the field of orchestration in the open-source community and is the champion in defining and promoting cloud-native architectures. Here is how CNCF defines "cloud-native":
In 2015, CNCF originally defined three characteristics of cloud-native architectures: containerized encapsulation, automated management, and microservices. In 2018, CNCF updated its definition of cloud-native architectures to include two new features, declarative API and service mesh (a new technology that emerged in the open-source community in 2017; it is a parallel technology to microservices). These technologies are used to build loosely coupled systems that are highly fault-tolerant and easy to manage and observe.
As the community continues to grow the ecosystem and push the boundary of cloud-native architectures, the definition of cloud native is constantly changing. Organizations (like Pivotal and CNCF) define this concept differently, and one organization may use different definitions at different times. Given how quickly the underlying technologies evolve, we can expect the definition of cloud native to continue to shift in the future.
As for the two different definitions given by Pivotal and CNCF, I believe the distinction is caused by the respective organizational structures and perspectives adopted by the two industry influencers:
Pivotal Software is a pioneer in the concepts and methodologies of cloud-native architectures, while CNCF contributes to best practices.
However, Pivotal Software's latest definition now embraces container technology, while CNCF's definition builds on microservices. So are they really all that different? We welcome you to tell us your opinion in the comment section below.
From the birth of the Internet to the present, we have adopted Internet thinking and then Internet+ thinking (which is essentially Internet native). When enterprises reach a certain stage, they need to develop value thinking (or, value-native thinking). In the same way, cloud computing practitioners need to develop cloud-native thinking. In any technological reform or widespread adoption of new methods, abstract paradigms have always preceded tangible solutions.
Drawing on the definitions given by Pivotal Software and CNCF, I came to the following understanding of what it means to be cloud-native:
Being cloud-native means building an application system that runs on the cloud through both a methodology (such as that from Pivotal Software) and a technical framework (such as that from CNCF). Such an application system breaks away from traditional system building methods and makes full use of the native capabilities of the cloud to maximize its value. It adopts the characteristics of cloud-native architectures in order to rapidly empower businesses.
This abstract interpretation can be broken down into four questions:
The emergence of cloud computing is closely related to the development and maturity of virtualization technology. It is an emerging IT infrastructure delivery method that relies on virtualization technology to standardize, abstract, and scale IT hardware resources and software components into product-like services that allow users to "pay as they go". In a sense, this reconstructs the IT industry's supply chain. Its models of service delivery include Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Function as a Service (FaaS), and Data as a Service (DaaS):
IaaS indicates the fundamental and underlying capabilities of cloud computing, such as computing, storage, network, and security.
PaaS generally refers to the high-level domain- or scenario-oriented services that are built on top of the underlying cloud capabilities, such as cloud databases, cloud object storage, middleware (including caches, message queues, load balancing, service mesh, and container platforms), and application services.
FaaS is a serverless computing architecture, through which users can run applications without purchasing or concerning themselves with infrastructure and elastically scale services using the pay-as-you-go billing method. It is also an extreme form of evolution from PaaS. Currently, three types of solutions are available under this architecture:
DaaS treats data as a service. The architecture extends to upper-layer applications and, when used with AI and cloud services, can deliver various high-value services. These services include big data-based decision making, video and facial recognition, deep learning, and scenario-based semantic understanding, among others. This is also the core strength of the cloud of the future.
As technologies and open-source solutions continue to develop and cloud service providers provide more products and capabilities, every layer of today's technology architecture, from physical machines, virtual machines, and containers to middleware and then to the serverless architecture, has been gradually standardized. The more standardized the layers are, the greater the added value they can contribute. Relatively common technologies that are not directly related to business (such as service mesh) have also been standardized and incorporated into the underlying infrastructure. Every time a layer of the technology architecture becomes standardized, it will eliminate some of the inefficient and tedious tasks. In addition, the application layer provides emerging technologies, such as AI, to help enterprises reduce the costs incurred during the exploration of suitable solutions, speed up the verification and delivery of new technologies, and truly empower the business.
Meanwhile, users can choose the cloud products that best fit their needs, just like building with LEGO blocks, and use readily available resources to avoid repetitive work. This greatly improves efficiency in each stage of software and service development and accelerates the implementation of various applications and architectures. Users who are already on the cloud can realize huge cost savings by consuming resources as needed and scaling out at any time.
The preceding section discusses the strong capabilities of the cloud. In comparison with traditional applications, new cloud applications need to be adapted to these capabilities in each stage of the entire application lifecycle. This involves adaptation during the design of software architecture, development, construction, deployment, delivery, monitoring, and O&M. I will discuss this process in terms of various issues users must face.
Great architectures come into being after evolving and progressing over time. They are not created all at once. Therefore, it is meaningless to talk about architectural design in the abstract. The purpose of architectural evolution must be to solve a certain problem. We can address the problems listed below to better understand the design of the cloud-native architecture:
An individual microservice application is well suited to the cloud-native architecture: it has low complexity and relies on a comprehensive set of functions for monitoring, governance, deployment, and scheduling provided by a strong underlying system. However, from the perspective of the overall system, the complexity does not decrease. Instead, enterprises must bear the high costs of building a robust underlying system with strong architectural and O&M capabilities.
In addition, the technology stacks and middleware systems used by enterprises to achieve these functions are closed and highly private, making it difficult to meet all business needs (as is the case with Alibaba). Cloud hosting can reduce the overall complexity of such a project. The cloud service provider can take over the complex underlying system and provide attentive services. Projects will eventually evolve into an infrastructure-free design and use YAML or JSON declarative code to orchestrate the underlying infrastructure, middleware, and other resources. In this way, the cloud can meet every need of an application. Eventually, enterprises will embrace an open and standard cloud technology system.
We introduced DevOps to address the problem of the continuous delivery of applications.
DevOps is a concept everyone is familiar with. I see it as a series of values, principles, methods, practices, and tools designed to achieve fast delivery and continuous optimization. Its core advantage is to close the gap between R&D and O&M, expedite the software delivery process, and improve software quality. The chart below shows a DevOps pipeline:
The platforms involved in this process include GitHub, Travis, Artifactory, Spinnaker, FIAAS, Kubernetes, Prometheus, Datadog, Sumo Logic, and ELK.
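The pipeline idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stage names and the `run_pipeline` helper are hypothetical and do not correspond to the API of any platform listed above. It shows the core behavior of a delivery pipeline, which runs stages in order and stops at the first failure so that a broken build never reaches production.

```python
from typing import Callable, List, Tuple

# A stage is a (name, step) pair; the step returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure (fail fast)."""
    log = []
    for name, step in stages:
        ok = step()
        log.append(f"{name}: {'ok' if ok else 'failed'}")
        if not ok:
            break  # later stages never run after a failure
    return log


# Hypothetical stages; real steps would call build, test, and deploy tools.
stages = [
    ("build", lambda: True),
    ("test", lambda: True),
    ("deploy", lambda: False),  # simulated deployment failure
    ("monitor", lambda: True),
]

for line in run_pipeline(stages):
    print(line)
```

Because the simulated deployment fails, the monitoring stage is never reached. In a real pipeline, this is the point at which a rollback or an alert would be triggered.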
The key to truly implementing and practicing DevOps lies in the answers to the following questions:
In essence, DevOps supports O&M services. By introducing a series of automation tools for new technologies and development into O&M, it brings development closer to the production environment and manages the entire development and O&M processes, ensuring freedom and innovation. When monitoring and fault prevention and control tools are used together with function switches, they can help achieve a balance between the user experience and fast delivery.
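A function switch (often called a feature flag) can be sketched as follows. This is a minimal illustration, not the mechanism of any specific product: the `FunctionSwitch` class, its `rollout_percent` parameter, and the `checkout` function are all hypothetical names. The sketch shows two things a switch typically offers: a gradual rollout to a stable percentage of users, and an immediate way to turn a feature off without redeploying.

```python
import hashlib


class FunctionSwitch:
    """Hypothetical function switch with percentage rollout and a kill switch."""

    def __init__(self, name: str, rollout_percent: int = 0, enabled: bool = True):
        self.name = name
        self.rollout_percent = rollout_percent  # 0..100
        self.enabled = enabled  # set to False to turn the feature off at once

    def is_on(self, user_id: str) -> bool:
        if not self.enabled:
            return False
        # Hash the switch name and user ID so each user lands in a stable bucket.
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode()).hexdigest()
        bucket = int(digest, 16) % 100
        return bucket < self.rollout_percent


def checkout(user_id: str, switch: FunctionSwitch) -> str:
    """Business code asks the switch which path to take."""
    return "new checkout flow" if switch.is_on(user_id) else "old checkout flow"


switch = FunctionSwitch("new-checkout", rollout_percent=100)
print(checkout("user-1", switch))

# If monitoring reports a fault, the switch is flipped without a redeploy.
switch.enabled = False
print(checkout("user-1", switch))
```

Hashing the user ID keeps each user on the same side of the switch between requests, so a partial rollout does not make the experience flicker.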
If technology professionals are to focus only on business solutions and business code in the future, they will need to quickly integrate the abundant technical products and cloud vendor platforms already available on the market. This would allow them to focus on finding solutions and connecting business and technology in a bid to satisfy increasingly diversified and complicated business needs. In terms of O&M, the cloud hides the complexity of the infrastructure, and O&M work shifts toward building an O&M mid-end and toolchains for large-scale operations. This allows practitioners to focus on cost, efficiency, and stability while ensuring the steady progress of application development.
The earliest container technology, known as chroot jail, was developed in 1979. In 2008, it was redefined as LXC (Linux Containers), which combines the resource management of cgroups with the view isolation of namespaces to achieve process-level isolation. However, the greatest innovation in container technology is the container image (the Docker image). An image contains the complete environment (the file system of the entire operating system) required to run an application. It is also consistent, lightweight, portable, and language independent. It allows users to achieve "build once, run anywhere" (that is, in development, testing, and production environments) and completely standardizes building, distribution, and delivery activities. It also supplies the foundation of immutable infrastructure.
Kubernetes is to cloud computing and cloud-native architectures what Linux is to operating systems.
As Google's open-source container orchestration and scheduling system, which draws on the company's experience with Borg, Kubernetes makes it possible to use container applications in large-scale industrial production environments.
Relying on declarative APIs, extensible programming interfaces (through CRDs and controllers), and an advanced design philosophy, Kubernetes came to dominate the field of container orchestration (beating out Docker Swarm and Apache Mesos) and has become the de facto standard for container orchestration systems.
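The declarative API and controller pattern can be illustrated with a small reconciliation loop. This is a conceptual sketch in plain Python, not Kubernetes code: the `Cluster` class, `reconcile`, and `converge` are hypothetical names, and the cluster is an in-memory list. The user declares a desired state (here, a replica count), and the controller repeatedly compares it with the actual state and takes one step to close the gap.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cluster:
    """Hypothetical in-memory stand-in for the actual state of a cluster."""
    running: List[str] = field(default_factory=list)
    next_id: int = 0

    def start(self) -> None:
        self.running.append(f"pod-{self.next_id}")
        self.next_id += 1

    def stop(self) -> None:
        self.running.pop()  # stop the most recently started instance


def reconcile(desired_replicas: int, cluster: Cluster) -> str:
    """One control-loop pass: observe, compare, and take a single step."""
    actual = len(cluster.running)
    if actual < desired_replicas:
        cluster.start()
        return "started"
    if actual > desired_replicas:
        cluster.stop()
        return "stopped"
    return "in sync"


def converge(desired_replicas: int, cluster: Cluster, max_passes: int = 100) -> List[str]:
    """Repeat reconcile passes until actual state matches desired state."""
    actions = []
    for _ in range(max_passes):
        action = reconcile(desired_replicas, cluster)
        actions.append(action)
        if action == "in sync":
            break
    return actions


cluster = Cluster()
print(converge(3, cluster), cluster.running)

cluster.running.remove("pod-1")  # simulate an instance crashing
print(converge(3, cluster), cluster.running)
```

The user never issues "start" or "stop" commands. Changing the desired number, or losing an instance, is enough for the loop to restore the declared state, which is why the model frees users from hands-on resource management.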
The Kubernetes platform frees users from resource management, further standardizes the infrastructure, reduces complexity, and improves resource utilization. In addition, Kubernetes reduces the cost of cross-data center deployment of hybrid clouds, multiple clouds, and edge clouds.
Service mesh aims to decouple the business logic from the non-business logic, allowing developers to focus solely on the business logic. The solution separates a number of client SDK functions unrelated to the business logic (such as service discovery, routing, load balancing, and traffic shaping and degradation) from business applications and puts them into a separate proxy (sidecar) process that is pushed down to the infrastructure middleware mesh (similar to the shift from TDDL to DRDS). With this solution, an application faces fewer risks from changes in the system framework, becomes more streamlined and lightweight, and starts faster. This makes it easier to migrate the application to the serverless architecture. The mesh can be iterated and upgraded automatically and independently of the application, which facilitates global service governance, phased release, and monitoring. In addition, the mesh boundary can be extended to the database mesh, cache mesh, and message mesh. In this way, the mesh can standardize inter-service communication in much the same way that TCP/IP standardized network communication.
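The separation described above can be sketched in a single process. This is an illustration only: in a real mesh the sidecar is a separate proxy process, whereas here the hypothetical `Sidecar` class is a plain Python object, and the service instances are ordinary functions. The point is that the business function `place_order` contains no discovery, load-balancing, or retry code.

```python
from itertools import cycle
from typing import Callable, Dict, List

Instance = Callable[[str], str]


class Sidecar:
    """Hypothetical proxy that owns discovery, load balancing, and retries."""

    def __init__(self, registry: Dict[str, List[Instance]], retries: int = 2):
        # Service discovery: service name -> round-robin iterator over instances.
        self._pools = {name: cycle(instances) for name, instances in registry.items()}
        self._retries = retries

    def call(self, service: str, request: str) -> str:
        last_error = None
        for _ in range(self._retries + 1):
            instance = next(self._pools[service])  # round-robin load balancing
            try:
                return instance(request)
            except ConnectionError as err:
                last_error = err  # try the next instance
        raise RuntimeError(f"{service} unavailable") from last_error


def broken_instance(request: str) -> str:
    raise ConnectionError("instance down")


def healthy_instance(request: str) -> str:
    return f"stock for {request}: 7"


def place_order(item: str, mesh: Sidecar) -> str:
    """Business logic only: no routing, retry, or discovery code here."""
    return "order accepted, " + mesh.call("inventory", item)


mesh = Sidecar({"inventory": [broken_instance, healthy_instance]})
print(place_order("book", mesh))
```

Because the retry and routing policy lives in the sidecar, it can be changed or upgraded without touching `place_order`, which mirrors how a mesh evolves independently of the applications it serves.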
With infrastructure as code (IaC), the infrastructure and its complete life cycle (creation, destruction, scaling, and replacement) are described in code and orchestrated, executed, and managed with appropriate tools, such as Terraform, ROS, and CloudFormation. For example, users only need to define the code and can then easily create all the basic resources needed by applications (such as Elastic Compute Service (ECS), Virtual Private Cloud (VPC), ApsaraDB for RDS, Server Load Balancer (SLB), and ApsaraDB for Redis), without frequently switching between pages in the console to apply for and purchase resources. With this approach, the infrastructure code is version-controlled, reviewable, testable, and traceable. It can be rolled back, maintains consistency, and prevents configuration drift. It is also easy to share, create templates for, and scale the infrastructure code. In addition to improving overall O&M efficiency and quality, IaC allows users to easily see the full picture of the infrastructure.
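The way an IaC tool turns a description into actions can be sketched as a diff between the declared resources and what actually exists. The example below is a simplified illustration and does not reproduce the configuration format or planning algorithm of Terraform, ROS, or CloudFormation. The `plan` function and the resource names and attributes are hypothetical.

```python
from typing import Dict, List

# Resource name -> attributes. Names and attributes are made up for illustration.
Resources = Dict[str, Dict[str, object]]


def plan(desired: Resources, actual: Resources) -> List[str]:
    """Compare declared resources with existing ones and list required changes."""
    steps = []
    for name in sorted(desired):
        if name not in actual:
            steps.append(f"create {name}")
        elif desired[name] != actual[name]:
            steps.append(f"update {name}")  # the existing resource has drifted
    for name in sorted(actual):
        if name not in desired:
            steps.append(f"delete {name}")  # exists but is no longer declared
    return steps


# The desired state lives in version control and is reviewed like any other code.
desired = {
    "vpc-main": {"type": "vpc", "cidr": "10.0.0.0/16"},
    "web-1": {"type": "ecs", "cpu": 2, "memory_gb": 4},
    "db-1": {"type": "rds", "engine": "mysql"},
}

# The actual state, as it might be reported by the cloud platform.
actual = {
    "vpc-main": {"type": "vpc", "cidr": "10.0.0.0/16"},
    "web-1": {"type": "ecs", "cpu": 1, "memory_gb": 4},  # changed by hand in the console
    "cache-old": {"type": "redis"},
}

for step in plan(desired, actual):
    print(step)
```

Running the plan before applying it makes configuration drift visible: the manually resized `web-1` shows up as an update, and the undeclared `cache-old` shows up as a deletion. Rolling back amounts to checking out an earlier version of `desired` and planning again.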
Cloud-based IDEs cover the entire R&D lifecycle, providing a complete experience that integrates development, debugging, pre-release, production environments, and CI/CD release. The cloud platform also offers a variety of code library templates, improves compilation speed through distributed computing, and provides intelligent code recommendation and optimization, automatic bug scanning, and identification of logical and systematic risks. It is conceivable that the development models of the cloud era, completely different from those of the local development environment, will feature higher development efficiency, faster iteration speed, and better quality control.
As a member of the GTS delivery team that was tasked with empowering enterprises to succeed in their digital transformations, I have been thinking about the ways to help traditional enterprises transform themselves and embrace cloud-native architectures by drawing on the experience of the Internet industry. Here is a roadmap for the implementation of cloud-native architectures.
The Y-axis in this figure indicates business agility. To achieve cloud-native business agility, you need to:
The X-axis in the figure indicates business robustness. To achieve business robustness, you need to:
Cloud-native architectures seem to be extremely appealing, but once you go deep into the stage of implementation, you will find that they are very complicated. The complexity is not only reflected in the wide range of new concepts and technical features, but also in the huge gap between customers' expectations and the value created by cloud-native technologies and the uncertainty about the future. In the future, I will continue to share and discuss my thoughts, experiences, and practices. This is the first of a series of articles. I hope my writings can contribute to the digital transformation of enterprises.
In the cloud era, we require novel thinking and concepts to properly understand application architectures and IT infrastructure in order to correctly answer the question "what does it mean to be cloud-native." The future is undoubtedly cloud-native. Therefore, in addition to tools, enterprises seeking to transform themselves need a complete philosophy that progresses from concepts to methodologies and then to tools. Only in this way can we better embrace the arrival of the cloud era and maximize the value of cloud-native architectures.
This is the best of times for developers. This is the best of times for cloud vendors. In addition, this is the best of times for professionals specializing in cloud service delivery.
The future is here, but not evenly distributed. Let's work together to understand, embrace, and deliver cloud native.
Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.