How to build mobile DevOps? -Alibaba Cloud Developer Community

From: Alibaba technology 2020-11-27 1400

Introduction: DevOps, an excellent software delivery concept, is already widely practiced on the server side. Can it be applied to mobile delivery as well? Given the differences between mobile and server scenarios, how does mobile DevOps differ from server DevOps, and what challenges does it face? This article describes the thinking, challenges, and solutions of EMAS, Alibaba's cloud-native application R&D platform, in building a cloud-native Mobile DevOps, and explains its design architecture and technical details.

Author | Zhou Mu
Source | Alibaba Technology official account

I. Introduction to Mobile DevOps

1 What is mobile DevOps

The well-known DevOps

As of 2020, DevOps is no longer a new concept, and most people have their own understanding of it. Yet it is hard to say precisely what DevOps is, and the industry still has no universally accepted definition. The reason is that DevOps is really a collection of ideas, which makes it hard to pin down concretely. Literally, the word "DevOps" covers the whole software life cycle from Dev (Development) to Ops (Operations). Among the many definitions of DevOps, I personally find the one from Azure DevOps [1] the most accurate and specific:

DevOps is a compound of development (Dev) and operations (Ops). It combines people, processes, and technologies to continuously provide value to customers. What does DevOps mean to the team? DevOps enables isolated roles (development, IT operations, quality engineering, and security) to coordinate and collaborate to produce better and more reliable products. By adopting the DevOps culture, practices, and tools, the team can better respond to customer needs, enhance confidence in the built applications, and achieve business goals faster.

This definition contains several key points:

  • DevOps is a combination of people, processes, and technologies.
  • DevOps allows isolated roles to coordinate and collaborate.
  • DevOps is a concept that not only builds a culture but is also supported by automation tools.
  • The aim is to produce better and more reliable products faster.

From DevOps to mobile DevOps

What we usually discuss as DevOps is actually server-side DevOps. Since DevOps is an excellent software delivery concept, why not apply it to mobile delivery? This is the mobile DevOps introduced in this article.

Mobile DevOps differs greatly from server DevOps due to the differences between mobile and server scenarios. It is mainly reflected in the following aspects:

1) Automated builds of mobile applications are more complex

First, the mobile build environment is fragmented. Android and iOS require build environments based on different operating systems and build toolchains. Even within one platform, toolchain versions are fragmented: Android builds must support multiple versions of the Android SDK and Gradle at the same time, and iOS builds must support multiple versions of Xcode and Ruby.

Second, the Mac devices that iOS builds depend on are not standard data center hardware, so they cannot be deployed in standard data centers. You usually have to build your own Mac data center, which is a challenge for O&M and stability.

Automated building is an essential DevOps capability, so mobile DevOps must solve automated client builds and one-click package delivery through technical means.

2) Mobile terminals are severely fragmented, and application compatibility is a huge challenge

Unlike the consistent server deployment environment, the mobile application runtime environment is highly fragmented, and compatibility test coverage is much harder to achieve than on the server. Fragmentation is especially severe on Android:

Android has many phone manufacturers and numerous models on the market, and different manufacturers "optimize" the system at the underlying level. In theory, any model that testing does not cover may have compatibility problems. The following figure shows the October 2020 distribution of top Android models from the Baidu Statistics Traffic Research Institute [2]: the combined market share of the top 10 models is less than 15%, which shows how severe model fragmentation is.

Differences in operating system versions affect application behavior even more directly. Major version upgrades commonly lead to application incompatibility. Each major release of an operating system is a test of application compatibility, and users of the old system cannot be abandoned while compatibility with the new system is being addressed.

The following figure shows the October 2020 Android version distribution from the Baidu Statistics Traffic Research Institute: Android 10.0 had been released for more than a year, yet its market share was less than 50%, and older versions of the operating system still made up the mainstream.

Because of this terminal fragmentation, mobile DevOps must provide mobile testing capabilities and automatically complete large numbers of compatibility tests on real devices.

3) Long app release and update cycle

Two weeks after a new version of an application is released, its update rate may still not exceed 50%. A server release, by contrast, can reach all servers in a very short time. A long release cycle means a higher cost of mistakes: once a buggy version is released, it may take a long time to replace it through updates and upgrades.

This requires mobile DevOps to have a complete phased release mechanism, so that a problematic application is not published to all users at once. In addition, once a buggy version has been released, mobile DevOps needs a hotfix capability that releases incremental patch packages to fix bugs more easily and quickly.
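One common way to implement a phased release is to map each device to a stable bucket and compare the bucket with the current rollout percentage. The sketch below illustrates that idea only; the article does not describe how EMAS selects devices, and the function names, the device and release identifiers, and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def rollout_bucket(device_id: str, release_id: str) -> int:
    """Map a device to a stable bucket in [0, 100) for one release (hypothetical scheme)."""
    digest = hashlib.sha256(f"{release_id}:{device_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_in_rollout(device_id: str, release_id: str, percent: int) -> bool:
    """Return True if the device should receive the release at this rollout percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return rollout_bucket(device_id, release_id) < percent


if __name__ == "__main__":
    devices = [f"device-{i}" for i in range(1000)]
    for percent in (1, 10, 50, 100):
        selected = sum(is_in_rollout(d, "app-2.3.0", percent) for d in devices)
        print(f"{percent:3d}% rollout -> {selected} of {len(devices)} devices")
```

Because the bucket depends only on the device and the release, a device that is included at 10% stays included when the rollout widens to 50%, so raising the percentage only ever adds devices.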

4) Mobile applications run on a massive number of mobile devices

Server services run in a specific cluster and can be managed and maintained in a unified manner. Mobile applications run on users' phones, and for a super app such as Mobile Taobao that means hundreds of millions of devices.

This requires mobile monitoring products to implement mobile O&M monitoring with big data technology, and even a remote log function that pulls error logs from specified devices to locate and troubleshoot errors.

Based on the points above and on the DevOps definition of the software delivery lifecycle, the following table summarizes the mobile DevOps application lifecycle and the capability requirements for each phase:

2 What is Mobile DevOps

Mobile DevOps is EMAS's concrete implementation of the mobile DevOps concept.

First, an introduction to EMAS (Enterprise Mobile Application Studio). EMAS is a leading domestic cloud-native application R&D platform from Alibaba Cloud (for mobile apps, H5 applications, mini programs, web applications, and so on). Built on a wide range of cloud-native technologies (Backend as a Service, Serverless, DevOps, and low code), it aims to provide enterprises and developers with one-stop application R&D management services that cover the entire application lifecycle: development, testing, O&M, and operations. For more information about EMAS, see the EMAS details page on the Alibaba Cloud official website.

Mobile DevOps is the concrete product output of EMAS's mobile DevOps concept and the central product of EMAS. It works with all other EMAS products to realize the mobile DevOps concept described above. Mobile DevOps links the EMAS products that were previously isolated in individual lifecycle stages, as shown in the preceding figure, into a complete closed loop, and upgrades EMAS from a mobile middleware platform to a mobile R&D platform. Mobile DevOps combines the following EMAS products to form the Mobile DevOps of EMAS:

Mobile DevOps history

Mobile DevOps is the commercial version of the Group's internal mobile R&D platform. It was first developed jointly by Alibaba Cloud and Mobile Taobao in 2017, and the first public cloud version was released in April 2020.

The following figure shows the development history of Mobile DevOps. This history is in effect the history of Alibaba Group's mobile R&D technology, and it reflects the accumulation of Alibaba's mobile technology and engineering R&D philosophy over the past ten years.

Status quo of Mobile DevOps

1) Apsara Stack has begun to take shape

The Apsara Stack edition of Mobile DevOps is mainly designed for key customers, especially those undergoing digital transformation. These customers have high security requirements and can accept only Apsara Stack deployment; at the same time, they are willing to invest in improving R&D efficiency.

In 2018, Mobile DevOps was officially launched in the Apsara Stack scenario. It has since created value for dozens of key customers in multiple industries, enabling the digital transformation of enterprise R&D processes.

2) Public cloud in free beta testing

Compared with Apsara Stack, the Mobile DevOps public cloud is oriented more toward small, medium, and micro enterprises. These customers want to improve R&D efficiency but are price sensitive, and the public cloud is a good form for serving them. In addition, some external-facing businesses of Alibaba Group (such as Exclusive DingTalk) cannot use the Group's internal R&D platform for mobile DevOps, so the Mobile DevOps public cloud is a good choice for them as well.

Since July 2020, the Mobile DevOps public cloud has been in free public beta. It currently serves a large number of small, medium, and micro customers, as well as Alibaba Group's own Exclusive DingTalk, Government DingTalk, and Singing Duck.

II. Cloud-native Mobile DevOps

Compared with Apsara Stack, building Mobile DevOps in cloud-native form for the public cloud scenario faces more technical challenges. This chapter shares our thinking, the challenges, and the solutions in building a cloud-native Mobile DevOps.

1 Why do we need public cloud Mobile DevOps

Provide inclusive Mobile DevOps services for small, medium, and micro customers

Although Apsara Stack deployment offers exclusive use and intranet security isolation, its high delivery cost means that only top-tier players in an industry can afford it. The cost of Apsara Stack Mobile DevOps is estimated as follows:

Based on the above cost calculation, the investment costs of Apsara Stack in the first, second, and third years are 1.5 million, 50 million, and 50 million respectively, totaling 200 million, which is unacceptable for small, medium, and micro customers.

As the infrastructure of the new era, comparable to utilities such as water, electricity, and coal, Alibaba Cloud needs to provide inclusive cloud services for small, medium, and micro enterprises as well as for major customers. Mobile DevOps in public cloud form fits this idea exactly. With cloud-native elastic scaling and pay-as-you-go billing, the cost of using Mobile DevOps can be greatly reduced for these customers. In addition, in the public cloud scenario the DevOps R&D process can be tailored to the characteristics of small, medium, and micro customers, so that it suits the target customers better.

Link the EMAS product line to provide developers with a one-stop mobile R&D platform

The launch of public cloud Mobile DevOps effectively links the existing EMAS products, such as Mobile testing, Mobile monitoring, and Mobile hotpatch, so that EMAS covers the entire application lifecycle and is upgraded from mobile middleware to a mobile R&D platform, improving user experience and stickiness.

Compared with traditional self-built CI/CD platforms based on open-source solutions such as Jenkins and GitLab Runner, the EMAS one-stop mobile R&D platform has clear advantages in cost, high availability, and technical support. It also covers lifecycle management across application build, testing, release, O&M, and operations, so it has clear advantages in R&D collaboration over self-built CI/CD setups made of siloed ("chimney") open-source systems.

2 Challenges faced by public cloud Mobile DevOps

Compared with Apsara Stack, which is deployed on an internal network and used by internal staff, Mobile DevOps on the public cloud faces more technical challenges, mainly in the following aspects:

Security

1) Tenant isolation

The first problem on the public cloud is tenant isolation. Different customers use shared resources at the same time and must not see each other's data. The build tasks of different customers may affect each other, and the build environment also holds users' private information such as code and certificates. A comprehensive solution must be provided to ensure the isolation of user build environments.

2) Security of private data such as code, certificates, and keys

Builds involve user code, certificates, and keys, all of which are extremely private data. Any problem in how the public cloud stores, transmits, or uses this data may cause significant losses to users.

3) External attacks

Public clouds are exposed to the Internet and can be used by anyone, so they face the risk of malicious attacks. In particular, builds on the cloud involve a large number of custom commands, so a complete mechanism must be provided to prevent attackers from running malicious custom commands and leaving backdoors in the build environment.

High Availability

1) Support for elastic scaling

When the business scale of the public cloud grows, capacity must be expanded quickly to keep up; otherwise, service exceptions may occur. This requires cloud products to follow a distributed architecture in their technical implementation. In particular, the build cluster must be stateless and support rapid scale-out.

2) Build environment stability

The build environment must be stable, and its environment variables and build toolchains must not be damaged by attacks or abnormal use.

3) High-standard SLA: always online, never down

A high-standard SLA is not only a commitment to customers but also a matter of respect for the Alibaba Cloud brand.

Scalability

1) Large differences in build processes caused by diverse application architectures

Apsara Stack has a limited number of customers and comprehensive technical support services for key accounts (KA), so application differences are limited and dedicated personnel can support onboarding. The public cloud, however, has many customers, and the diversity of their application architectures places higher demands on the generality and scalability of the system.

2) Diverse R&D processes

R&D team size, R&D culture, and R&D processes vary from customer to customer on the public cloud. This also places higher demands on the scalability of the Mobile DevOps R&D process.

3 Our solution

In response to the challenges that public cloud Mobile DevOps faces, we address them through technical means in the following two aspects:

Pipeline-based general architecture

The pipeline architecture makes builds generic: the build process is orchestrated and customized on top of the pipeline, and pipeline capabilities are extended through task plug-ins, which solves the scalability problems described above. This architecture has the following features:

Build clusters based on containerization and virtualization

Using containerization (Linux) and virtualization (Mac OS) addresses the security and stability problems caused by resource sharing. Each build task runs in a new container or virtual machine, which is destroyed immediately after the task completes. This effectively isolates the running environments of different tasks and prevents the build environment from being damaged. It also makes it possible to build a stable, stateless containerized/virtualized cluster that ensures the high availability of the build service.

In chapters III and IV below, we elaborate on these two points respectively and explain their design architecture and technical details.

III. General architecture based on pipeline

1 Industry Status

In fact, the industry has quite a few competing products based on pipeline design, especially abroad. Azure DevOps Pipeline and GitHub Actions are two excellent pipeline products; compared with other products, they have many advantages in usability, documentation, and user scale.

Azure DevOps, formerly known as Visual Studio Team Services (VSTS), is a software R&D collaboration platform with a history of more than ten years. Its Azure Pipelines product was released in September 2018 [3]. GitHub Actions, released in August 2019 [4], is a heavyweight product launched after Microsoft acquired GitHub. Both are relatively new platforms; Azure Pipelines is only a little more than two years old.

2 Core concepts of pipeline

Trigger

A trigger actively starts one execution of a pipeline.

Pipeline

A pipeline is the smallest unit that can be triggered. A pipeline can contain one or more jobs.

Job

A Job is the smallest unit to be scheduled. Jobs are of two types: Agent (run on the build cluster) and Agentless (run on the server).

Jobs that do not depend on each other can run in parallel, and jobs with dependencies run in sequence. The relationship between multiple jobs can be represented by a DAG.

Each Job can contain one or more steps.

Step

A Step is the smallest unit to be executed. Each Job consists of one or more steps that are executed in sequence.

Task

A Task is a Task plug-in with predefined specifications and features. It can be declared and referenced in a Step. A Step contains only one Task.
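The relationships among these concepts can be sketched as a small data model. This is only an illustration of the containment rules stated above (a pipeline holds jobs, a job holds steps, a step references exactly one task); the class names, field names, and the "needs" dependency list are assumptions and are not taken from the EMAS pipeline DSL, which the article does not show.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Step:
    name: str
    task: str  # a Step references exactly one Task plug-in by name
    inputs: Dict[str, str] = field(default_factory=dict)


@dataclass
class Job:
    name: str
    kind: str  # "agent" (build cluster) or "agentless" (server)
    steps: List[Step]
    needs: List[str] = field(default_factory=list)  # names of jobs this job depends on


@dataclass
class Pipeline:
    name: str
    trigger: str  # "manual", "timer", or "event"
    jobs: List[Job]

    def validate(self) -> None:
        """Check the structural rules described in the text; raise ValueError on violation."""
        if self.trigger not in ("manual", "timer", "event"):
            raise ValueError(f"unknown trigger mode: {self.trigger!r}")
        if not self.jobs:
            raise ValueError("a pipeline must contain at least one job")
        names = {job.name for job in self.jobs}
        if len(names) != len(self.jobs):
            raise ValueError("job names must be unique")
        for job in self.jobs:
            if job.kind not in ("agent", "agentless"):
                raise ValueError(f"job {job.name!r} has unknown kind {job.kind!r}")
            if not job.steps:
                raise ValueError(f"job {job.name!r} must contain at least one step")
            for dep in job.needs:
                if dep not in names:
                    raise ValueError(f"job {job.name!r} needs unknown job {dep!r}")
```

A pipeline with an Android build job followed by a notification job would then be one `Pipeline` holding two `Job` objects, the second listing the first in `needs`.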

3 Technical architecture of pipeline

The pipeline consists of the following core systems:

Pipeline process engine

The pipeline process engine is responsible for triggering, orchestrating, and executing the state transitions of pipelines, and for maintaining pipeline metadata.

1) Pipeline trigger module

The trigger module starts the execution of a pipeline. It supports three trigger modes: manual, timer, and event (git event, webhook callback). Triggers are the only entry point for pipeline execution. At this level, the caller can be verified, and different trigger parameters can be passed in to control pipeline execution and scheduling.

2) Pipeline orchestration module

Pipeline orchestration defines a DSL for describing pipelines. With this DSL, a pipeline that can be scheduled and executed can be defined precisely.

3) Pipeline execution module

The pipeline execution module ensures that all jobs in the pipeline are executed in parallel or in sequence according to their dependencies, and it updates the pipeline status in real time.
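Ordering jobs by their dependencies is a topological sort of the job DAG. The following sketch groups jobs into stages so that all jobs in one stage can run in parallel once the previous stage has finished. It illustrates the general technique only and is not the EMAS execution module; the function name and the shape of the `needs` mapping are assumptions.

```python
from typing import Dict, List


def execution_stages(needs: Dict[str, List[str]]) -> List[List[str]]:
    """Group jobs into stages; jobs within one stage have no unmet dependencies.

    `needs` maps each job name to the names of the jobs it depends on.
    Raises ValueError for an unknown dependency or a dependency cycle.
    """
    for job, deps in needs.items():
        for dep in deps:
            if dep not in needs:
                raise ValueError(f"job {job!r} depends on unknown job {dep!r}")

    remaining = {job: set(deps) for job, deps in needs.items()}
    stages: List[List[str]] = []
    while remaining:
        # Jobs whose dependencies have all completed are ready to run now.
        ready = sorted(job for job, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("dependency cycle among: " + ", ".join(sorted(remaining)))
        stages.append(ready)
        for job in ready:
            del remaining[job]
        for deps in remaining.values():
            deps.difference_update(ready)
    return stages
```

For example, if `test` needs both `build_android` and `build_ios`, and `release` needs `test`, the two builds form the first stage, `test` the second, and `release` the third.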

Job scheduling engine

A Job is the smallest unit to be scheduled in a pipeline. The Job scheduling engine is responsible for scheduling each Job generated by the pipeline process engine to the correct cluster machine.

Integration Engine

There are two types of task plug-ins in the pipeline. The first type is Agent tasks, such as Android and iOS builds. These tasks require a specific build environment, so the Job scheduling engine schedules them to build machines. The second type is Agentless tasks, such as approval, notification, and external system calls. These tasks can be completed on an ordinary server without consuming valuable build resources, so the Job scheduling engine schedules them to run on the integration engine. Most Agentless tasks are related to integration with external services.
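The split between Agent and Agentless tasks amounts to a routing decision in the scheduler. The sketch below shows one way such routing could look; the task type names, the queue names, and the `JobScheduler` class are hypothetical and only illustrate the idea of sending build tasks to build clusters and everything else to the integration engine.

```python
from collections import deque
from typing import Deque, Dict

# Hypothetical task types and scheduling targets, chosen only for illustration.
ROUTES: Dict[str, str] = {
    "android_build": "linux_build_cluster",   # Agent task
    "ios_build": "mac_build_cluster",         # Agent task
    "approval": "integration_engine",         # Agentless task
    "notification": "integration_engine",     # Agentless task
    "external_call": "integration_engine",    # Agentless task
}


class JobScheduler:
    """Places each job on the queue of the target that can run its task type."""

    def __init__(self) -> None:
        self.queues: Dict[str, Deque[str]] = {
            target: deque() for target in set(ROUTES.values())
        }

    def schedule(self, job_id: str, task_type: str) -> str:
        """Enqueue the job and return the name of the target it was routed to."""
        if task_type not in ROUTES:
            raise ValueError(f"unknown task type: {task_type!r}")
        target = ROUTES[task_type]
        self.queues[target].append(job_id)
        return target
```

Keeping Agentless work off the build clusters means that approvals or notifications never occupy a container or a Mac virtual machine.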

Channel service

The Channel service implements the communication link and protocol between the build cluster and the server. It provides the following features:

1) Unified authentication of build cluster requests

For security reasons, the build cluster is in a different VPC from the other microservices. With the networks completely isolated, the build cluster cannot directly access the server intranet. As a result, the build cluster in the preceding pipeline architecture diagram accesses the server through HTTPS requests over the Internet, and requests from build machines must be authenticated. The Channel service is the server-side entry point that performs this authentication.

2) Unified entry point for build cluster requests

The build cluster needs to exchange heartbeats, status reports, task pulls, and task execution status reports with the server in real time. The Channel service is the single entry point for these requests and distributes requests from different businesses to different microservices.
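The article does not say how the Channel service authenticates build machines. As one possible approach, the sketch below assumes each build agent shares a secret with the server and signs every request with HMAC-SHA256 over its ID, a timestamp, and the body; the server verifies the signature and then dispatches by request type. The function names, the message layout, the 300-second clock-skew window, and the microservice names are all assumptions.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical mapping from request type to the microservice that handles it.
DISPATCH = {
    "heartbeat": "agent-registry",
    "pull_task": "job-scheduler",
    "report_status": "pipeline-engine",
}


def sign_request(secret: bytes, agent_id: str, timestamp: int, body: bytes) -> str:
    """Compute the hex HMAC-SHA256 signature a build agent would attach to a request."""
    message = f"{agent_id}\n{timestamp}\n".encode("utf-8") + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_request(secret: bytes, agent_id: str, timestamp: int, body: bytes,
                   signature: str, now: Optional[int] = None, max_skew: int = 300) -> bool:
    """Reject stale requests, then compare signatures in constant time."""
    current = int(time.time()) if now is None else now
    if abs(current - timestamp) > max_skew:
        return False
    expected = sign_request(secret, agent_id, timestamp, body)
    return hmac.compare_digest(expected, signature)


def route(request_type: str) -> str:
    """Return the microservice that should receive an authenticated request."""
    if request_type not in DISPATCH:
        raise ValueError(f"unsupported request type: {request_type!r}")
    return DISPATCH[request_type]
```

Authenticating and routing in one place keeps the build cluster's view of the server down to a single HTTPS endpoint, which matches the network isolation described above.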

Build cluster

The build cluster is mainly responsible for pulling and executing Agent build tasks. The services running in the build cluster start an isolated build environment that matches the task type:

1) Start a Docker container on Linux

Android builds run on Linux, where Docker containers are the best choice for environment isolation. Containers are started on ACK serverless (Alibaba Cloud's public cloud K8S product) and are automatically destroyed and reclaimed when the build completes. Cloud-native ACK serverless maximizes the elasticity of the build cluster and consumes no computing resources when no builds are running, which keeps build costs well under control.

2) Start a virtual machine on the Mac OS platform

Because of the limitations of Apple's ecosystem, iOS and Mac apps can only be built on Mac OS. Mac OS currently has no mature Docker-like container solution, so we implement environment isolation through virtualization. We have built a cloud-based Mac virtualization cluster that pools physical Mac resources and can scale the cluster up and down quickly, which fully complies with cloud-native concepts. For each build, a virtual machine is dynamically created from the virtual cluster, and it is destroyed immediately after the build completes.
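The create-per-build, destroy-after-build lifecycle maps naturally onto a context manager. The sketch below uses an in-memory stand-in for the virtualization cluster; `VmPool`, `ephemeral_vm`, and `run_build` are hypothetical names, and no real hypervisor or container API is called.

```python
import itertools
from contextlib import contextmanager
from typing import Callable, Iterator, Set


class VmPool:
    """In-memory stand-in for a virtualization cluster (illustration only)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.active: Set[str] = set()
        self._ids = itertools.count(1)

    def create(self) -> str:
        if len(self.active) >= self.capacity:
            raise RuntimeError("no capacity left in the pool")
        vm_id = f"vm-{next(self._ids)}"
        self.active.add(vm_id)
        return vm_id

    def destroy(self, vm_id: str) -> None:
        self.active.discard(vm_id)


@contextmanager
def ephemeral_vm(pool: VmPool) -> Iterator[str]:
    """Create a fresh VM for one build and destroy it afterwards, even on failure."""
    vm_id = pool.create()
    try:
        yield vm_id
    finally:
        pool.destroy(vm_id)


def run_build(pool: VmPool, build: Callable[[str], str]) -> str:
    """Run one build task inside its own short-lived VM."""
    with ephemeral_vm(pool) as vm_id:
        return build(vm_id)
```

Because destruction happens in the `finally` clause, a failed or malicious build cannot leave state behind for the next task, which is the isolation property the text relies on.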

It is worth mentioning that the Mac virtualization cluster is one of our technical advantages. In chapter IV below, we describe in detail the practice of Mobile DevOps with Mac virtualization clusters.

IV. Building clusters through Mac virtualization

At present, the Mac virtualization cluster build solution of Mobile DevOps is in a leading position in China. We may be the first DevOps platform in China to support iOS builds based on Mac virtualization technology, and almost no other domestic vendors support iOS builds. The fundamental reason is the constraints of Mac virtualization technology: traditional builds on bare-metal Mac hardware can only be used in internal environments and cannot be offered as a public cloud service at all. The Mac virtualization cluster build solution is a technical advantage of Mobile DevOps.

1 Selecting a virtualization solution

Because of kernel limitations of the Mac OS platform itself, containerization on Mac OS is still extremely immature, so virtualization is essentially the only way to isolate environments on Mac OS.

Selection of virtualization types

The two types of virtualization solutions are shown in the following figure. Both are implemented based on a hypervisor:

Because Mobile DevOps is a cloud product that serves the public, solution 1 is chosen since it uses resources more efficiently. Hardware compatibility problems can be avoided by choosing a production-grade solution and suitable hardware.

2 Cloud architecture virtualization cluster

To provide public build services on the cloud, a virtualization solution alone is not enough. A virtualized cluster solution that conforms to the cloud architecture is also needed to meet the requirements of Mobile DevOps for build clusters:

  • Mac hardware resource pool: each Mac resource in the cluster should be stateless. All Mac hardware resources together form a resource pool that the cluster can allocate and schedule in a unified manner.
  • Elastic scaling: the scale of public cloud services changes over time, so the virtualized cluster must adapt to business scenarios and scale out and in quickly to keep pace with business growth.
  • High availability: when some Mac hardware devices fail, the cluster can respond quickly and automatically by reassigning tasks to new virtual machines, which improves the success rate of task execution.
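The three requirements above (pooling, elastic scaling, and reassignment on failure) can be illustrated with a toy scheduler. This is a sketch under simplifying assumptions, not the EMAS implementation: each host exposes a fixed number of virtual machine slots, tasks go to the host with the most free slots, and tasks on a failed host are reassigned or queued until capacity is added. All names are hypothetical.

```python
from typing import Dict, List, Optional


class MacResourcePool:
    """Toy model of a pooled, stateless set of Mac hosts (illustration only)."""

    def __init__(self, hosts: Dict[str, int]) -> None:
        self.free_slots: Dict[str, int] = dict(hosts)  # host -> free VM slots
        self.assignments: Dict[str, str] = {}          # task -> host
        self.pending: List[str] = []                   # tasks waiting for capacity

    def assign(self, task_id: str) -> Optional[str]:
        """Place a task on the host with the most free slots, or queue it."""
        candidates = [h for h, free in self.free_slots.items() if free > 0]
        if not candidates:
            self.pending.append(task_id)
            return None
        host = min(candidates, key=lambda h: (-self.free_slots[h], h))
        self.free_slots[host] -= 1
        self.assignments[task_id] = host
        return host

    def release(self, task_id: str) -> None:
        """Free the slot of a finished task."""
        host = self.assignments.pop(task_id)
        if host in self.free_slots:
            self.free_slots[host] += 1

    def add_host(self, host: str, slots: int) -> None:
        """Scale out: add a host, then retry tasks that were waiting for capacity."""
        self.free_slots[host] = slots
        waiting, self.pending = self.pending, []
        for task_id in waiting:
            self.assign(task_id)

    def fail_host(self, host: str) -> Dict[str, Optional[str]]:
        """Remove a failed host and reassign its tasks; returns task -> new host (or None)."""
        self.free_slots.pop(host, None)
        orphaned = [t for t, h in self.assignments.items() if h == host]
        moved: Dict[str, Optional[str]] = {}
        for task_id in orphaned:
            del self.assignments[task_id]
            moved[task_id] = self.assign(task_id)
        return moved
```

Because hosts are stateless in this model, a task carries nothing that ties it to a particular machine, so reassigning it after a failure is just another call to `assign`.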

Moving from a single virtual machine to a virtual machine cluster requires, in addition to the Mac hardware resource pooling described above, solving the distributed storage and distributed network problems that clustering introduces. The following figure shows the move from a single virtual machine to a virtual cluster:

V. Future Outlook

Currently, public cloud Mobile DevOps is still in the beta phase, and many aspects still need work:

  • Intelligent analysis and prompting of build errors. With a large number of public cloud users, answering questions about build errors is a huge labor cost. In the future, technical means such as keyword matching, big data analysis, and even AI-based automatic error classification can directly indicate the cause of a build error and reduce the cost of manual Q&A.
  • More collaboration with other EMAS products, so that Mobile DevOps connects the entire application R&D lifecycle.
  • Better affinity with the community. Support migrating pipelines from GitHub Actions, Azure Pipelines, and other platforms to Mobile DevOps, and let task plug-ins directly support the more than 5,000 open-source plug-ins of GitHub Actions, so that users benefit from the open-source community.
  • A stronger ability to be integrated, so that the Mobile DevOps mobile R&D platform fits better into customers' existing R&D processes.
  • Faster application compilation and builds. The ultimate goal is for building an application on the cloud to take significantly less time than building it locally, so that developers directly feel the advantage of building on the cloud.
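The keyword-matching idea in the first item can be sketched as an ordered list of rules applied to a build log. The categories and patterns below are illustrative assumptions only, and a real rule set would be derived from actual build failures.

```python
import re

# Ordered rules: more specific categories come first. Patterns are illustrative only.
RULES = [
    ("out_of_memory", re.compile(r"OutOfMemoryError|out of memory", re.IGNORECASE)),
    ("signing", re.compile(r"provisioning profile|code ?sign|keystore", re.IGNORECASE)),
    ("dependency", re.compile(r"could not resolve|unable to find a specification", re.IGNORECASE)),
    ("compile", re.compile(r"compilation failed|error:", re.IGNORECASE)),
]


def classify_build_log(log: str) -> str:
    """Return the category of the first rule that matches anywhere in the log."""
    for category, pattern in RULES:
        if pattern.search(log):
            return category
    return "unknown"
```

Logs that fall into "unknown" are the ones that would still need manual Q&A, and they are also the natural input for the big data or AI-based classification mentioned above.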

References
[1] Azure DevOps: What Is DevOps? https://azure.microsoft.com/zh-cn/overview/what-is-devops/
[2] Baidu Statistics Traffic Research Institute. https://tongji.baidu.com/research/app
[3] Microsoft releases Azure Pipelines; open-source projects can use CI/CD without restrictions. https://www.infoq.cn/article/2018/09/microsoft-azure-pipelines
[4] Free for all open-source projects: GitHub's built-in CI/CD is finally here! https://www.infoq.cn/article/D0mTaPbgpBHF3r-Cuvf3
