By Jiao Ba, Senior Technical Expert at Alibaba
The following is an abridged record of the author's sharing session.
Software R&D teams face two core problems when working from home: team communication and the R&D process. Members of the Apsara DevOps team are distributed across multiple cities and were already used to holding online meetings, so communication and coordination across the team were less affected when they began to work from home. However, communication problems remained within subteams. When subteam members work in the office, they can handle issues efficiently through face-to-face communication, which is not possible remotely. After more than 10 days of adjustment, we gradually solved this problem and improved communication efficiency. This article uses the Apsara DevOps team as an example to demonstrate the differences between working in the office and working from home.
The issues discussed in the morning meetings and weekly meetings were basically unchanged; only the format of the meetings changed. We communicate face to face when working in the office, whereas we communicate through teleconferences and video conferences when working from home. In addition, to improve communication efficiency, we needed to synchronize personal work items and clarify the subject of each meeting in advance. When working from home, in addition to the weekly report, we use a daily report to disclose the progress of personal work and possible risks before leaving work every day.
In terms of the R&D process, if your team has not adopted a standardized online or GUI-based R&D process, it may face challenges. Problems or failures in the release process may be magnified because it is less convenient to find somebody to help you when you work from home. Alibaba has always done a good job in the R&D process. We use the Aone (internal name for Apsara DevOps) platform to support the entire R&D process, including development, building, deployment, and secure production.
When working from home, we solve the core problems of team communication and the R&D process using agile R&D and continuous delivery. Next, we will introduce in detail how we solve the problems.
Agile R&D is a mature methodology. As the saying goes, there are a thousand Hamlets in a thousand people's eyes. Each R&D team should combine the theory with its own practice and find the set of methods and mechanisms most suitable for the team. Based on the practice of the Apsara DevOps team, we believe that agile R&D should focus on iteration, the key to which is asynchronous communication.
So, what do we mean by that? Iteration is a breakdown of the long-term or ultimate goals. When a big goal is broken down into small ones, the team becomes more aware of these small goals. With iteration at the center, we further break down these small goals into work items, or cards on the Kanban board, and distribute them to specific task owners. This forms the basis of asynchronous communication.
What are the advantages of asynchronous communication over synchronous communication? Firstly, context can be accumulated in asynchronous communication. In asynchronous communication, the content of our communication will be recorded on each specific work item. However, such content is simply communicated verbally in synchronous communication. Secondly, asynchronous communication also allows team members to clearly define their goals, keep them focused, and reduce interruptions, thus improving efficiency.
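The idea that context accumulates on each work item can be illustrated with a small data structure. This is only a sketch under stated assumptions: the `WorkItem` and `Comment` names and their fields are hypothetical and are not taken from Aone or any Kanban product.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    """One asynchronous message attached to a work item (hypothetical model)."""
    author: str
    text: str


@dataclass
class WorkItem:
    """A Kanban card with an owner and the discussion recorded on it."""
    title: str
    owner: str
    comments: List[Comment] = field(default_factory=list)

    def discuss(self, author: str, text: str) -> None:
        # Every exchange is stored on the card instead of being lost after a call.
        self.comments.append(Comment(author, text))

    def context(self) -> str:
        # Anyone picking up the card later can read the full history in order.
        return "\n".join(f"{c.author}: {c.text}" for c in self.comments)


item = WorkItem(title="Fix login timeout", owner="alice")
item.discuss("bob", "Reproduced on staging")
item.discuss("alice", "Patch in review")
```

Because the discussion lives on the card, a teammate who was not part of the original exchange can call `item.context()` and catch up without interrupting anyone.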
The Apsara DevOps team divides work items into three categories: daily defects, project requirements, and product requirements. Daily defects involve maintenance of products that have been launched, and the defects come from user feedback or self-testing. Project requirements are relatively complex. They feature a long delivery cycle and a specific delivery time and come from the enterprise customers or internal requirements of an enterprise. Product requirements are public-oriented in most cases and need to evolve continuously. The following describes how the Apsara DevOps team handles the three categories of work items.
How do we deal with daily defects? First, the Apsara DevOps team divides daily defects into four subcategories: emergency defects that must be fixed immediately, defects that must be fixed in one week, defects that must be fixed in two weeks, and defects that do not need to be fixed. Why do we use these subcategories? This is because defects are more specific and have fewer changes than requirements. We can basically confirm the time required to fix the defects when we are assigned to handle them.
Second, defects do not occupy story points, which means that you are expected to fix the defects in your spare time, and the time required to fix the defects will not be included in your normal working hours. This actually establishes a positive incentive mechanism. Work items like tasks and requirements will bring certain defects. If your work items are completed with higher quality, you will have fewer defects, and vice versa. This mechanism encourages everyone to do their best to complete their work items.
Third, defects are not reviewed in the morning meetings. In the weekly meeting, however, the defect handling progress is reviewed, and new defects are categorized and assigned.
This mechanism is simple and easy to implement. Since the Apsara DevOps team adopted it to handle daily defects, the quality of our products has greatly improved and is better guaranteed.
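The four defect subcategories described above map naturally onto a small enumeration with a fix-by date. The sketch below is a hypothetical illustration, not Aone code: the names `DefectPriority` and `fix_deadline` are assumptions, and "immediately" is modeled as a same-day deadline.

```python
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class DefectPriority(Enum):
    """Hypothetical names for the four defect subcategories."""
    EMERGENCY = "fix immediately"
    ONE_WEEK = "fix within one week"
    TWO_WEEKS = "fix within two weeks"
    WONT_FIX = "no fix needed"


# Days allowed per subcategory; None means no deadline applies.
_DAYS_ALLOWED = {
    DefectPriority.EMERGENCY: 0,  # assumption: "immediately" means the same day
    DefectPriority.ONE_WEEK: 7,
    DefectPriority.TWO_WEEKS: 14,
    DefectPriority.WONT_FIX: None,
}


def fix_deadline(priority: DefectPriority, reported: date) -> Optional[date]:
    """Return the date by which a defect must be fixed, or None if no fix is needed."""
    days = _DAYS_ALLOWED[priority]
    if days is None:
        return None
    return reported + timedelta(days=days)
```

Since the time to fix a defect is usually known at assignment, a fixed deadline per subcategory is enough, and the weekly meeting only has to compare each open defect against its date.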
How do we deal with project requirements? Generally, a project requirement has a specified deadline and is clearly defined. Based on the deadline, we can determine the key time points and define the milestones, which allow us to better control risks. Note that the content of milestones must be quantifiable and observable. Then, we iterate based on the milestones, clarifying requirements and assessing story points before each iteration starts. This is consistent with the agile R&D methodology. A team should also cultivate its members' all-round skills as far as possible. In this way, the work items or cards that we break down from iterations can be assigned to any developer instead of being bound to a specific person. This prevents technical bottlenecks that arise when only one developer has the skills a card requires.
How do we deal with product requirements? Product requirements are handled in a similar way to project requirements. For both, we need to clarify requirements, assess story points, and then take over task cards and raise risk warnings in a stand-up meeting. The main difference is that the iteration cycle of a product requirement is relatively fixed, which benefits the long-term stability of products. If the iteration cycle varies greatly in length, the number of cards processed by developers in the same R&D cycle differs a lot, which may cause differences in delivery quality. Another difference is that the iteration goal of a product requirement is generally based on feedback from users, markets, and data, which leads to some uncertainty. Therefore, we need to clarify and review these requirements in more detail.
The Apsara DevOps team has obtained good results in practicing agile R&D, and team members also have gotten a great sense of accomplishment. In our experience, it is very important to find the rhythm of the team. We hope that you find the rhythm of your own team when practicing agile R&D, and find an agile R&D mechanism that best suits your team.
This section describes how the Apsara DevOps team standardizes the R&D process through continuous delivery. Continuous integration and continuous delivery are implemented through the pipeline. In addition to the pipeline, we will introduce some distinctive practices adopted by the Apsara DevOps team, including the test environment (isolation of the development environment from the test environment under the microservice architecture to realize cloud-based development), branch management (process-based management on code branches and static configuration items under multi-person R&D collaboration), and production security (software delivery assurance, process standardization, and traceable delivery).
The first is the test environment. Let's first take a look at the background of this solution. In the past, we developed large monolithic applications. With the evolution of the microservice architecture, these applications began to be split into many small applications. While the microservice architecture brings us benefits, it also creates new challenges for the development process. As the number of applications continues to increase, the application call chain becomes very long. The overall development resources are limited and unstable, which makes the entire development and debugging process more difficult. However, you still need an exclusive environment during development. How can we solve this problem?
The first method is to pull up all applications in the exclusive environment. This method has some drawbacks. First, as the number of applications increases, it is impossible for every application developer to pull up all the applications in the entire environment when the development resources are limited. Second, as the number of applications increases, the entire microservice architecture becomes hard to describe, and it is difficult to pull up all applications even once.
The second method is now widely used. We first establish some common basic environments, such as the test environment and the pre-release environment. When we need to develop software, we can start a service or application locally and conduct joint debugging with the common basic environments. This method also has drawbacks. First, when developing a feature, you may need to modify more than one application and deploy these applications in the common basic environment. However, the services or applications are unstable in the development process, which in turn causes instability of the common basic environments. In addition, this operation results in the preemption of common basic environments. Such preemption makes the common basic environments bottlenecks in the development process, which greatly affects the development efficiency.
Based on our preceding experience and practice, Alibaba has designed an isolated environment solution. As shown in the preceding figure, when you need to develop some features, you do not have to deploy applications or services in the common basic environment. Instead, you only pull up some resources for your feature development and deployment, connect this feature environment with the common basic environment, and properly isolate the two environments from each other. The same resources can be shared, but requests are isolated from each other. This design has two advantages. First, you will not occupy a lot of development resources. Second, the stability of the common basic environment is not affected.
The feature environment is a virtual environment. It appears that each feature environment is an independent and complete test environment that consists of a cluster of services. In fact, except for the services that the current user wants to test, other services are virtualized through the routing system and message-oriented middleware and are directed to the corresponding services in the common basic environment.
This testing technology has evolved over several generations at Alibaba. At the beginning, the middleware used for microservices and message queues needed to be modified to support this type of isolation mechanism. Later, with the development of cloud-native technology, we began using the capabilities of Service Mesh for isolation. In addition, we have developed a product, KT Virtual Environment, which is now open source. We welcome your feedback about any defects this product may have.
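The fallback routing behind a feature environment can be sketched as follows. This is a minimal illustration, assuming each request carries an environment tag. The `EnvRouter` class, the `base` environment name, and the addresses are all hypothetical, and the sketch does not reflect how the actual middleware, Service Mesh, or KT Virtual Environment implements isolation.

```python
from typing import Dict, Optional

BASE_ENV = "base"  # hypothetical name for the common basic environment


class EnvRouter:
    """Resolves a service call, preferring the caller's feature environment."""

    def __init__(self) -> None:
        # registry[environment][service] -> address
        self._registry: Dict[str, Dict[str, str]] = {BASE_ENV: {}}

    def register(self, env: str, service: str, address: str) -> None:
        self._registry.setdefault(env, {})[service] = address

    def resolve(self, service: str, env_tag: Optional[str] = None) -> str:
        """Return the address for `service` given the tag carried by the request."""
        if env_tag is not None:
            address = self._registry.get(env_tag, {}).get(service)
            if address is not None:
                # The service under test is deployed in the feature environment.
                return address
        # Everything else is served by the shared base environment.
        return self._registry[BASE_ENV][service]


router = EnvRouter()
router.register(BASE_ENV, "order", "order.base:8080")
router.register(BASE_ENV, "payment", "payment.base:8080")
router.register("feature-a", "payment", "payment.feature-a:8080")
```

In this sketch, a request tagged `feature-a` reaches the developer's own `payment` instance but still uses the shared `order` service, so only the service under test consumes extra resources, and untagged traffic in the base environment never touches the unstable build.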
Next is branch management. The Apsara DevOps team and Alibaba's internal R&D teams mostly adopt a branch management model named AoneFlow. This model grew out of years of accumulated experience in practical application. It manages feature branches and static configuration items by changing the model. Merges of, and conflicts between, code branches and static configuration items are all handled through a GUI. Unlike common fixed branch management models, AoneFlow provides dynamic release branches that allow features to be combined flexibly and added or removed quickly.
Why should we use dynamic release branches? First, the entire integration verification becomes extremely difficult under the microservice architecture as compared with traditional giant applications. You need to perform integration verification with other applications in a common environment. Even if your code runs correctly during standalone verification, it is difficult to ensure that your feature branch is reliable when it undergoes integration verification together with other applications. If a problem occurs, your feature branch must be removed from the release branch. Second, when multiple people collaborate to develop a code branch, it is difficult to ensure that no problems occur when your code is integrated with the code of other people. In addition, the higher the release frequency, the greater the instability. In particular, internet enterprises feature a fast iteration rate and a high development frequency, and accordingly, the possibility of conflict is also very high. In this case, it is difficult for you to ensure that your code branches are reliable during integration verification. Both cases require that the code branch can be added or removed quickly.
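One way to picture a dynamic release branch is as a set of feature branches that is re-composed whenever the set changes. The sketch below models only that bookkeeping and runs no version control commands. The `ReleaseComposer` class, the branch naming scheme, and the assumption that a fresh release branch is cut from the trunk on every change are illustrative and are not confirmed details of AoneFlow.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReleaseBranch:
    """A release branch modeled as trunk plus an ordered list of merged features."""
    name: str
    features: List[str] = field(default_factory=list)


class ReleaseComposer:
    """Hypothetical helper that rebuilds the release branch when its features change."""

    def __init__(self, prefix: str = "release") -> None:
        self._prefix = prefix
        self._version = 0
        self.current = self._rebuild([])

    def _rebuild(self, features: List[str]) -> ReleaseBranch:
        # Assumption: a fresh branch is cut from trunk and the features are merged
        # again, so a removed feature leaves no trace in the new branch.
        self._version += 1
        return ReleaseBranch(f"{self._prefix}/{self._version}", list(features))

    def add_feature(self, feature: str) -> ReleaseBranch:
        if feature not in self.current.features:
            self.current = self._rebuild(self.current.features + [feature])
        return self.current

    def remove_feature(self, feature: str) -> ReleaseBranch:
        if feature in self.current.features:
            remaining = [f for f in self.current.features if f != feature]
            self.current = self._rebuild(remaining)
        return self.current


composer = ReleaseComposer()
composer.add_feature("feature/a")
composer.add_feature("feature/b")
branch = composer.remove_feature("feature/a")  # feature/a failed integration
```

Rebuilding instead of reverting keeps removal cheap in this model: dropping a feature that fails integration verification is the same operation as composing a release without it.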
The last one we would like to share is production security. In the test environment and branch management, we focus on improving the efficiency of continuous delivery. A more important task is to ensure the delivery quality so that the release process is free of faults.
First, we need to establish a series of security mechanisms, such as security scanning and code review, to shift testing earlier so that defects can be detected in the development phase. Second, these mechanisms cannot be mere verbal conventions; we need effective tools to enforce them. The Apsara DevOps team turns all these mechanisms into checkpoints or red lines, integrates them into the R&D process, and implements them through the Apsara DevOps pipeline. In addition, to balance efficiency, the Apsara DevOps team imposes quality requirements on increments by setting up unit tests, static code scans, integration testing, coverage rates, and other quality checkpoints or red lines for increments. Third, the objective is to achieve global control through manual reviews and release control. Human participation and the aforementioned technical measures complement each other to ensure the security of production.
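Checkpoints of this kind can be expressed as predicates that a pipeline evaluates before allowing a release. The following is a rough sketch, not the Apsara DevOps pipeline API: the `Checkpoint` class, the metric names, and the 70% incremental coverage threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Metrics = Dict[str, float]


@dataclass
class Checkpoint:
    """A named red line: the release is blocked unless `passes` returns True."""
    name: str
    passes: Callable[[Metrics], bool]


def evaluate(checkpoints: List[Checkpoint], metrics: Metrics) -> List[str]:
    """Return the names of failed checkpoints; an empty list means no red line was hit."""
    return [c.name for c in checkpoints if not c.passes(metrics)]


# Hypothetical red lines; a missing metric counts as a failure.
CHECKPOINTS = [
    Checkpoint("unit tests", lambda m: m.get("unit_test_failures", 1) == 0),
    Checkpoint("static scan", lambda m: m.get("blocker_issues", 1) == 0),
    Checkpoint("incremental coverage",
               lambda m: m.get("incremental_coverage", 0.0) >= 0.7),
]

failed = evaluate(CHECKPOINTS, {
    "unit_test_failures": 0,
    "blocker_issues": 0,
    "incremental_coverage": 0.65,
})
```

Here `failed` names the coverage checkpoint, so the pipeline would stop before release. Treating a missing metric as a failure means a skipped step cannot silently pass a red line.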
Soon, a new version of Alibaba Cloud Apsara DevOps will be launched and will bring all-new product features and user experience. This is a product that we developed together with many small and medium-sized enterprise developers after listening to the feedback of developers from various channels.
Apsara DevOps is an enterprise-level all-in-one DevOps platform based on Alibaba's advanced management concepts and engineering practices. It aims to be the R&D efficiency engine for digital enterprises. Apsara DevOps provides end-to-end online collaboration services and R&D tools covering the entire lifecycle, from requirement collection to product development, testing, release, maintenance, and operations. Using artificial intelligence and cloud-native technologies, developers can improve their R&D efficiency and continue to deliver valuable products to customers.