This article is from the Alibaba DevOps Practice Guide, written by the Alibaba Cloud Yunxiao Team.
As a business grows, services multiply, applications become more complex, and testing becomes harder to sustain. On the one hand, the number of test cases keeps increasing, so a single CI verification run takes more than ten minutes, and the rising maintenance cost of test cases drags down development efficiency. On the other hand, having invested heavily in writing automated test cases, we want a better input-output ratio and more effective tests.
The core idea of distributed testing is to execute test cases concurrently by adding computing resources, and then to parse and merge the structured test results that each execution generates. This shortens the time of a single test run.
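The split, execute, and merge idea can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the Yunxiao implementation: the class `ShardedTestRun`, its method names, and the rule that a test "fails" when its name contains `Broken` are all hypothetical, and a local thread pool stands in for separate execution machines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ShardedTestRun {

    // Round-robin split so every shard receives a similar number of test cases.
    static List<List<String>> splitIntoShards(List<String> testNames, int shardCount) {
        List<List<String>> shards = new ArrayList<>();
        for (int i = 0; i < shardCount; i++) {
            shards.add(new ArrayList<>());
        }
        for (int i = 0; i < testNames.size(); i++) {
            shards.get(i % shardCount).add(testNames.get(i));
        }
        return shards;
    }

    // Stand-in for one worker. Hypothetical rule for this demo only:
    // a test case "fails" when its name contains "Broken".
    static List<String> runShard(List<String> shard) {
        List<String> failed = new ArrayList<>();
        for (String name : shard) {
            if (name.contains("Broken")) {
                failed.add(name);
            }
        }
        return failed;
    }

    // Runs all shards concurrently and merges the failures into one sorted list.
    static List<String> runDistributed(List<String> testNames, int shardCount) {
        ExecutorService pool = Executors.newFixedThreadPool(shardCount);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (List<String> shard : splitIntoShards(testNames, shardCount)) {
                futures.add(pool.submit(() -> runShard(shard)));
            }
            TreeSet<String> merged = new TreeSet<>();
            for (Future<List<String>> future : futures) {
                merged.addAll(future.get());
            }
            return new ArrayList<>(merged);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for shards", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a shard failed to execute", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

In a real setup each shard would run on its own machine or container and report a structured result file; only the merge step would run centrally.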
The entire test execution process can be divided into the following three phases:
Let's use a project from the Alibaba Cloud Core Team as an example. The project has more than 10,000 unit test cases. Before the distributed testing solution was adopted, a CI verification task took more than four hours, which delayed the discovery and fixing of defects and slowed daily iteration. After the team adopted distributed testing, the average execution time dropped to less than half an hour.
Distributed testing trades additional execution resources for faster execution. In theory, we could split out every test case and run each one in its own container to get the fastest feedback. However, not all scenarios are suitable for distributed testing. For example, if dependencies exist between test cases, these test cases cannot simply be distributed to different execution groups.
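One way to respect such dependencies, offered here as an assumption rather than as the approach the article's team used, is to merge dependent test cases into groups first and then distribute whole groups. The class `DependencyGrouping` and its methods are hypothetical names for this sketch, and every name used in a dependency pair is assumed to appear in the test list.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DependencyGrouping {

    // Follows parent links until it reaches the representative of a group.
    static String find(Map<String, String> parent, String name) {
        while (!parent.get(name).equals(name)) {
            name = parent.get(name);
        }
        return name;
    }

    // dependencies holds {test, prerequisite} pairs; both names must appear in testNames.
    static List<List<String>> groupDependentTests(List<String> testNames, List<String[]> dependencies) {
        Map<String, String> parent = new LinkedHashMap<>();
        for (String name : testNames) {
            parent.put(name, name);
        }
        for (String[] pair : dependencies) {
            // Link the two groups so the pair ends up in the same group.
            parent.put(find(parent, pair[0]), find(parent, pair[1]));
        }
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String name : testNames) {
            groups.computeIfAbsent(find(parent, name), key -> new ArrayList<>()).add(name);
        }
        return new ArrayList<>(groups.values());
    }

    // Greedy placement: largest group first, always into the currently smallest shard.
    static List<List<String>> assignGroups(List<List<String>> groups, int shardCount) {
        List<List<String>> shards = new ArrayList<>();
        for (int i = 0; i < shardCount; i++) {
            shards.add(new ArrayList<>());
        }
        List<List<String>> ordered = new ArrayList<>(groups);
        ordered.sort(Comparator.comparingInt((List<String> g) -> g.size()).reversed());
        for (List<String> group : ordered) {
            List<String> smallest = shards.get(0);
            for (List<String> shard : shards) {
                if (shard.size() < smallest.size()) {
                    smallest = shard;
                }
            }
            smallest.addAll(group);
        }
        return shards;
    }
}
```

The trade-off is visible in the output: a long dependency chain forms one large group that limits how evenly the work can be spread, which is one reason dependent test cases benefit less from distribution.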
Distributed testing largely improves the test execution speed. However, if all test cases are executed without distinction under any circumstances, the following problems may occur:
In the process of finding a solution to improve test effectiveness, we introduced the precision testing solution.
Precision testing refers to the process of associating test cases with business methods, precisely recommending test cases to be executed when the code changes, executing recommended test cases, and giving feedback about execution results. A precise definition of the test scope can bring benefits to efficiency and execution speed.
A baseline can be established in multiple ways. Alibaba has used either of the following methods to associate test cases with code:
When the code changes, the changed test cases and business methods are parsed out of the change. Method changes usually fall into three kinds: additions, deletions, and updates.
Changes to test case code are relatively simple to handle: every new or updated test case is included in the regression scope.
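The selection step can be illustrated with a small Java sketch. It assumes the baseline is already available as a map from each test case to the business methods it exercises. The class `TestSelector` and the test and method names in the usage are hypothetical, and the sketch only handles methods that already appear in the baseline.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

class TestSelector {

    // baseline maps each test case to the business methods it exercises.
    static SortedSet<String> selectTests(Map<String, Set<String>> baseline,
                                         Set<String> changedMethods,
                                         Set<String> changedTests) {
        SortedSet<String> selected = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : baseline.entrySet()) {
            // A test is impacted when it exercises at least one changed method.
            if (!Collections.disjoint(entry.getValue(), changedMethods)) {
                selected.add(entry.getKey());
            }
        }
        // New and updated test cases always join the regression scope.
        selected.addAll(changedTests);
        return selected;
    }
}
```

For example, if only `StockService.reserve` changed and `UserTest.logout` is a new test, the result contains the tests whose baseline entry includes `StockService.reserve` plus `UserTest.logout`, and nothing else.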
Changes to the code of the tested application must be considered from the following aspects:
A core Alibaba Cloud product applied the precision testing solution to the code changes from about a week of iteration. Previously, all of its more than 3,700 test cases had to be executed in each CI task. Now, each CI task executes only the test cases impacted by the change. Execution speed has increased by nearly 50%, and the test scope has narrowed to less than 1/6 of the original scope.
We want the test cases we write and run to cover the code logic effectively. Test coverage is an important starting point: checking it helps us detect problems and makes them easier to resolve.
In this article, test coverage refers to code coverage, which is typically measured as line coverage or branch coverage. In addition, Java is used in all examples involving coverage data collection in this article.
An application typically has multiple types of automated test sets:
It is necessary to aggregate the results of multiple types of automated tests to accurately reflect the complete coverage of an application. At Alibaba, each test is associated with the corresponding application so the test result can be aggregated.
We made the following efforts to collect the test coverage data of all types of automated tests:
For unit testing, test coverage data is generated on the machine where the tests are executed. We create the unit test coverage report from the source code, the compiled classes, the raw coverage data files generated by the test run, and the changed-code information.
We implemented a coverage data collection client and a platform that collects coverage data, calculates reports, and parses them. We deploy the client in the application integration environment through the O&M platform and attach a javaagent when the application starts. When any type of automated test is triggered on any test platform, that test platform notifies our coverage platform, which interacts with the collection client to collect and parse the raw data for coverage calculation.
In addition, we will merge the corresponding unit testing coverage data to form a complete coverage report during the process of the key-point release test.
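Merging coverage from several test types amounts to taking the union of covered lines per file. The Java sketch below assumes each report has already been parsed into a map from source file to covered line numbers. `CoverageMerger` and its methods are hypothetical names, and real coverage tools store richer data (branches, instructions) than this simplified line model.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class CoverageMerger {

    // Each report maps a source file to the line numbers that one test type covered.
    static Map<String, Set<Integer>> merge(List<Map<String, Set<Integer>>> reports) {
        Map<String, Set<Integer>> merged = new TreeMap<>();
        for (Map<String, Set<Integer>> report : reports) {
            for (Map.Entry<String, Set<Integer>> entry : report.entrySet()) {
                merged.computeIfAbsent(entry.getKey(), key -> new TreeSet<>())
                      .addAll(entry.getValue());
            }
        }
        return merged;
    }

    // Share of executable lines that appear in the merged coverage.
    static double lineCoverage(Map<String, Set<Integer>> merged,
                               Map<String, Set<Integer>> executableLines) {
        int covered = 0;
        int total = 0;
        for (Map.Entry<String, Set<Integer>> entry : executableLines.entrySet()) {
            Set<Integer> hit = merged.getOrDefault(entry.getKey(), Set.of());
            for (Integer line : entry.getValue()) {
                if (hit.contains(line)) {
                    covered++;
                }
            }
            total += entry.getValue().size();
        }
        return total == 0 ? 0.0 : (double) covered / total;
    }
}
```

A line counts as covered if any test type reached it, so the merged figure is never lower than the figure of any single report.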
In addition to the aforementioned test coverage, we can use the same technology to observe the code coverage of online businesses. In other words, we can figure out lines of code used by online businesses and the lines that have never been called. This way, we detect the obsolete code in the programs, deduce the improper code design points, remind designers and developers to sort out code logic, and improve code quality.
We can make full use of complete test coverage data. However, it is meaningless to pursue higher test coverage without specific scenarios. Blindly pursuing high test coverage often results in the over-designing of test cases. Alibaba focuses more on incremental coverage to improve the entire test coverage in a healthy and effective way.
Incremental coverage refers to the test coverage of changed code during a test.
The changed code is the difference between the code of the branch under test and the code of the target branch. (Usually, the target branch is the branch we finally merge into.)
Incremental coverage = number of changed lines that are covered / number of changed lines
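As a worked illustration of the formula, the Java sketch below computes the ratio from two inputs that are assumed to be available already: the changed lines per file (from the diff against the target branch) and the covered lines per file (from the coverage report). `IncrementalCoverage` is a hypothetical class name.

```java
import java.util.Map;
import java.util.OptionalDouble;
import java.util.Set;

class IncrementalCoverage {

    // changedLines: file -> line numbers that differ from the target branch.
    // coveredLines: file -> line numbers hit during the test run.
    // Returns an empty result when nothing changed, because the ratio is undefined.
    static OptionalDouble compute(Map<String, Set<Integer>> changedLines,
                                  Map<String, Set<Integer>> coveredLines) {
        int changed = 0;
        int changedAndCovered = 0;
        for (Map.Entry<String, Set<Integer>> entry : changedLines.entrySet()) {
            Set<Integer> covered = coveredLines.getOrDefault(entry.getKey(), Set.of());
            for (Integer line : entry.getValue()) {
                changed++;
                if (covered.contains(line)) {
                    changedAndCovered++;
                }
            }
        }
        if (changed == 0) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of((double) changedAndCovered / changed);
    }
}
```

With five changed lines of which three are covered, the result is 3/5 = 0.6. Covered lines that were not changed do not affect the ratio.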
Unit tests produce incremental coverage data. Incremental coverage data is also generated when interface automation, UI automation, traffic playback, or manual tests run in test and pre-release environments. Combined with the continuous delivery pipeline, this data helps keep project quality moving in a better direction.
Alibaba's internal practice usually sets checkpoints at key stages of the CI/CD pipeline. When multiple people collaborate, this ensures that each developer has done sufficient self-testing before committing code for integration. In the integration phase, the committed code must be fully verified in the integration environment before it is officially released.
Based on the feedback about the incremental coverage, developers and testers can supplement specific test cases to make sure that no code lines are missed in each phase. In the process of continuously improving the test sets, the overall test coverage of the project is improved in a healthy and effective manner.
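A pipeline check of this kind could look roughly like the Java sketch below. The article does not state what thresholds Alibaba uses, so the `threshold` parameter, the class name `CoverageGate`, and the rule that an empty change passes are all assumptions. The sketch also returns the uncovered changed lines, which is the feedback developers and testers would use to add test cases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class CoverageGate {

    // Changed lines that no test reached, grouped by file and sorted.
    static Map<String, List<Integer>> uncoveredChangedLines(Map<String, Set<Integer>> changedLines,
                                                            Map<String, Set<Integer>> coveredLines) {
        Map<String, List<Integer>> uncovered = new TreeMap<>();
        for (Map.Entry<String, Set<Integer>> entry : changedLines.entrySet()) {
            Set<Integer> missing = new TreeSet<>(entry.getValue());
            missing.removeAll(coveredLines.getOrDefault(entry.getKey(), Set.of()));
            if (!missing.isEmpty()) {
                uncovered.put(entry.getKey(), new ArrayList<>(missing));
            }
        }
        return uncovered;
    }

    // Passes when the covered share of changed lines reaches the threshold.
    static boolean passes(Map<String, Set<Integer>> changedLines,
                          Map<String, Set<Integer>> coveredLines,
                          double threshold) {
        int changed = 0;
        for (Set<Integer> lines : changedLines.values()) {
            changed += lines.size();
        }
        int missed = 0;
        for (List<Integer> lines : uncoveredChangedLines(changedLines, coveredLines).values()) {
            missed += lines.size();
        }
        // Assumption: a change without any changed lines passes trivially.
        return changed == 0 || (double) (changed - missed) / changed >= threshold;
    }
}
```

Different pipeline stages could call `passes` with different thresholds, for example a lower one before integration and a higher one before release.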
We do not usually collect coverage data in the online environment. However, if we collect coverage data in the online environment, we can make the application slimmer and improve efficiency from another perspective.
Generally, multiple replicas are deployed for services provided by online businesses. We collect coverage information from a small number of replicas to reduce risks. After a long collection cycle, the online coverage report will be generated. At this time, the covered code can be considered valid code, while the remaining code that has not been used by traffic for a long time may need to be deleted or restructured. As such, we can simplify the code to reduce maintenance costs.
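The final step, finding code that no traffic has touched, reduces to a set difference. The Java sketch below works at method granularity for brevity, although the same idea applies to lines. `UnusedCodeFinder`, the snapshot format, and the method names in the usage are assumptions for illustration only.

```java
import java.util.List;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

class UnusedCodeFinder {

    // allMethods: every method in the application.
    // snapshots: methods reported as covered, one set per replica per collection.
    static SortedSet<String> neverCovered(Set<String> allMethods, List<Set<String>> snapshots) {
        SortedSet<String> unused = new TreeSet<>(allMethods);
        for (Set<String> snapshot : snapshots) {
            // Anything seen on any replica at any time counts as in use.
            unused.removeAll(snapshot);
        }
        return unused;
    }
}
```

The result is only a list of candidates. Code that runs rarely, such as a yearly job or an error path, can look unused if the collection cycle is too short or if the sampled replicas never receive that traffic.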
At Alibaba, when we decide to refactor an application that has become difficult to maintain or whose code has badly decayed, we collect online coverage data for a period and then refactor and simplify the code based on the generated report.
Distributed testing speeds up test execution, precision testing identifies the test scope, and incremental coverage provides favorable guidance for the continuous testing improvements. Online coverage helps us reduce the size of our applications. We can improve testing efficiency and create a smoother continuous delivery process by making full use of these technical means.