What is the essence of Code Review (CR)? Is it for error checking? Or for KPI? This article shares the views of Alibaba's Senior Technical Experts. CR is a long-term practice and orientation of organizational culture. Through CR, a benign interactive technical atmosphere was formed, spreading and sharing knowledge and improving code quality. This article also gives seven practical suggestions to improve the efficiency and quality of CR.
There are many articles about CR. CR is a standardized practice of many organizations, but many teams have the following questions when they try CR:
These questions are common and have been raised in different contexts for years. To answer these questions, you only need to focus on two aspects:
Many people think CR is used for error checking and hope to use the number of defects in the code to determine the effectiveness of CR. This underestimates the value of CR. The essential role of CR is not problem discovery. If we aim to discover problems, we have more and better ways than CR. The role of CR primarily involves the culture and sociology of the organization and is a long-term practice.
You are coding all out, and your deadline is looming. In addition, your organization requires unit testing of code: All new code must have complete automated unit testing. However, under pressure, you may want to lower the coding expectations a little. You may want to write the unit test for the new code later or write a slightly less useful test to meet the coverage requirements of the tool, for example, tests without assertions.
However, when you know that your code will be sent out for review by your colleagues, does this idea seem less enticing? This kind of pressure is benign and can give you immediate results, preventing you from choosing the "speculative" behavior of short-term gains and long-term losses. If there were no CR, you might be tempted to "do whatever you want," but you would still have to pay for this kind of trickery.
There are many ways to fulfill the same software requirements. Interested readers can search for "Hello World's N Ways of Writing" by themselves. Although all roads lead to Rome, the costs of roads are different. From variable naming to design structure, if you adopt a less common method, you may be causing trouble for later code maintainers. This maintenance activity may occur after one month, after 1 year, or longer. At that point, as the code author, you are no longer on the team, and no one can understand why the software was designed this way.
CR forces this feedback cycle ahead of time. Immediately after the code is written, there are one or more readers that are code reviewers. Therefore, this code underwent readability verification immediately after it was written. Ideally, if the organization already has coding specifications and design specifications, it can also ensure that the code follows these specifications. If it is found in the CR process that this code does not follow the specifications, it can also provide an advantage. It points to another key value of CR: knowledge dissemination.
You may be the leader of a team, worrying about how to improve the programming skills of your team members. You want them to read books, so you bring in introductory books, such as Clean Code. You also introduce the classic Design Mode and recommend Domain-Driven Design. You may periodically organize team members to share their business knowledge in the hope that the team members can understand the business logic of products.
All of these efforts are very good, but it is also possible that you will be hit. A month later, it seems that the team members have established a concept for the naming convention, but there is no consensus on how to perform specific naming. Many of the team members already know some design patterns. However, some use excessive patterns and make the code bloated, while others only know singleton. In addition, the team members are contentious about what an entity object is and what a value object is, and no one can clearly say what aggregation is and which scenarios it applies to.
What you really lack is a scene. It is difficult to form a consensus if abstract concepts are not implemented in practice. Some people may know the concept of "precedent" in the common law legal system, which is a very good way to form a consensus at the legal level. CR is forming a precedent: which type of design is reasonable and which type of design is unreasonable? By analyzing specific problems, the team will gradually form a design consensus. In the process, new team members that are not familiar with the consensus can also slowly integrate.
CR can also be used to verify logical correctness.
To prevent CR from being abused and raising expectations too high, we must first declare one thing: It is the designer's responsibility to ensure the logic of the code is correct. If a null pointer error occurs, how can we determine whether the coding is bad, the CR is bad, or the test is bad? First, the code author must have created this problem. Is it fair to blame this problem on the code reviewer? Perhaps, the code reviewer is responsible for discovering such problems, but what if the code is inherently buggy? What if the code reviewer has reviewed 1,000 lines of code at a time but cannot review the entire code to find this problem? As a code reviewer, one can find a lot of reasons why such a null pointer error is missed.
You have many other methods to find errors, and their cost is often not high:
Regardless of whether there is CR activity, the preceding two points are actions that a professional developer and development organization should strongly advocate.
Under the premise that the preceding two points are recognized, CR should be used to find errors. It essentially establishes a redundancy mechanism, through multiple people working on the same piece of code, to discover any negligence and cognitive errors that may exist in the code, which are often difficult to find for a single developer.
CR itself is not difficult, but if you consider the following factors, it may be more complicated:
Studies show that successful CR activities must be small-scale. For example, the paper "Modern Code Review: A Case at Google" states that in the CR activities of Google, only one file was modified for 35% of the CR activities, less than ten files were modified for 90% of the CR activities, and only one line of code was modified for 10% of the CR activities.
There are two main benefits of reviewing small amounts of code at a time: It is very clear where the changes are made, and the problem will be clear at a glance. If you push more than 1,000 lines of code to someone at a time, it is extremely unlikely to obtain a valuable review.
A review should be performed for a code of a meaningful and complete feature. If the code is not to fix defects, developers are required to develop features iteratively.
Small batches will inevitably lead to multiple batches. In the 2013 Microsoft paper "iExpectations, Outcomes, and Challenges of Modern Code Review," and the aforementioned paper from Google, the practice of frequent reviews was mentioned. Among them, Google's weekly median number of code changes per developer is three, and the weekly median number of reviews per reviewer is four.
Who is suitable for reviewing your code? It is unwise to choose someone that has nothing to do with the code being reviewed. The best potential candidates are listed below:
There are already some tools that can recommend reviewers depending on the context. This facilitates the selection of appropriate reviewers.
When the granularity of each review is not large and frequent reviews are needed, quick response is possible, which is also an inevitable requirement. Google's median number of review hours per reviewer is four hours. This metric can be a good reference.
A high-quality review that provides quick response depends on modern tools. Modern review tools can automatically integrate into workflows, highlight changes, and automatically summarize changes. There are already many modern tools available, and readers that have not yet selected the tools can search by themselves.
When it comes to "small batch, multi-batch, and fast feedback," if you have experience in pair programming, you will immediately realize that pair programming is very similar to CR.
Pair programming means two people that are co-programming have the same context, so there are no problems with context switching, no problems with lack of time, and no need to use additional tools to provide feedback to each other anytime or anywhere. In my opinion, pair programming is the best CR.
Online reviews should be normalized behavior. When it comes to the "knowledge dissemination" value of CR, offline reviews are a useful supplement. Experienced teams will organize offline reviews periodically or irregularly, so they can obtain a wider range of knowledge dissemination than online reviews. This can also lead to heated discussions and debates, forming a consensus of higher quality.
The full digitization of R&D activities has brought some valuable data insights. CR activities can be continuously promoted through the observation of metrics if the tools can support it. We summarized the following suggested metrics and data:
From the code authors' perspective:
From the reviewers' perspective:
From the perspective of the reviewer group, you can also find the community relationship between the code author and reviewer, which is also valuable information to understand the spread of knowledge.
We recommend following the instructions below:
Avoid using the digital metrics of CR for assessment, even if the motivation is good. The essence of CR is cultural construction. We strongly recommend using CR metrics only as guidelines for improvement, not for performance-related evaluations, whether the CR metrics are the aforementioned ones or the data related to the review quality or defects found in reviews.
What is the code quality of your organization? How about the technical atmosphere of your organization? Are you implementing CR?
I hope the practical suggestions I've shared in this article are helpful. If you have any suggestions about CR, please share them with us.
Alibaba Cloud New Products - November 10, 2020
Alibaba Developer - April 7, 2020
Alibaba Developer - September 22, 2020
AliCloud-Data Middle Office - August 25, 2021
Alibaba Developer - March 3, 2020
Alibaba Clouder - April 2, 2020
Penetration Test is a service that simulates full-scale, in-depth attacks to test your system security.Learn More
Plan and optimize your storage budget with flexible storage servicesLearn More
Link IoT Edge allows for the management of millions of edge nodes by extending the capabilities of the cloud, thus providing users with services at the nearest location.Learn More
High Performance Computing (HPC) and AI technology helps scientific research institutions to perform viral gene sequencing, conduct new drug research and development, and shorten the research and development cycle.Learn More