Analysis and comparison of advantages and disadvantages of 4 common branch modes

Introduction: A bigger team does not automatically mean higher R&D efficiency. We tend to assume that more people means higher efficiency and more work done, but once a team grows past a certain size, overall R&D efficiency declines, sometimes very quickly. Why do our releases get slower and slower as our team gets bigger and bigger?

The essence of team development

We once worked with a company that started with only 8 people. At that time it shipped one or two versions every month, and customers could put them to use; the product was a hospital information system (HIS). The team felt it was doing well. Later the team grew rapidly to about 80 people, yet no version was released for half a year. The implementation team became embarrassed to meet customers, because customers complained that requirements raised half a year earlier had still not been delivered.
This is the paradox: we assumed that a larger team would mean higher R&D efficiency and more work done, but once the team grows past a certain size, overall R&D efficiency drops, sometimes very quickly.

From the team's perspective, more people means slower collaboration and a higher cost of collaboration. The more people work under a given R&D model, the more problems appear, because there are more conflicts and more waiting. Conflict here means conflict during code integration and release; waiting means pieces of code waiting on one another during integration and development.
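One rough way to see why collaboration cost can grow faster than headcount is to count the pairs of people who might need to coordinate. The sketch below is only an illustration: the pairwise-count formula is an assumption added here, not part of the original analysis, and `pairwise_channels` is a hypothetical helper name.

```python
def pairwise_channels(team_size: int) -> int:
    """Count distinct pairs of people who may need to coordinate.

    Assumption for illustration: every pair is a potential channel.
    """
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


# Team sizes taken from the story above: 8 people, later 80 people.
for size in (8, 80):
    print(size, pairwise_channels(size))
# 8 -> 28 pairs, 80 -> 3160 pairs
```

Under this assumption, a tenfold increase in headcount yields more than a hundredfold increase in potential coordination pairs, which is consistent with the slowdown described above.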

The following are two specific scenarios.

Suppose two programmers, A and B, work together. At first, every submission A makes goes through successfully. Then B submits a version that breaks the build. Now A cannot submit, because any new submission would fail as well; A has to wait until B fixes the problem. At this point A's submission conflicts with B's work.
In the second scenario, multiple branches are merged into the same branch. FeatureA is merged into the trunk first. FeatureB arrives a little later and cannot be merged because its baseline is now different, so the code conflict must be resolved before the merge can go ahead.

As shown in the figure above, suppose there are three people: developers A and B, and tester C. Each point on a person's line represents a task. A works through tasks one after another. While doing the third task, A realizes that something needs to be coordinated with B and sends B a Ping. B, however, has a separate rhythm and is busy with other tasks, so B does not respond to A right away. Meanwhile B finds that a task is ready for testing and tells C. C finds a problem and reports it back at once, but B is busy with yet another task and does not respond. Getting no response, C sends another Ping. Only then does B see the messages from both A and C. B deals with A's request first and replies to A with a Pong.
We find that between programmers, and between testers and developers, development collaboration is an asynchronous, delayed process. People do not respond the moment a request arrives; they respond at their own pace, in between their own tasks, which introduces delays. So as the product becomes more complex, the amount of collaboration grows, and the team gets larger and more complicated, the cost of collaboration rises rapidly.
In such an asynchronous, delayed collaboration process, programmers need an R&D model for their daily development work that keeps information continuously synchronized and ensures quick responses during collaboration.
The software delivery process is essentially developers collaborating around a code base. Product code, configuration, environments, and the release process can all be described as code and stored in the code base.


Therefore, the purpose of an R&D model is to constrain how we behave when working around the code base; it is essentially a set of behavioral constraints centered on the code base.
In a narrow sense, we treat the R&D model as the branch mode. It includes a series of behavioral constraints: branch types and how they are identified, the life cycle of each branch, how Commits flow between branches and the rules that govern that flow, the correspondence between branches and code, and so on. We discuss them one by one below.
An R&D model is a series of constraints on R&D behavior whose goal is to avoid conflicts and reduce waiting. The biggest problem that more people bring to collaboration is more conflicts and more waiting, so a good R&D model should avoid conflicts and reduce waiting as much as possible.
First, let's look at how the R&D model corresponds to R&D behaviors.

These development behaviors map onto code base operations. Starting development of a new feature means creating a new feature branch. Submitting and integrating code is a Commit and Push. Completing integration verification corresponds to a branch Merge.
Similarly, when integration is complete and the code is ready for release, that is another Merge, and completing the release means creating a Tag. The operations in the code base record our R&D behavior, so every R&D behavior can be mapped to a code base operation.
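The mapping described above can be written down as a small lookup table. This is a minimal sketch: the dictionary keys and the `replay` helper are hypothetical names chosen for illustration, while the pairings themselves follow the text.

```python
# R&D behavior -> code base operation, following the description above.
BEHAVIOR_TO_OPERATION = {
    "start feature development": "create feature branch",
    "submit and integrate code": "commit and push",
    "complete integration verification": "merge",
    "prepare release": "merge",
    "complete release": "tag",
}


def replay(behaviors):
    """Translate a sequence of R&D behaviors into code base operations."""
    return [BEHAVIOR_TO_OPERATION[behavior] for behavior in behaviors]


history = replay([
    "start feature development",
    "submit and integrate code",
    "complete integration verification",
    "prepare release",
    "complete release",
])
print(history)
```

Reading the list of operations back gives a record of what the team did, which is the sense in which the code base records R&D behavior.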

The way to avoid conflict is to isolate everyone from each other; work that is kept separate cannot conflict. In a code base, branches are commonly used to isolate work and avoid conflicts.
Waiting is caused by information being out of sync, so reducing waiting means synchronizing information as early as possible. In code terms, that means keeping baselines in sync, for example by committing frequently. In short, branches isolate work to avoid conflicts, while frequent commits and merges synchronize information to reduce waiting.
Q: If you develop software alone, which branch mode do you use? Can one person have conflicts?
A single developer has no conflicts with anyone, needs no work isolation, and never waits for information, so one branch is enough: a single trunk can be used from start to finish. But once the number of people grows to 10 or 100, work isolation, conflicts, and waiting all appear. As more and more people collaborate, the branch mode has to keep changing.
Analysis of 4 Common Branch Patterns
Trunk development

When a team has few members (for example 1~2 people), the most common R&D model is Trunk-Based Development (TBD), also called trunk development.
Trunk development means working on a single trunk branch from start to finish. Code is continuously integrated into the trunk, so work is not isolated during development and, with few people, conflicts are rare. All developers commit and integrate frequently on the trunk. In this branch mode, the only branching happens at release time: a release branch is created to isolate the released version.
In this mode no branch isolation is needed, and information stays synchronized through continuous, frequent commits. When the team is small and its engineering capability is strong, this is the R&D model we recommend.
However, as more people join development, the probability of conflicts on the trunk rises sharply, and the demands on engineering capability grow as well.
Therefore, trunk development is not a panacea. The more people work on the trunk, the greater the chance that commits conflict and the greater the risk in resolving those conflicts. With two people, any conflict can only be with the one other person. With 10 people, conflicts become far more frequent and harder to untangle.
In addition, keeping information synchronized in trunk development requires frequent, continuous commits, and each commit should be very small. Some features therefore have to be committed while only half done, and they need to be isolated by means such as feature switches. For example, the switch for an unfinished feature is set to Off in advance and then the code is committed.
However, a feature switch is essentially a branch: it creates a branch in the code itself, one that only runs when the switch is turned on. With many feature switches, the code becomes fragile and harder to maintain.
When many people work on the trunk at the same time, the probability of code conflicts is high and feature development carries many risks, so people need to be isolated from one another.
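To make the "feature switch is a branch in code" point concrete, here is a minimal sketch. The switch name `new_checkout`, the `FEATURE_SWITCHES` table, and both functions are hypothetical; a real project would more likely read switches from configuration.

```python
# Hypothetical switch table; the unfinished feature is Off by default.
FEATURE_SWITCHES = {"new_checkout": False}


def is_enabled(name, switches=FEATURE_SWITCHES):
    """Unknown switches count as Off so unfinished code never runs by accident."""
    return switches.get(name, False)


def checkout(cart_total, switches=FEATURE_SWITCHES):
    if is_enabled("new_checkout", switches):
        # Half-finished path: committed to the trunk but dormant while Off.
        return f"new checkout: {cart_total:.2f}"
    # Existing behavior that users keep seeing.
    return f"legacy checkout: {cart_total:.2f}"


print(checkout(10))                          # legacy checkout: 10.00
print(checkout(10, {"new_checkout": True}))  # new checkout: 10.00
```

Each `if is_enabled(...)` is a fork in the code paths, much like a branch in the repository, so every extra switch adds another combination of paths that has to be kept working.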

Git-Flow

Git-Flow provides a dedicated branch for every need, and each kind of work has a clearly defined branch: a develop branch for integration, feature branches for development, and release branches for releases. Each type of branch has a definite purpose.
For example, feature branches isolate work when many features are developed in parallel, so that they do not conflict with each other. Release branches isolate releases, so that releases do not conflict with each other.
This mode isolates work well, but information synchronization depends on frequent integration into develop, together with the corresponding cherry-pick or rebase operations between branches.
There are also many branches, and a commit has to pass through several of them on its way from feature development to final release. The rules for moving and merging commits between branches become very cumbersome.
So Git-Flow is not a panacea either: too many branches make branch management more complex. If a Feature branch lives for a particularly long time, merging it also takes a long time. With both a Develop branch and a Master branch in existence, the Develop branch does not seem to add much. Distinguishing between Feature branches and hotfix branches does not seem especially meaningful either.
Therefore, although Git-Flow adds many branches to isolate different kinds of work as much as possible, synchronizing information across them is troublesome and the branches are hard to manage.
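How many branches a commit passes through in Git-Flow can be sketched as a small graph search. The `FLOW` table below reflects the commonly described Git-Flow rules (feature merges into develop, release is cut from develop and merged into master and back into develop, hotfix merges into master and develop); the table layout and the `path_to_master` helper are hypothetical names for illustration.

```python
from collections import deque

# Where commits on each branch type flow next in Git-Flow.
FLOW = {
    "feature": ["develop"],
    "develop": ["release"],            # a release branch is cut from develop
    "release": ["master", "develop"],  # merged to master and back to develop
    "hotfix": ["master", "develop"],
    "master": [],
}


def path_to_master(start):
    """Breadth-first search for the shortest branch path from start to master."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == "master":
            return path
        for nxt in FLOW[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(path_to_master("feature"))  # ['feature', 'develop', 'release', 'master']
print(path_to_master("hotfix"))   # ['hotfix', 'master']
```

A feature commit crosses three merges before it reaches master, and each merge is a point where synchronization can lag or conflicts can appear.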

GitHub-Flow

GitHub introduced a branch mode called GitHub-Flow, which is much simpler than Git-Flow. There is no Develop branch, no hotfix branch, and no Release branch. When development is needed, a Feature branch is created; when development is done, it is merged into Master and released from there.
Here isolation exists only during development. Information is synchronized by continuously integrating into Master and frequently pulling code from Master. Releases are made from the trunk Master branch, so there is no isolation during the release process.
This brings another problem: the Master branch has to support continuous integration, and it is both where integration happens and where releases are made. If something goes wrong after an integration, all work is blocked: nothing can be released and nothing can be merged.
So GitHub-Flow is simple and still provides isolation during development, but if the team's infrastructure or engineering capability is weak, it limits how often you can integrate and release.
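The blocking effect can be illustrated with a toy model. This is a sketch under assumptions: `MasterBranch` and its methods are hypothetical names, and a single boolean stands in for whether the latest integration on Master passed verification.

```python
class MasterBranch:
    """Toy model: one branch serves as both integration and release point."""

    def __init__(self):
        self.healthy = True  # assumption: True while the last integration passed
        self.merged = []

    def integrate(self, feature, passes_checks):
        if not self.healthy:
            raise RuntimeError(f"cannot merge {feature}: master is broken")
        self.merged.append(feature)
        self.healthy = passes_checks

    def release(self):
        if not self.healthy:
            raise RuntimeError("cannot release: master is broken")
        return list(self.merged)

    def fix(self):
        """Someone repairs the broken integration."""
        self.healthy = True


master = MasterBranch()
master.integrate("feature-a", passes_checks=True)
master.integrate("feature-b", passes_checks=False)  # this merge breaks master
for action in (master.release, lambda: master.integrate("feature-c", True)):
    try:
        action()
    except RuntimeError as err:
        print(err)  # both release and further merges are blocked
```

Until `fix()` is called, neither releases nor further merges can proceed, which is why this mode depends on fast feedback and quick repair of the Master branch.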

GitLab-Flow

GitLab-Flow differs from GitHub-Flow in that the release process has Pre-production and Production branches; branches are assigned to the different environments used for development, integration, and release.
Once integration is complete, the code is on the Master branch. The next step moves it to the Pre-production branch, which means the version at that Commit meets the pre-release conditions. After it is verified in pre-release, it is synchronized to the Production branch, which means it meets the release conditions. This is a step-by-step Promotion process: from the integration environment to the pre-release environment, and then to the production environment.
Having briefly introduced these common branch modes, let's compare their pros and cons.
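The step-by-step Promotion can be sketched as moving a commit along an ordered chain of branches. This is an illustrative model only: the branch names follow the text, while `PROMOTION_CHAIN`, `promote`, and the commit ids are hypothetical.

```python
# Ordered environments: integration -> pre-release -> production.
PROMOTION_CHAIN = ["master", "pre-production", "production"]


def promote(branch_heads, source, verified):
    """Return new branch heads with the source commit promoted one stage.

    A commit only moves forward when it has been verified at its stage.
    """
    index = PROMOTION_CHAIN.index(source)
    if index == len(PROMOTION_CHAIN) - 1:
        raise ValueError("production is the last stage")
    if not verified:
        raise RuntimeError(f"{source} is not verified; promotion is blocked")
    target = PROMOTION_CHAIN[index + 1]
    updated = dict(branch_heads)  # leave the caller's mapping untouched
    updated[target] = branch_heads[source]
    return updated


heads = {"master": "c3", "pre-production": "c2", "production": "c1"}
heads = promote(heads, "master", verified=True)          # c3 reaches pre-production
heads = promote(heads, "pre-production", verified=True)  # c3 reaches production
print(heads)
```

Because each stage feeds the next, a commit that fails verification in pre-production also holds up everything queued behind it for production.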
Comparison of advantages and disadvantages of common branch patterns

Trunk-based development (TBD) has few branches, is simple to practice, and costs little to understand. But it demands a mature, disciplined team. Once someone breaks discipline, the trunk becomes a nightmare, and continuous integration and release are hard to do well. When the trunk has a problem, everyone is blocked. Those are the advantages and disadvantages of trunk development.
With Git-Flow, features can be developed in parallel, the rules are comprehensive, and the responsibility of each branch is very clear, so it copes with teams of almost any size. But it has too many branches and overly complex rules, and because branches live a long time, merge conflicts are more frequent. Develop and Master in particular are long-lived branches.
GitHub-Flow supports essentially what Git-Flow supports, but integration can only happen on the Master branch, so it demands strict integration discipline. Because integration and release share one branch, once that branch is broken both integration and release are interrupted.
GitLab-Flow also supports parallel development, but development branches can still be long-lived and carry the risk of merge conflicts. In addition, the release branches are coupled: Production and Pre-production are linked by Promotion, so they can interrupt and block each other. Having many development branches plus Production and Pre-production also adds to the complexity of branch management.
So no branch mode is absolutely good, and none is absolutely bad.
There is a simple principle for branching: control the number of branches, and integrate frequently in small batches. Branches provide work isolation, but too many of them add excessive management cost, so their number should be kept under control. Frequent integration in small batches speeds up information synchronization.
In other words, weigh maximizing productivity against minimizing risk, then keep the number of branches as low as possible and integrate in small batches as often as possible.
Maximize productivity: everyone works in a common area. There are no branches other than one long-lived, uninterrupted development trunk, there are no other rules, and the code submission process is straightforward. However, any commit may break the integration of the whole project and halt progress.
Minimize risk: everyone works on a separate branch. Each person's work is independent, and no one can interrupt anyone else, which reduces the risk of development being interrupted. However, this approach adds process overhead and makes collaboration very difficult: everyone has to merge code carefully, even for a very small part of the overall system.
So how do you design or choose a branch mode that suits you? In the next article, we will continue by sharing how different teams choose their own R&D model.


