By Zhang Ronghua (Ronghua)
Architecture constraints are classified into basic constraints and business constraints. Basic constraints are common software design principles in software engineering.
Responsibility constraints are related to modules, sub-modules, and models. In particular, core models and core main modules are stable within a certain period of time. Therefore, defining a constraint range for them helps to improve the efficiency of research and development (R&D) during this period of time.
Non-business functional constraints on architectures include stability, performance, and cost.
The constraints mentioned in this article are constraints on the logical architecture. As for business constraints, we must also consider constraints, such as our customer groups. Otherwise, the product design may not meet actual demands.
The preceding principles are the judgment criteria. What methodology can be used to make software comply with these principles? The answer is the design pattern.
Two important keywords are involved: "judgment criterion" and "implementation method". The judgment criterion is the software design principle, and the implementation method is the design pattern.
Skilled software designers should be familiar with common design patterns and principles. Most textbooks describe how to use design patterns within modules, but do not emphasize how to use design patterns between modules in a logical architecture to make the logical architecture conform to the software design principles.
When we design or derive a logical architecture, we mainly use design patterns to define the relationship between the modules in the logical architecture and the relationship between sub-modules within each module, to make them conform to the software design principles.
A challenge is how to use the design patterns to make the integration between modules conform to the software design principles, reducing the cost of maintenance and expansion. In architecture, the relationship between modules, between modules and sub-modules, and between sub-modules should comply with the constraints on software design. Domain modeling and design patterns are two methods for this purpose.
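One way to picture a design pattern applied between modules rather than inside one is the sketch below. It assumes a hypothetical order module that owns a small `DiscountPolicy` boundary and a hypothetical marketing module that implements it (a strategy pattern combined with dependency inversion). All class names and amounts are illustrative assumptions, not taken from any real system.

```python
from dataclasses import dataclass
from typing import Protocol


class DiscountPolicy(Protocol):
    """Boundary owned by the (hypothetical) order module."""

    def discount_cents(self, item_total_cents: int) -> int:
        ...


@dataclass
class OrderService:
    """Order module: depends only on the DiscountPolicy abstraction."""

    policy: DiscountPolicy

    def payable_cents(self, item_total_cents: int) -> int:
        discount = self.policy.discount_cents(item_total_cents)
        # Never let a discount push the payable amount below zero.
        return max(item_total_cents - discount, 0)


class ThresholdDiscount:
    """Marketing module: one implementation of the boundary."""

    def __init__(self, threshold_cents: int, off_cents: int) -> None:
        self.threshold_cents = threshold_cents
        self.off_cents = off_cents

    def discount_cents(self, item_total_cents: int) -> int:
        if item_total_cents >= self.threshold_cents:
            return self.off_cents
        return 0


order_service = OrderService(
    policy=ThresholdDiscount(threshold_cents=10_000, off_cents=1_500)
)
print(order_service.payable_cents(12_000))  # 10500
print(order_service.payable_cents(8_000))   # 8000
```

In this sketch the order module never imports the marketing implementation, so a marketing rule can change without touching order code. That is the kind of maintenance and expansion cost the design principles aim to reduce.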
Even if we leave aside the boundaries between modules and the constraints on modules, the design inside a module alone already requires software engineers to grasp the software design principles and design patterns. Once dependencies or boundaries between modules are considered, the software design principles and patterns must be followed even more strictly. They merit further study, practice, thought, and summary. They are a required skill for architects, and also a foundation for designing logical architectures.
Deviation is acceptable when we start designing. We can gradually learn the subtleties through practice.
The following lists the constraints on a specific technology in a specific scenario:
In general, these constraints are constraints on physical architectures, which are described below. Different physical architectures are designed to solve different problems. Therefore, they must comply with different constraints of computer science and technologies. This is what architects should sort out and implement.
The preceding describes basic constraints on software R&D, which are rarely mentioned in coarse-grained modules. The constraint relationship between coarse-grained modules is refined based on business concepts. For example, orders, marketing activities, and commodities are core concepts and domains extracted from e-commerce. We can define these core concepts to determine their relationships and boundaries and form unified technical constraints on businesses.
Similarly, any domain should have such constraints, but such constraints are not immutable. When business understanding changes, such constraints also change, and business constraints are important to business development.
Let's take the structure of a country as an example. After ten years' study of history, I can roughly sum up the main structural constraints at the country level:
Different historical periods have different national governance structures and conventions. Take the system of Three Departments and Six Ministries as an example. The three departments are the primary modules, and the six ministries are secondary modules. These modules can be further divided into villages, and finally, into households. A household is a typical domain model. The Zhou Dynasty in the Spring and Autumn Period governed the state based on the Rite and Music System designed by Ji Dan. The Qin Dynasty in the Warring States Period governed the state based on Legalism (different from the laws that constrained people's livelihood) and finally unified the six states. However, due to the restrictions of Legalism, the Han Dynasty adopted Confucianism for governance.
Based on ethics and morality, Confucianism was effective in the first 100 to 150 years of the dynasty. However, as its binding effect weakened over time, interest groups constantly blurred the boundaries between modules. As a result, the interests of some modules grew while those of others shrank, and the dependencies between modules fell into disorder, harming the interests of the whole architecture. Because the interests overlapped so deeply, it was hard for a chief architect to overhaul the architecture. Finally, the dynasty was overthrown by another dynasty. The new dynasty designed another architecture, and reset the boundaries and dependencies between modules, but morality and ethics were still the main constraints. This pattern was repeated for the 2,000 years after the Han Dynasty.
Anyway, appropriate architecture constraints help improve the architecture, whereas inappropriate constraints restrict the development of the architecture due to in-fighting and other circumstances.
Generally speaking, inter-module constraints are ubiquitous, and technical constraints are the easiest to understand. The fine-grained module constraints are easier to learn and understand, such as the software design principles. The coarse-grained module constraints are more abstract. Software engineers should have a deep understanding of the business, the organization, or society.
Software reuse includes reuse of the design, documents, and code. Code reuse is described in this section.
Code is refined for reuse for the following benefits:
We classify reuse from the perspective of business functions and non-business functions:
1. Reuse of business-independent content. Such content exists at every level of the infrastructure but does not belong to the logical architecture. The reuse of business-independent content is not the focus of this article, so we will not describe, for example, how MVC design ideas are reused in different web applications.
Reuse of frameworks, such as Spring and MyBatis. The industry has never stopped studying frameworks.
2. Reuse of business-dependent content. Such reuse depends on abstraction capability and technical skills. For example:
Operators are also called metadata in UMP. Then, we can reuse and combine these operators to form new rules.
Reuse of processes. For example, each e-commerce platform has transaction processes, including information flow and capital flow. Then, can the transaction processes of Tmall, Taobao, and Juhuasuan be reused? If so, how should they be reused? Can we process the steps differently to reuse them?
Reuse of computing models and frameworks, such as reuse of the overlapping and mutually exclusive computing models in marketing, reuse of session packets, and reuse of test frameworks in specific businesses.
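The idea of reusing and combining operators to form new rules can be sketched as small predicate functions plus generic combinators. This is only an illustration under assumptions: the operator names (`min_amount`, `in_category`, `is_member`), the context fields, and the thresholds are hypothetical and do not describe how UMP actually defines its operators.

```python
from typing import Any, Callable, Dict

Context = Dict[str, Any]
Operator = Callable[[Context], bool]


def min_amount(cents: int) -> Operator:
    """Operator: the order amount reaches a threshold."""
    return lambda ctx: ctx["amount_cents"] >= cents


def in_category(category: str) -> Operator:
    """Operator: the order belongs to a given category."""
    return lambda ctx: ctx["category"] == category


def is_member() -> Operator:
    """Operator: the buyer is a member."""
    return lambda ctx: bool(ctx.get("member", False))


def all_of(*ops: Operator) -> Operator:
    """Combinator: every operator must hold."""
    return lambda ctx: all(op(ctx) for op in ops)


def any_of(*ops: Operator) -> Operator:
    """Combinator: at least one operator must hold."""
    return lambda ctx: any(op(ctx) for op in ops)


# New rules are built purely by recombining existing operators.
book_promotion = all_of(min_amount(10_000), in_category("books"))
vip_or_big_order = any_of(is_member(), min_amount(50_000))

order = {"amount_cents": 12_000, "category": "books", "member": False}
print(book_promotion(order))    # True
print(vip_or_big_order(order))  # False
```

Because each operator is independent of the rules that use it, adding a rule in this sketch requires no new operator code, only a new combination.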
The specific reuse modes need to be considered in physical architectures. The following briefly describes some reuse modes.
1. Second-Party Libraries
The reusable code is refined into second-party libraries. The following two types are available based on the user and dependency:
2. Services
Reusable APIs are transformed into services, which are exposed externally through interfaces by various technical means, such as HSF, SOFA, or HTTP. However, the technical means are not our focus. Instead, we should focus on the content of the services we want to expose, including the number of customers, their common demands, and the nature of the services. We should design the services accordingly and consider the interface specification. Understandably, designers tend to emphasize the infrastructure that carries the content, such as the service container. However, we should still emphasize the service content and interface specification. Working on these two aspects helps business line designers' personal growth and success.
3. Presentation Components
We should also consider the design of reusable frontend presentation components, such as the reusable components of TMF.
The following describes the advantages and disadvantages of implementation modes of reusable modules in logical architectures.
This article does not describe the reuse of business-independent content. Instead, we discuss two cases of business-dependent cross-module reuse and the similarities and differences between the two cases:
Therefore, in the application logical architecture of a business line, the focus of reuse is to extract common features of models, processes, and computing models, and reuse these features in the form of second-party libraries or services. The following describes how to extract common features from logical architectures.
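As a rough illustration of extracting the common part of a process and letting each business line vary only where it must, the sketch below defines one shared transaction process and allows a platform to override individual steps. The step names and the "deposit" override are hypothetical assumptions, not the real processes of Tmall, Taobao, or Juhuasuan.

```python
from typing import Callable, Dict, List, Optional

Step = Callable[[], str]

# The shared process: the step order is the common feature being reused.
PROCESS_ORDER: List[str] = ["create_order", "pay", "ship"]

# Default step implementations shared by every platform.
SHARED_STEPS: Dict[str, Step] = {
    "create_order": lambda: "order created",
    "pay": lambda: "paid in full",
    "ship": lambda: "shipped",
}


def run_process(overrides: Optional[Dict[str, Step]] = None) -> List[str]:
    """Run the shared process, replacing only the overridden steps."""
    overrides = overrides or {}
    unknown = set(overrides) - set(SHARED_STEPS)
    if unknown:
        raise ValueError(f"unknown steps: {sorted(unknown)}")
    steps = {**SHARED_STEPS, **overrides}
    return [steps[name]() for name in PROCESS_ORDER]


default_flow = run_process()
# A hypothetical presale platform that differs only in the payment step.
presale_flow = run_process({"pay": lambda: "deposit paid"})
print(default_flow)  # ['order created', 'paid in full', 'shipped']
print(presale_flow)  # ['order created', 'deposit paid', 'shipped']
```

Whether such a shared process ships as a second-party library or sits behind a service is a physical architecture decision. The sketch only shows the extraction of the common feature.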
We can abstract and refine logical architectures from the following aspects:
I believe these are common practices for many designers. In this sense, refinement is a process of finding features in common:
All of this requires engineers to have a deep understanding of the abstraction technologies of domain modeling and design patterns, as well as the characteristics of computer technologies. Note: It is not enough to know domain modeling methods and design patterns. The implementation of an abstraction varies with the characteristics of the selected technologies.
For reuse, engineers must master both abstraction technologies and computer technologies. To use traditional Chinese thought as a metaphor, abstraction technologies are Yin, and computer technologies are Yang.
Yin is always full of uncertainty. However, after further study, practice, and summary, we find that Yin also has a specific methodology, which must be learned through practice.
As for how to learn the methodology, we must start with learning Yang (computer science and technologies). This is because, without knowledge of Yang, it is hard for us to understand Yin (abstraction and architecture technologies). The ultimate goal is to reconcile Yin and Yang. Too much emphasis on either Yin or Yang leads to an imbalance between them.
After I joined the data department, I found that the Yin-Yang metaphor was no longer applicable. Now statistical analysis and machine learning methods (the two have both similarities and differences) are often used in our work. Therefore, our team should have a good knowledge of the following three subjects:
In the last year, I spent a lot of time on statistical analysis and was asked why I gave up domain modeling. I did not give up on domain modeling. Domain modeling is an important aspect but is not the only method of abstraction and architecture. It is a necessity for engineers. Other methods include deduction and induction and top-down decomposition. Statistical learning and computer science and technologies can better solve engineering problems. So, I learned statistical analysis and learning, which are skills that engineers in different fields must master.
Reuse is a very important discipline in software. It involves abstraction technologies and computer technologies. Abstraction technologies depend on an understanding of the business, which requires long-term practice.
Sometimes, even though a software process can be abstracted technically, the abstraction cannot be implemented due to organizational structure problems. Alternatively, organizational structures require frequent adjustments due to instability. As a result, the return on investment (ROI) is small and therefore refinement fails. Such situations are not described in detail in this article.
Layer is a universal concept that almost every new engineer learns about. It is referred to as "tier" in some books. The term "layer" is used in the Open Systems Interconnection (OSI) reference model. Some articles use "tier" to refer to architecture. To confuse things even more, Wikipedia uses "tier" and "layer" interchangeably.
Most textbooks divide architecture into three classic layers:
Sometimes, a service layer is added. Layering is very common in project architectures, and almost all project architectures are layered. Therefore, when it comes to architectural layering, the first thing that comes to mind is the layers in a project architecture.
An important purpose of layering project architecture is to constrain the code architecture and make it well organized.
The logical architecture layers we are talking about are not the project architecture layers. Why? First, let's look at the characteristics of a logical architecture:
The two layering modes have the following essential differences:
You may think of layering a large architecture (such as the logical architecture of a business unit) like this: classifying the modules closest to users as the presentation layer, all the modules in the middle as the business layer, and the modules at the bottom as the data layer. Of course, we can do this, but such a practice is meaningless and cannot achieve what we expect from layering.
File systems or network protocols can be encapsulated into different layers, as shown in the following figure.
In the preceding figure, each module has to solve different problems in different stages, and can be further divided into sub-modules. The following describes module layering in a logical architecture.
Generally speaking, upper-layer modules depend on lower-layer modules.
Modules at the same layer may sometimes depend on each other. We can mark data streams or call streams with arrows at the layer. However, these are not the key.
The key is that we must always keep in mind what problems modules at each layer have and what problems modules at each layer want to solve. For example, in the file system and protocol layers of an operating system, modules at the bottom layer are used to precisely control the hardware. Modules in the middle are exposed to users, which are easy to use. Modules at the upper layer are used to solve problems of applications in specific domains. Modules at different layers are abstracted differently to solve different problems.
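The rule that upper-layer modules depend on lower-layer modules, with occasional dependencies inside one layer, can be checked mechanically once each module is assigned a layer. The sketch below is a minimal illustration; the module names, layer numbers, and dependency list are hypothetical.

```python
from typing import Dict, List, Tuple

# Higher number = closer to the user. Assignments are illustrative only.
LAYER: Dict[str, int] = {
    "storefront": 2,
    "transaction": 1,
    "marketing": 1,
    "commodity": 0,
}

# (source, target) means "source depends on target".
DEPENDS_ON: List[Tuple[str, str]] = [
    ("storefront", "transaction"),
    ("transaction", "marketing"),   # same layer: allowed in this sketch
    ("transaction", "commodity"),
    ("commodity", "storefront"),    # lower layer depending on an upper one
]


def upward_dependencies(
    layer: Dict[str, int], deps: List[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Return dependencies that point from a lower layer to a higher one."""
    return [(src, dst) for src, dst in deps if layer[src] < layer[dst]]


print(upward_dependencies(LAYER, DEPENDS_ON))  # [('commodity', 'storefront')]
```

A check like this only guards the direction of dependencies. Deciding which problem each layer solves, which is the key point above, still depends on understanding the business.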
For many people and in some books, layering is equivalent to the layering of the project architecture. However, the two concepts are different. If we are talking about architecture, we should not focus on the layering of the project architecture.
For example, books about domain modeling frequently mention the service layer, domain layer, and repository layer. We should consider the positions of such layers in the architecture and when to apply these concepts.
As shown in the preceding figure, a fine-grained module can be divided into the service layer, model layer, repository layer, and integration layer based on pure technical responsibilities. Note: The layers of the project architecture are not related to the specific business logical architecture. They are more technical and aim to organize and manage the code at a high level.
Normally, after the system model is produced, we should consider its settings, dependencies, and constraints. Unfortunately, many books focus on the service, model, repository, and integration layers in domain modeling.
Such a practice in these books does not conform to the actual workflow. After the domain model is completed, we should first consider the position and responsibility of each module in the architecture, the relationships between sub-modules in each module, and between modules, as well as the overall constraints (refer to the architecture definition at the beginning of this article.) Specifically, we do not draw any layers, such as service, model, repository, and integration, in the logical architecture diagram.
Inside fine-grained modules, the project architecture layering is part of the infrastructure architecture, and perhaps the most important task. However, for the application logical architecture as a whole, this is neither the core part nor the most important part to consider. In other words, even if you do not divide the architecture into service and model layers, the division of responsibilities among modules in the application logical architecture is not affected.
In some projects, the application logical architecture is clear. However, during the design of the physical architecture, all the modules in the logical architecture are placed in one service package, which is not suitable. The modules in the logical architecture are not implemented at all.
Therefore, in the projects of our department, we never place layers, such as service, model, repository, and integration, at the top level. Instead, we implement the logical architecture design, so we can see the architecture of our specific applications and application packages.
Again, the layering in a logical architecture does not refer to the layering of services, models, repositories, and integration, but refers to the layering of functional modules. If we do not understand the business, business conceptual modules, or business conceptual architecture, we can hardly design a reasonable application logical architecture (including the layering of modules in the logical architecture.) It is not feasible to layer the logical architecture without considering the business characteristics.
After we determine the responsibilities of modules, we should determine the dependencies between modules and define rules and technical implementation methods for interfaces through which the modules are exposed. For example, if modules are exposed through a RESTful interface, what rules should be used? If modules are exposed through an internal service interface, what rules should be used? This article does not describe the rules in detail. For the former, you can refer to the open interfaces of major platforms. For the latter, you can refer to the rules of internal service calls of each business unit. If no rules are available, we should create unified rules.
Here, I will introduce a new concept: a logical architecture granularity tree.
Two-dimensional architectures are discussed above. They can be deduced by using certain methods and refined into horizontal and vertical structures. As I mentioned at the beginning, architecture can also be three-dimensional. Modules in a two-dimensional architecture have sub-modules or parent modules of various granularities.
To make a metaphor, the GodMars in GALAXIR in the following figure is a large architecture.
It consists of many small modules, such as the material spacecraft and probe, and each small module is composed of many basic modules. The GodMars in GALAXIR has only three layers: basic components, modules, and the final GodMars, which represent different granularities. Complex business architectures have more granularities and layers. The number of granularities and layers depends on when the ROI of N R&D resources in a module is the maximum. The number N should be a stable value under technical limitations at a certain stage.
A module can be abstracted into a granularity tree:
The logical architecture granularity tree has three principles:
The preceding tree structure can only describe the relationship between a module and its parent module or between a module and its sub-module. It cannot completely describe the relationship between modules, that is, whether the modules are at the same layer (the horizontal and vertical structures in the application logical architecture as mentioned above.)
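A granularity tree of this kind can be sketched as a simple recursive data structure. The marketing, transaction, and commodity modules come from the e-commerce examples in this article, while the sub-module names are hypothetical. As noted above, the tree records only parent and child relationships, not which modules share a layer.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Module:
    name: str
    children: List["Module"] = field(default_factory=list)

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, str]]:
        """Yield (granularity level, module name) in depth-first order."""
        yield depth, self.name
        for child in self.children:
            yield from child.walk(depth + 1)

    def height(self) -> int:
        """Number of granularity levels in this subtree."""
        return 1 + max((c.height() for c in self.children), default=0)


tree = Module("e-commerce", [
    Module("marketing", [Module("coupon"), Module("activity")]),
    Module("transaction", [Module("order")]),
    Module("commodity"),
])

for depth, name in tree.walk():
    print("  " * depth + name)
print(tree.height())  # 3
```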
What kind of graphics can be used to vividly express the relationship between a module and its parent module or sub-module, and the relationship between different modules? I have been thinking about this problem for a long time, and I have finally come up with the following graph:
The architecture in the figure has three layers, but an actual architecture may have more or fewer layers. The more layers the architecture has, the more sophisticated knowledge is involved.
Here is a serious question: can frontline engineers ignore the logical architecture? Of course not. If a module in the architecture has sub-modules, it also has parent modules. Engineers must therefore constantly iterate on the module design and pay attention to its parent modules and to the parents of those parent modules. As the module granularity increases, engineers' responsibilities and required capabilities increase as well.
The following architecture module granularity tree does not contain dependencies between modules. In this article, I just want to briefly introduce how to implement a module in the physical architecture. I will describe the process by using specific cases in subsequent articles.
Modules with different granularities in the module tree can be implemented into physical architectures in the following forms:
What is the reason? At different business stages, the logic complexity in modules varies, and the granularity of a module also changes.
The following example shows the changes in the implementation of module trees with small, medium, and large logical architectures, respectively.
The preceding figure shows a newly created e-commerce website. The website has all the required modules, but the logic in the modules is simple. These modules exist in an application in the form of packages. However, the responsibility assignment of modules is reasonable, and the granularity meets the development requirements. Therefore, it is easy to divide the application into distributed applications. As the business develops, applications with unclear module responsibility assignments are hard to divide into distributed applications.
The e-commerce website at this stage has mature marketing, commodity, and transaction modules. Thanks to the previous responsibility assignment of modules, the architect can quickly divide multiple top-level packages in the initial stage into multiple applications.
At this stage, it is already a large e-commerce website. The marketing module has been divided into multiple applications. Thanks to the reasonable module responsibility assignment in the previous stage, the architect does not need to pay much attention to the governance of the logical architecture when improving the physical architecture. Instead, the architect can concentrate on the construction of physical and infrastructure architectures, such as zone-disaster recovery and active geo-redundancy.
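The three stages can be summarized as one unchanged set of logical modules whose physical placement changes over time. The sketch below is a simplified illustration; the leaf module names and application names are hypothetical.

```python
from typing import Dict, List

# Logical leaf modules stay the same across all stages (illustrative names).
LEAF_MODULES: List[str] = [
    "marketing.coupon",
    "marketing.activity",
    "commodity",
    "transaction",
]

# Stage -> {logical module: application (deployment unit) that hosts it}.
PLACEMENT: Dict[str, Dict[str, str]] = {
    # Small site: every module is a package inside one application.
    "small": {module: "shop" for module in LEAF_MODULES},
    # Medium site: each top-level module becomes its own application.
    "medium": {
        "marketing.coupon": "marketing",
        "marketing.activity": "marketing",
        "commodity": "commodity",
        "transaction": "transaction",
    },
    # Large site: the marketing module itself is split into applications.
    "large": {
        "marketing.coupon": "marketing-coupon",
        "marketing.activity": "marketing-activity",
        "commodity": "commodity",
        "transaction": "transaction",
    },
}


def applications(stage: str) -> List[str]:
    """Applications that exist at a given stage."""
    return sorted(set(PLACEMENT[stage].values()))


def covers_all_modules(stage: str) -> bool:
    """Every logical module must be placed somewhere at every stage."""
    return set(PLACEMENT[stage]) == set(LEAF_MODULES)


for stage in ("small", "medium", "large"):
    print(stage, applications(stage), covers_all_modules(stage))
```

Only the placement table changes between stages in this sketch, which mirrors the point that a sound responsibility assignment lets the physical architecture evolve without reworking the logical one.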
I'm not certain whether a mid-end is a giant architecture. When all stable core modules in an e-commerce business are abstracted, the final result may be a domain model, a system process converted from a business process, a computing model, or an algorithm. Then, the changing modules (frontend) can be rapidly iterated based on this big core concept. The following graph helps us understand this process:
When we design a mid-end that is used as a technical product for the frontend, we must consider the following aspects:
Judgment criteria include whether multiple business lines have duplicate process abstractions, duplicate domain model abstractions, duplicate computing models (data structures and algorithms), and duplicate supporting facilities and whether these are built repeatedly.
To cope with changes in the evolution process, we must be skilled in learning new knowledge, communicating with others, coordinating resources, leading and influencing others, evaluating others, and choosing the right people. As the organization size increases, the required capabilities also increase.
It sounds intangible, but the basic law remains unchanged.
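The judgment criterion about duplicate abstractions across business lines can be made concrete with a simple inventory. The sketch below assumes each business line lists the abstractions it has built; the business line names and abstraction names are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical inventory: business line -> abstractions it maintains.
ABSTRACTIONS: Dict[str, Set[str]] = {
    "line_a": {"order process", "price computing model", "coupon model"},
    "line_b": {"order process", "price computing model", "live-stream model"},
    "line_c": {"order process", "membership model"},
}


def shared_abstractions(
    by_line: Dict[str, Set[str]], min_lines: int = 2
) -> Dict[str, List[str]]:
    """Abstractions built by at least `min_lines` business lines."""
    owners: Dict[str, Set[str]] = defaultdict(set)
    for line, names in by_line.items():
        for name in names:
            owners[name].add(line)
    return {
        name: sorted(lines)
        for name, lines in owners.items()
        if len(lines) >= min_lines
    }


candidates = shared_abstractions(ABSTRACTIONS)
for name in sorted(candidates):
    print(name, candidates[name])
```

Abstractions that appear in several lines are only candidates for a mid-end. Whether they are truly the same concept still needs the business judgment described above.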
We need to determine the implementation form of a logical module from the following dimensions:
How many team members are required when teamwork efficiency is the highest? As technologies continue to develop (such as the emergence of new containers) or the complexity of services changes, the number of people that are required to maintain an application also changes. This number should be adjusted based on the supervisor's experience and the problems that occur. Currently, my team has about five members, and I will divide a large domain model into sub-modules to solve problems individually.
The dependency strength should be considered. Core modules are separated from non-core modules.
A lot of performance metrics should be considered, including queries per second (QPS), technical implementation inside a module, the use of multiple threads or coroutines, capacity evaluation, and stress test. As long as threads are considered, we need to consider the following aspects to derive the maximum QPS and the synchronous or asynchronous structure:
Should the wait time or the CPU time be reduced to decrease the response time (RT)? We should consider all aspects: the network, servers, and storage. For example, if we know the formula for the bandwidth-delay product, BDP = bandwidth × RTT (round-trip time), we can understand the theoretical basis for many methods of network optimization.
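The relationships mentioned here can be worked through with small numbers. The sketch below rests on three stated assumptions: a synchronous model in which each thread serves one request at a time (so throughput is bounded by threads divided by response time, per Little's law), a commonly cited sizing heuristic of cores × (1 + wait time / CPU time), and the bandwidth-delay product as bandwidth × RTT. The input figures are illustrative, not measurements.

```python
def max_qps(threads: int, response_time_ms: int) -> float:
    """Upper bound on QPS when each thread handles one request at a time."""
    return threads * 1000 / response_time_ms


def suggested_threads(cores: int, cpu_ms: int, wait_ms: int) -> int:
    """Heuristic thread count: cores * (1 + wait time / CPU time)."""
    return cores * (cpu_ms + wait_ms) // cpu_ms


def bdp_bytes(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes: bandwidth * RTT."""
    return bandwidth_bits_per_s * rtt_ms // 1000 // 8


# A request that spends 10 ms on CPU and 40 ms waiting (RT = 50 ms).
threads = suggested_threads(cores=8, cpu_ms=10, wait_ms=40)
print(threads)                                 # 40
print(max_qps(threads, response_time_ms=50))   # 800.0

# A 1 Gbit/s link with a 50 ms round trip.
print(bdp_bytes(1_000_000_000, rtt_ms=50))     # 6250000
```

In this example, cutting the wait time lowers the RT but leaves the CPU-bound ceiling of 800 QPS unchanged, while cutting the CPU time raises the ceiling. That is one way to reason about which of the two to reduce.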
Efficiency, stability, and performance are the three main factors that affect how physical architecture is implemented.
At the beginning of the article, I gave an official definition of the word "architecture". Then, I classified architectures (according to my own knowledge) into four categories: product functional architecture, business architecture, application logical architecture, and application physical architecture.
Different architectures explain different issues:
The product functional architecture emphasizes the capabilities of functional modules and targets the end-users of a product.
The business architecture is based on the analysis and understanding of a business, helping us better build a product. It targets product designers and technology designers.
The application logical architecture emphasizes the responsibilities of logical modules during R&D and targets developers and architects.
When the target audience and purposes are determined, it is very important to correctly analyze what kind of architecture should be used to illustrate our intention.
Systems of all sizes, including the management information system (MIS) and the whole Alibaba system, can be explained by architecture. Mid-ends, backends, and frameworks in the architecture are outputs of the architecture method. Systems of different sizes can be abstracted similarly.
After the architecture is generated and the business iterates, an unmanaged architecture gradually loses clarity in the responsibilities, dependencies, layers, and constraints of its modules. Stability, performance, and cost are affected as well. The longer these problems persist, the harder they are to solve. Sometimes, we have to start over.
To avoid redesigning the architecture, we need to correct the architecture constantly from the bottom up and redesign some modules. For correction, the importance of induction and deduction should be emphasized. Here, induction and deduction mean summarizing core items to be abstracted.
The key to bottom-up derivation lies in deduction and induction. The deductive method should be used for bottom modules and the inductive method should be used for top modules.
When should we apply these two methods? When the goal (such as the business goal) or conclusion is coarse-grained, we need to decompose it by using the top-down analysis method. A similar top-down approach is often used when planning the future to produce macroscopic conclusions.
If the product solution is clear, the programmers need to understand the business requirements and derive architecture based on the product solution. In this case, the programmers generally use the bottom-up analysis method, which is also applicable to domain modeling.
When the bottom-up analysis method is used, the method can be summarized with two keywords:
Deduction means logical deduction, which is more applicable to bottom modules. The process of deriving service models from use cases is deduction. The process of deriving system models from business models is also deduction. The process of deriving stability measures to be implemented from existing problems is also deduction.
Induction here is to classify things by a certain dimension. It is more applicable to top modules. Induction can be used to divide problem space modules. Induction may also be used to design logical architectures.
For example, we can conclude countermeasures before, during, and after an event based on existing stability problems, which is a time-based induction process.
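Time-based induction of this kind amounts to grouping items by one chosen dimension. The sketch below classifies stability countermeasures by phase; the measures themselves are invented examples, not a recommended checklist.

```python
from typing import Dict, List, Tuple

# (countermeasure, phase) pairs collected from past stability problems.
MEASURES: List[Tuple[str, str]] = [
    ("capacity stress test", "before"),
    ("rate limiting", "during"),
    ("alerting on error rate", "during"),
    ("postmortem review", "after"),
    ("release checklist", "before"),
]


def induce_by_phase(measures: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Classify countermeasures along the time dimension."""
    groups: Dict[str, List[str]] = {"before": [], "during": [], "after": []}
    for name, phase in measures:
        groups[phase].append(name)
    return groups


grouped = induce_by_phase(MEASURES)
for phase, names in grouped.items():
    print(phase, names)
```

Choosing a different dimension, such as the module affected, would produce a different and equally valid classification. Induction depends on the dimension picked.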
The following describes deduction in detail:
In general, the more derivation layers and logical branches there are, the more thoroughly the designers understand the problems, and the more competent the designers are.
To make a metaphor, as for trees of the same variety, the length and coverage of the roots of small trees are different from those of large trees. Similarly, different people have different logic capabilities.
Deduction and induction approaches are often used in our daily work without our awareness. This article may help us understand how we have used deduction and induction approaches in our previous work and how to improve the methods we used before.
In addition to general bottom-up derivation methods, we must also understand the skills and principles in the computer field to get the appropriate results:
In this figure, there is a strict logical path. The input of each step is the output of the previous step. More importantly, the steps in the figure are arranged in order. We must design the architecture from top to bottom and take the business into consideration.
Above, I talked a lot about problem-space domain models because problem-space domain modeling is actually performed at the analysis stage. If we do not analyze the requirements correctly, we are very unlikely to design the architecture correctly.
If we obtain the correct analysis results at the analysis stage, we can obtain the reasonable and correct application logical architecture at the design stage based on the reasonable and correct methodology.
Domain modeling is an important method of abstraction and architecture design, but it is not the only method. Induction, deduction, and top-down decomposition are also important methods of abstraction and architecture design.
This methodology has the following key points:
1. Architecture problems are common in our work. We should identify and define the problems in the architecture.
2. Outputs of business conceptual models are deduced by specific methods.
3. Outputs of the business conceptual architecture are induced by specific methods.
4. Outputs of system models and data models are deduced by specific methods.
5. Outputs of the application logical architecture are induced and deduced from the preceding outputs. The outputs include:
6. The induction and deduction methods used to derive the application logical architecture require a lot of specific knowledge.
The most important thing is that this process is constantly iterated. Architectures should always be iterated. Some architectures are constantly restructured and adjusted during the iteration and are effective for a long time. Some architectures lack such a mechanism and eventually disappear.
Any suggestions are welcome. Thank you.
Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.