Enhancing Delivery Efficiency Based on Product Metadata - Alibaba DevOps Practice Part 18

This article is from the Alibaba DevOps Practice Guide, written by the Alibaba Cloud Yunxiao Team.

We have functional and performance requirements for deliverables to ensure software delivery quality. Whether these requirements are met is reflected in the data generated during the delivery process, including code review data, security scanning data, and regression test results. All of this data travels with the deliverable (or product), and we call it the metadata of the product.

What Is Metadata?

Metadata is data that does not change once a system has generated it. Because metadata is tamper-proof and traceable, it serves as the essential base data for the release process.

Why Is Metadata Important?

For example, from the perspective of the architecture, Alibaba’s middle platform applications rely on internal packages from many business teams, but the quality of these packages is hard to control. How can we solve this problem?

One approach is to present the data more completely, both for a single application and across its dependency tree. Let’s take code review as an example. Products of middle platform applications contain many internal packages developed by business teams. However, when reviewing a middle platform application, you can only see that the version numbers of internal packages in the pom file have changed. You cannot see any code changes. Reviewers need to see the code changes behind these version numbers and the information related to them, such as the associated requirements, code inspection results, and unit test results.

Most runtime dependencies are determined when an application is built, and many runtime problems are caused by dependencies. The best approach is to put the dependency tree to use at build time, so that risks are exposed and handled before they reach runtime.

What Are the Main Types of Metadata?

In addition to the dependency tree obtained during the build, there is other data, such as the quality data generated during testing and the security data generated during security scanning. All of this data is stored in the metadata center. The control policy center then uses the data to place automated checkpoints in the delivery process. Adoption usually goes through three stages: visibility, controllability, and reliability.

Visibility means showing the data in the metadata center to users, including the risks and bottlenecks in the delivery process.
Controllability means users set rules, based on the problems they observe, to implement automatic detection and access control.
Reliability combines the capabilities of compliance scanning service and product warehouse to integrate data and rules for secure delivery.

Delivery efficiency improves as a team moves from visibility to controllability and then to reliability. Meanwhile, metadata and rules continue to evolve and accumulate into a knowledge base that becomes one of the most important assets of the company.

Reliable Release Based on Metadata

The preceding figure shows the architecture of reliable release. Reliable release is metadata-centric and product-based. It continuously produces and consumes metadata and works with various automatic access control rules to ensure a continuous, reliable, and safe release.

Metadata is classified into two types: metadata of the package itself and data that has blood relationships (lineage) with the package.

Package Metadata

The metadata of a package includes basic package information, such as building information, quality data, and security scanning data. It also includes third-party information written using the metadata protocol.
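As a rough illustration of what such a record might look like, the following Python sketch models package metadata as an immutable object, in line with the idea that metadata does not change once generated. The class name, field names, and the sample coordinate are all assumptions made for this example; the article does not specify the actual metadata protocol.

```python
from dataclasses import dataclass, replace, FrozenInstanceError
from typing import Tuple


@dataclass(frozen=True)
class PackageMetadata:
    """Hypothetical metadata record for one package version (all fields are assumed)."""
    coordinate: str                                   # e.g. "com.example:demo-lib:1.0.0" (made up)
    build_id: str                                     # basic building information
    unit_test_coverage: float                         # quality data
    security_findings: Tuple[str, ...] = ()           # security scanning data
    third_party: Tuple[Tuple[str, str], ...] = ()     # extra key/value entries from other systems

    def with_third_party(self, key: str, value: str) -> "PackageMetadata":
        """Return a new record with one more third-party entry; the original stays unchanged."""
        return replace(self, third_party=self.third_party + ((key, value),))


record = PackageMetadata("com.example:demo-lib:1.0.0", "build-1", 0.82)
extended = record.with_third_party("license-scan", "passed")
```

Because the dataclass is frozen, assigning to a field of an existing record raises FrozenInstanceError, so new information produces a new record instead of rewriting an old one.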

Metadata Details

Scoring System

Like a book or a movie, a product can be scored. Let’s take the JAR packages of an internal library as an example. The score helps users identify and reference the JAR package with the highest quality.

The score includes a system score and a user score. Each score applies to one specific version of a product.

The full system score is about ten points, and it is mainly determined by the following rules:

Whether the package complies with the code specification: one point is deducted if it does not.
Whether the package meets the core metric requirements: one point is deducted if it does not.
Whether the package is released through the official platform: if not, points are deducted directly.
Whether there are security vulnerabilities: if so, points are deducted.
Dynamic indicators during runtime, such as startup time and memory consumption during startup.
Business test case indicators, such as unit test coverage and success rate.
The number of times the package is referenced: if more applications reference the package within a given time period, points are added.
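The rules above can be sketched as a small scoring function. This is only an illustration: the two one-point deductions come from the list, while the other weights, the vulnerability cap, the reference threshold, and all field names are assumptions. The runtime and test case indicators are left out because the article does not say how they are weighted.

```python
from dataclasses import dataclass


@dataclass
class PackageFacts:
    """Hypothetical inputs for one package version; every field name is an assumption."""
    complies_with_code_spec: bool
    meets_core_metrics: bool
    released_via_official_platform: bool
    vulnerability_count: int
    recent_reference_count: int


def system_score(facts: PackageFacts) -> float:
    """Start from ten points, apply deductions and bonuses, and clamp to 0..10."""
    score = 10.0
    if not facts.complies_with_code_spec:
        score -= 1.0                                   # from the article: one point
    if not facts.meets_core_metrics:
        score -= 1.0                                   # from the article: one point
    if not facts.released_via_official_platform:
        score -= 2.0                                   # assumed weight
    score -= min(facts.vulnerability_count, 3) * 1.0   # assumed weight and cap
    if facts.recent_reference_count >= 10:             # assumed threshold
        score += 0.5                                   # assumed bonus
    return max(0.0, min(10.0, score))
```

Clamping keeps the result within the ten-point scale even when bonuses are added to a package that has no deductions.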

The user score is the developers' evaluation of the product. Some products have a high system score, but their interface design is poor or they bring in many dependencies. Developers can give such products a lower score to encourage continuous improvement.

Metadata with Blood Relationships

A large product is composed of many smaller products. Products are related to each other in several ways, such as dependency and composition. The blood relationships (lineage) of a product refer to the dependencies of the product and how many there are. For each dependency, we recommend using a version with a high score.

Recommend Version Upgrades Automatically
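A minimal sketch of such a recommendation, assuming the scores for each version of a dependency are already available as a dictionary. The function name and the sample version numbers are hypothetical.

```python
from typing import Dict, Optional


def recommend_version(scores: Dict[str, float], current: str) -> Optional[str]:
    """Return the highest-scoring version if it beats the current one, otherwise None."""
    if not scores:
        return None
    best = max(scores, key=scores.get)
    # Only recommend an upgrade when the best version scores strictly higher.
    if best != current and scores[best] > scores.get(current, float("-inf")):
        return best
    return None


# Made-up scores for three versions of one dependency.
suggestion = recommend_version({"1.0.0": 7.5, "1.1.0": 9.0, "1.2.0": 8.0}, current="1.0.0")
```

Returning None when the current version is already the best keeps the recommendation quiet unless there is a clear improvement to offer.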

How Can Metadata Be Used Effectively?

In addition to serving as entry and exit criteria at each node in the delivery process, metadata can be used for product governance. Product governance relies on two capabilities of metadata: insights and trends.

Insights tell the team leader how to govern and help the team leader find which products or sub-teams need to be governed first.
Trends show how product quality changes after governance, so the team leader can see how much effort was applied and what it achieved.

The following sections use a JAR package (an internal package) as an example.

Insights

Indicators of an internal package include dependency depth, the total number of dependencies, and the total number of versions. Team leaders can find their own teams in the insight report, identify the internal packages that need to be optimized, and carry out targeted optimization.

How Can We Select Key Objects to be Handled?

Key objects to be handled are generally internal packages with a large dependency depth and a large number of dependencies and versions. As shown in the following figure, "com.taod:feent" is a typical object that needs to be handled first.

Find Key Products
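The selection described above can be approximated with a simple filter over the three indicators. The thresholds, the class name, and the sample package names below are assumptions for illustration; real thresholds would come from the insight report.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class PackageStats:
    """Hypothetical insight-report row for one internal package."""
    name: str
    dependency_depth: int
    total_dependencies: int
    total_versions: int


def key_objects(stats: Iterable[PackageStats],
                min_depth: int = 5,
                min_deps: int = 100,
                min_versions: int = 20) -> List[PackageStats]:
    """Keep packages that exceed all three (assumed) thresholds, worst offenders first."""
    flagged = [
        s for s in stats
        if s.dependency_depth >= min_depth
        and s.total_dependencies >= min_deps
        and s.total_versions >= min_versions
    ]
    return sorted(
        flagged,
        key=lambda s: (s.dependency_depth, s.total_dependencies, s.total_versions),
        reverse=True,
    )
```

Sorting by depth first reflects one possible priority; a team could just as well rank by the number of dependencies or versions.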

How Can We Find the Culprit?

Because dependencies can be deep, the total number of dependencies of an internal package is defined as the total number of times the package is referenced by applications in the last released GA version. Therefore, we need to select an internal package version that meets the requirements, as shown in the following figure:

Determine the Culprit

Next, check the details of the dependency tree and analyze it again. Our internal package may itself be referenced by another internal package that is a key object to be handled. The dependency tree details show whether such a package exists.

If there is such a package, handle it in the following ways:

Contact the person responsible for this package and optimize the package
Remove unused indirect dependencies of the package first
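The dependency tree check described above can be sketched as a graph search. The example assumes the dependency tree is available as a dictionary that maps each package to its direct dependencies; the function names and package names are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def depends_on(graph: Dict[str, List[str]], start: str, target: str) -> bool:
    """Iterative depth-first search: does `start` reach `target` through dependency edges?"""
    stack = list(graph.get(start, []))
    seen: Set[str] = set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue          # already explored, also protects against cycles
        seen.add(node)
        stack.extend(graph.get(node, []))
    return False


def key_packages_referencing(graph: Dict[str, List[str]],
                             key_packages: Iterable[str],
                             target: str) -> List[str]:
    """List the key packages that depend on `target`, directly or indirectly."""
    return sorted(k for k in key_packages if k != target and depends_on(graph, k, target))


# Made-up dependency tree: app -> key-lib -> mid -> our-lib
graph = {"app": ["key-lib", "util"], "key-lib": ["mid"], "mid": ["our-lib"], "util": []}
culprits = key_packages_referencing(graph, {"key-lib", "util"}, "our-lib")
```

Any package returned here is a candidate for the two actions listed above: contacting its owner and removing unused indirect dependencies.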

Trends

The trend report mainly answers the following two questions:

If the products are not handled, how will their quality problems (such as risks) develop?
If the products are handled, what is the result? If the result is not good, is the method wrong, or was the effort not strong enough?

Trend Analysis

The Application of Metadata in the Continuous Delivery Process

We scan the product being released and its dependencies against rules derived from past faults and from requirements such as security and testing quality. Based on the metadata analysis of the product, release risks are classified, and users are told how to fix them.

Risk Notification in the Process

Risk Details

Risk Level

Risks are classified into high, medium, and low risks based on their severity levels. Among them, high and medium risks require special attention. The risk levels are defined below:

High

Local snapshot updates.
The release version number decreases.
The new release package depends on a snapshot.
A JAR package that contains .so files built for x86 servers is used on non-x86 servers.

Medium

Dependencies from the same GA but with different version numbers are added.
Snapshots are added.
Too many lines of code are deleted in the current version compared with the last version.
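A few of these rules can be sketched as automated checks. The example below covers three of them: a decreasing release version, a release that depends on a snapshot, and the same GA appearing with different versions. It assumes that "GA" means groupId:artifactId, that dependencies are written as "group:artifact:version" strings, that versions are dotted integers, and that snapshots carry a "-SNAPSHOT" suffix. None of these details are stated in the article, and the function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a dotted numeric version such as "1.4.2"; a "-SNAPSHOT" suffix is ignored."""
    core = version.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def classify_risks(previous_version: str,
                   new_version: str,
                   dependencies: List[str]) -> Dict[str, List[str]]:
    """Return detected risks grouped by level ("high" and "medium")."""
    risks: Dict[str, List[str]] = {"high": [], "medium": []}

    # High: the release version number decreases.
    if parse_version(new_version) < parse_version(previous_version):
        risks["high"].append("release version number decreased")

    is_release = not new_version.endswith("-SNAPSHOT")
    versions_by_ga: Dict[str, Set[str]] = {}
    for dep in dependencies:
        group, artifact, version = dep.split(":")   # assumes exactly three parts
        # High: a release package depends on a snapshot.
        if is_release and version.endswith("-SNAPSHOT"):
            risks["high"].append(f"release depends on snapshot {dep}")
        versions_by_ga.setdefault(f"{group}:{artifact}", set()).add(version)

    # Medium: the same GA is present with different version numbers.
    for ga, versions in sorted(versions_by_ga.items()):
        if len(versions) > 1:
            risks["medium"].append(f"{ga} has multiple versions: {sorted(versions)}")
    return risks
```

The remaining rules, such as local snapshot updates or the number of deleted lines, need data from the build and the code repository and are not modeled here.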

Risk Detection Process

A final checkpoint is placed immediately before the official release. Because the cost of repairing a risk grows the later it is found, risk detection should also happen as early as possible in the process.

The Last Check

Summary

The core difference between code-based delivery and product-based delivery is that products are complete and cannot be changed. Thus, a continuous delivery system built based on products and their metadata can achieve reliable release, significantly improving release efficiency and reducing release risks.
