
Orchestration-Oriented O&M - Alibaba DevOps Practice Part 23

Part 23 of this 27-part series discusses the benefits of orchestration-oriented O&M tools.

This article is from Alibaba DevOps Practice Guide written by Alibaba Cloud Yunxiao Team

Over more than six years of development, Alibaba's application O&M platform has come to support online deployment, scaling, resource management, and O&M change operations for most of Alibaba's applications, and a rich and stable set of O&M atomic services has gradually been built up. We proposed an orchestration-oriented O&M solution to maximize the value of these atomic services and build the middle platform capabilities of the application O&M platform.

Orchestration-oriented O&M refers to a simple orchestration service in which users (such as PaaS services and people in development, O&M, and operation roles) flexibly assemble multiple atomic components into different business processes, each handling a complete O&M requirement. It helps users standardize, manage, and execute automated O&M operations by defining the required operations as templates and running those templates. This improves overall O&M efficiency, enhances the security of O&M operations, and avoids manual O&M errors.

Major Pain Points

In the application O&M field, most practices rely on workflows and work order management to implement O&M change operations. However, traditional O&M workflows have shortcomings in maintenance cost and scalability, and they lack effective methods for managing the process lifecycle.

These problems can be divided into the following three categories:

  • As businesses continue to develop and scenarios are enriched, O&M businesses become more complex. Non-general, customized requirements often emerge, such as adding a third-party data synchronization step to the scale-out process or performing different O&M procedures in different environments for the same change type. These requirements increase the platform implementation and maintenance costs.
  • The underlying process engines provide limited support in the O&M field, and the capabilities, such as component orchestration and process control, are not easy to scale. It is difficult to ensure performance, stability, and security in large-scale scenarios.
  • Traditional O&M platforms do not have unified and standardized integration capabilities. This makes it difficult for them to empower other PaaS O&M products because they lack middle platform capabilities and value penetration. In addition, developers and O&M engineers lack the means to design and manage custom O&M operations.

Core Concept

The core concepts of orchestration-oriented O&M are service components and O&M orchestration. We register O&M atomic services as components based on platform specifications and host them in a unified component pool for maintenance and management. Users can select components from the component pool as needed, assemble them into O&M workflows, and execute those workflows to complete the expected O&M change tasks. Orchestration-oriented O&M aims to create an efficient, stable, and secure O&M platform.
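The idea of registering atomic services in a pool and assembling them into a workflow can be illustrated with a minimal Python sketch. All names here (ComponentPool, Workflow, apply_resources, deploy_app, sync_third_party) are hypothetical and chosen for illustration; they are not the platform's actual API. The example assembles a scale-out workflow that includes a third-party data synchronization step.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Hypothetical types for illustration only; not the platform's real API.
Context = Dict[str, Any]
Component = Callable[[Context], Context]


class ComponentPool:
    """Unified pool that holds registered O&M atomic services."""

    def __init__(self) -> None:
        self._components: Dict[str, Component] = {}

    def register(self, name: str, component: Component) -> None:
        if name in self._components:
            raise ValueError(f"component already registered: {name}")
        self._components[name] = component

    def get(self, name: str) -> Component:
        return self._components[name]


@dataclass
class Workflow:
    """An ordered assembly of component names selected from the pool."""

    name: str
    steps: List[str] = field(default_factory=list)

    def run(self, pool: ComponentPool, context: Context) -> Context:
        # Each component receives the context produced by the previous one.
        for step in self.steps:
            context = pool.get(step)(context)
        return context


# Stub atomic services (assumed behavior, no real infrastructure involved).
def apply_resources(ctx: Context) -> Context:
    return {**ctx, "hosts": [f"host-{i}" for i in range(ctx["count"])]}


def deploy_app(ctx: Context) -> Context:
    return {**ctx, "deployed": list(ctx["hosts"])}


def sync_third_party(ctx: Context) -> Context:
    return {**ctx, "synced": True}


pool = ComponentPool()
pool.register("apply_resources", apply_resources)
pool.register("deploy_app", deploy_app)
pool.register("sync_third_party", sync_third_party)

scale_out = Workflow("scale_out", ["apply_resources", "deploy_app", "sync_third_party"])
result = scale_out.run(pool, {"app": "demo", "count": 2})
```

In this sketch, a customized requirement such as the extra synchronization step only changes the list of step names, not the components themselves.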

Technical Ideas

Business Architecture


The architecture has five layers from bottom to top. The first layer is the process engine and the container engine, which are the executors of atomic services. The second layer is the definer, which defines different types of O&M atomic services. The third layer is the registrant, which registers atomic services as components. The fourth layer is the process orchestrator, which provides the core orchestration capabilities. The fifth layer mainly provides scenario-based orchestration capabilities, providing additional feature support for different scenarios.



An integrated service can register a REST API with API Gateway to expose the service externally through a unified gateway. The gateway must implement standard authentication and authorization policies as well as API lifecycle management, circuit breaking, and throttling. In addition, APIs registered with the gateway must also be registered in the component pool of the job platform. If the integrated service introduces the process engine, the corresponding atomic components must be registered directly in the remote component pool. The job platform then converges and centrally manages all atomic components. On this basis, the business side can select components from the component pool, assemble them as needed, and set the process input through the custom form function, triggering the process at the same time. When a process is executed, its execution engine subsystem performs remote scheduling and drives the final service provider to run the relevant functional components.
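The flow from form input to remote execution described above might be sketched as follows. This is an assumption-laden illustration: ComponentRef, validate_form, trigger, the two invoker stubs, and the example endpoint are all hypothetical, and the stubs stand in for real gateway or remote-engine calls (no network traffic occurs).

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class ComponentRef:
    """Hypothetical descriptor of a component held by the job platform."""

    name: str
    source: str    # "gateway" (REST API behind the gateway) or "remote" (remote component pool)
    endpoint: str  # illustrative address of the service provider


Invoker = Callable[[ComponentRef, Dict[str, Any]], Dict[str, Any]]


# Stub invokers: a real system would issue an HTTP or RPC call here.
def call_via_gateway(ref: ComponentRef, inputs: Dict[str, Any]) -> Dict[str, Any]:
    return {"component": ref.name, "via": "gateway", "inputs": inputs}


def call_remote_engine(ref: ComponentRef, inputs: Dict[str, Any]) -> Dict[str, Any]:
    return {"component": ref.name, "via": "remote", "inputs": inputs}


INVOKERS: Dict[str, Invoker] = {
    "gateway": call_via_gateway,
    "remote": call_remote_engine,
}


def validate_form(schema: Dict[str, type], form: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the fields declared in the schema and check their types."""
    for field_name, expected in schema.items():
        if field_name not in form:
            raise ValueError(f"missing form field: {field_name}")
        if not isinstance(form[field_name], expected):
            raise TypeError(f"{field_name} must be {expected.__name__}")
    return {key: form[key] for key in schema}


def trigger(ref: ComponentRef, schema: Dict[str, type], form: Dict[str, Any]) -> Dict[str, Any]:
    """Submitting the form validates the input and dispatches to the provider."""
    inputs = validate_form(schema, form)
    return INVOKERS[ref.source](ref, inputs)


ref = ComponentRef("restart_app", "gateway", "https://gateway.example/restart")
outcome = trigger(
    ref,
    {"app": str, "batch_size": int},
    {"app": "demo", "batch_size": 5, "note": "not in schema"},
)
```

Dispatching on the component's source keeps the orchestration side unaware of whether a component is reached through the gateway or through the remote component pool.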

Core Functional Components

  • Orchestration Engine: Uses the process engine, form engine, rule engine, and script engine to drive the construction and execution of O&M processes.
  • Middle Platform Gateway: Standardize access standards for components. Integrate and provide a wide range of O&M atomic components through a unified gateway for third-party or platform orchestration.
  • Security Assurance: The orchestration-generated business processes integrate with the approval process, security risk control, unattended operations, and various inspection capabilities by default. A comprehensive security mechanism for O&M changes is provided.
  • Support Service: Business support services are provided, such as business owner data, message center, notification center, task center, and permission center.



Key Capabilities

The key capabilities are listed below:

  • Quick Function Extension and Construction: Orchestration-oriented O&M provides a wide range of basic O&M components and public templates in common O&M scenarios. Users can create a template by copying and modifying the public template to meet specific O&M requirements, reduce the complexity of template writing, and improve the overall O&M efficiency.
  • Quick Integration with Third-Party O&M Capabilities: API Gateway allows users to package third-party O&M capabilities as components for O&M orchestration.
  • Ability to be Integrated by Third-Party Platforms: Third-party platforms can manage templates and processes using the APIs provided by the O&M orchestration center. They can monitor execution processes by subscribing to process events.
  • O&M Scripts and Files Management: The O&M platform manages users' O&M scripts or files in a centralized manner, including enabling/disabling, version management, and authorization management of scripts or files.
  • Visualized Execution Process and Execution Results: The platform visualizes the execution process so that users can see the complete execution process and its results:


  • Visually view the execution details of each task
  • Clearly see the execution process, the order of steps, and where the flow jumps when errors occur
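As a rough illustration of how a third-party platform could monitor a process by subscribing to process events, consider the following Python sketch. The ProcessEventBus class and the event type "step_finished" are hypothetical; the source does not specify the event model, so this is a plain in-process publish/subscribe stand-in.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Tuple

Event = Dict[str, str]
Handler = Callable[[Event], None]


class ProcessEventBus:
    """Hypothetical in-process stand-in for a process event subscription API."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, event: Event) -> None:
        # Deliver the event to every handler subscribed to this type.
        for handler in self._handlers.get(event_type, []):
            handler(event)


bus = ProcessEventBus()
timeline: List[Tuple[str, str]] = []

# The subscriber records each finished step so it can render progress.
bus.subscribe("step_finished", lambda e: timeline.append((e["step"], e["status"])))

for step, status in [("apply_resources", "ok"), ("deploy_app", "ok")]:
    bus.publish("step_finished", {"process_id": "p-1", "step": step, "status": status})
```

The recorded timeline preserves the order in which steps finished, which is the information a progress view needs.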



  • Extension of Existing O&M Change Operations: Adjust the existing O&M change operations of the O&M platform to meet the specific requirements of the business department.
  • Defining New Change Types: Some change types are not provided by the O&M platform, such as O&M in IoT scenarios. The business side can register atomic components in the platform as needed and create a new type of O&M process.
  • Batch Host O&M: Select a batch of hosts and run a series of O&M scripts or commands on them in a defined order.
  • Scheduled Inspection Task: The timing component can be used together with the custom process to perform data inspection and generate result reports of online resources or services in different dimensions.
  • O&M Orchestrator: Users can use the platform to orchestrate APIs as custom HTTP components and develop the O&M functions. This reduces the development workload.
  • Host O&M: Users can clean logs and manage components of the host through the host O&M component.
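The batch host scenario above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: run_batch, fake_runner, the host names, and the command names are hypothetical, and the runner is injected so that no real remote execution takes place. Stopping at the first failing command on a host is a design choice made for this example, not a behavior stated in the source.

```python
from typing import Callable, Dict, List, Tuple

# A runner takes (host, command) and returns an exit code; 0 means success.
Runner = Callable[[str, str], int]


def run_batch(
    hosts: List[str], commands: List[str], runner: Runner
) -> Dict[str, List[Tuple[str, int]]]:
    """Run the commands in order on each host, stopping a host at its first failure."""
    report: Dict[str, List[Tuple[str, int]]] = {}
    for host in hosts:
        results: List[Tuple[str, int]] = []
        for command in commands:
            code = runner(host, command)
            results.append((command, code))
            if code != 0:
                break  # skip the remaining commands on this host
        report[host] = results
    return report


def fake_runner(host: str, command: str) -> int:
    # Simulated failure: cleaning logs fails on host-b only.
    return 1 if (host == "host-b" and command == "clean_logs") else 0


report = run_batch(
    ["host-a", "host-b"],
    ["check_disk", "clean_logs", "restart_agent"],
    fake_runner,
)
```

Here host-a runs all three commands, while host-b stops after clean_logs fails, so restart_agent is never attempted on it.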

Let's take application scale-out as an example:

1.  Visually orchestrate the application scale-out template:


2.  Submit the form and execute the application scale-out template:


3.  Query the execution progress and results:



Orchestration-oriented O&M provides efficient, flexible, and stable O&M services for enterprises. Based on the business management requirements of enterprises, visualized user orchestration interfaces, control elements, and mature and stable module components can be used. The orchestration-oriented O&M tools can help teams quickly build light-asset, high-performance, and personalized IT O&M tools, assisting the transformation of traditional O&M and accelerating enterprise digitalization.
