- Manual O&M: mechanical, repetitive work with low efficiency and frequent errors.
- Tool-assisted O&M: tools cannot operate in batches, batch status is not managed, and errors are hard to locate.
- Rapid business growth: little accumulated O&M experience and no O&M specifications.
- Multi-person O&M: no permission control or centralized logs, so operations cannot be audited.
- Excessive O&M permissions increase the risk of errors.
- Phase 1: Manual O&M. Everything relies on manual work. Sometimes you type a few commands at the command line, and sometimes you click through several different web pages. Newcomers are basically guided by their predecessors and then start working on their own. At first they operate nervously for fear of "damaging" the system. Once they become skilled, they tend to grow careless, and that is when mistakes and faults are most likely.
- Phase 2: Scripts and tools. O&M scripts and small tools gradually accumulate and help complete individual O&M tasks. Compared with phase 1, a single task is completed much more efficiently. However, when processing a batch of tasks, these tools still cannot coordinate the tasks or track the status of each one. Scripts are often not written rigorously, so errors occur. When a script or tool can no longer be used, you have to fall back to manual O&M.
- Phase 3: Complete automation system. Compared with phase 2, tools and scripts are replaced by system-level automation software. The automated system provides complete functions such as batch operation, status monitoring, logging, and identity authentication, and it can carry out the automation functions designed in advance. As more and more tasks are required, the system needs to be continuously upgraded and maintained.
To serve Alibaba Cloud users, Alibaba has built an automated O&M platform on the cloud: OOS.
Alibaba Cloud Operation Orchestration Service (OOS) is a comprehensive, free, cloud-based automated O&M platform that provides management and execution of O&M tasks. Typical scenarios include event-driven O&M, batch O&M, scheduled O&M tasks, and cross-region O&M. OOS provides approval and notification for important O&M scenarios. It lets you standardize O&M tasks and put the concept of Operations as Code into practice. OOS also works across products: you can use it to manage cloud products such as ECS, RDS, SLB, and VPC.
Visual execution process and results
OOS visualizes the execution, so you can see the complete process and its results, including:
- The execution details, parameters, and output of each task.
- The execution flow, the order of tasks, and any jumps taken when an error occurs.
- The complete execution process from beginning to end.
Free, fully managed automation
OOS provides fully managed, serverless automated execution. Execution does not consume your own computing resources (such as ECS instances), which meets the automated O&M needs of startups, small and medium-sized enterprises, and large enterprise customers. In fully automated mode, nobody has to watch over the execution, so you can focus on fast business growth.
Efficient batch management
In traditional scenarios, running a batch of tasks is more complex than running a single task. OOS lets you track progress in real time, view health statistics, and quickly locate errors, which improves overall O&M efficiency.
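The idea of tracking per-target status in a batch can be sketched in a few lines of Python. This is only an illustration of the concept, not how OOS is implemented: `run_batch`, `BatchResult`, the `max_errors` parameter, the status strings, and the instance IDs are all hypothetical names chosen for this sketch.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class BatchResult:
    statuses: dict = field(default_factory=dict)  # target -> status string
    errors: dict = field(default_factory=dict)    # target -> error message

    def summary(self):
        """Count how many targets ended in each status."""
        return dict(Counter(self.statuses.values()))


def run_batch(targets, task, max_errors=0):
    """Run task on every target, recording a status for each one.

    Once more than max_errors targets have failed, the remaining
    targets are marked as skipped instead of being executed.
    """
    result = BatchResult()
    for target in targets:
        if len(result.errors) > max_errors:
            result.statuses[target] = "Skipped"
            continue
        try:
            task(target)
            result.statuses[target] = "Success"
        except Exception as exc:
            result.statuses[target] = "Failed"
            result.errors[target] = str(exc)
    return result


def restart(instance_id):
    """Hypothetical task: fails for IDs ending in 'bad'."""
    if instance_id.endswith("bad"):
        raise RuntimeError(f"{instance_id} is unreachable")


batch = run_batch(["i-001", "i-002-bad", "i-003"], restart, max_errors=1)
print(batch.summary())
print(batch.errors)
```

Keeping the status and the error message per target is what makes it possible to locate a failure without rerunning the whole batch.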
Complete authentication and auditing
You can keep using the familiar Resource Access Management (RAM) system for resource access and user permission management to manage OOS. All operations performed on other cloud products through OOS support authentication and auditing, so you do not need to worry about the security of your operations.
OOS provides easy-to-use template building capabilities, so you do not need to build templates from scratch. It integrates popular cloud products through cloud product actions, which help you build templates quickly, make templates easier to write, and improve overall O&M efficiency. OOS also provides ready-made templates for common O&M scenarios. You can copy and modify a similar template to meet your own O&M needs.
Cross-region and multi-region O&M capabilities
Business needs may require you to use resources in multiple regions, but common O&M operations can only be performed in one region at a time. OOS provides cross-region and multi-region O&M capabilities. For multi-region O&M, you can use the OOS task loop to quickly cover multiple regions. For cross-region O&M, you only need to specify the RegionId of the corresponding action.
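The difference between the two modes can be illustrated with a small Python sketch. It is a conceptual model only: `run_action`, `cross_region`, `multi_region`, and the action names are hypothetical and do not correspond to OOS APIs. The region IDs are used purely as sample values.

```python
def run_action(action, region_id, log):
    """Hypothetical stand-in for executing one action in one region."""
    log.append((region_id, action))


def cross_region(steps):
    """Cross-region: each step names the region it runs in."""
    log = []
    for action, region_id in steps:
        run_action(action, region_id, log)
    return log


def multi_region(action, region_ids):
    """Multi-region: the same action is looped over several regions."""
    log = []
    for region_id in region_ids:
        run_action(action, region_id, log)
    return log


# One workflow whose steps run in different regions.
copy_flow = cross_region([
    ("create-image", "cn-hangzhou"),
    ("copy-image", "cn-beijing"),
])

# One action repeated in every listed region.
logging_flow = multi_region("enable-logging", ["cn-hangzhou", "cn-beijing", "cn-shanghai"])

print(copy_flow)
print(logging_flow)
```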
Standardized O&M tasks (Operations as Code)
Define daily O&M tasks as templates and manage the templates as code. A template goes from creation to review and then to the production account. Subsequent O&M tasks can only be selected from these standard templates, which keeps O&M actions safe. Like coding standards for source code, this is a best practice for O&M (Operations as Code).
O&M permission convergence (authorization)
Permission management for O&M personnel is complex. Permissions that are too broad mean high risk, while permissions that are too narrow make it impossible to complete O&M operations. How can O&M personnel complete their tasks while avoiding unexpected operations? OOS provides an authorization mode for this. An administrator with high permissions writes an OOS template and configures a fixed role for it. The administrator then grants permission to run the template to lower-privileged O&M personnel, who can complete the O&M task without holding high permissions themselves.
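The authorization mode described above can be modeled roughly as follows. This is a simplified sketch, not the actual RAM or OOS permission check: the permission tables, the role name `OOSRestartRole`, the template name, the user name, and both functions are hypothetical, and the permission strings are used only as illustrative labels.

```python
class PermissionDenied(Exception):
    pass


# Hypothetical permission tables for this sketch.
ROLE_PERMISSIONS = {"OOSRestartRole": {"ecs:RebootInstance"}}
USER_PERMISSIONS = {"junior-ops": {"oos:StartExecution"}}

# A template written by an administrator, bound to a fixed role.
TEMPLATES = {
    "restart-instance": {
        "role": "OOSRestartRole",
        "actions": ["ecs:RebootInstance"],
    }
}


def call_directly(user, action):
    """A user calling a cloud API with their own permissions."""
    if action not in USER_PERMISSIONS.get(user, set()):
        raise PermissionDenied(f"{user} may not call {action}")
    return action


def start_execution(user, template_name):
    """Run a template: the user only needs permission to start it,
    while the actions themselves are checked against the template's role."""
    if "oos:StartExecution" not in USER_PERMISSIONS.get(user, set()):
        raise PermissionDenied(f"{user} may not start executions")
    template = TEMPLATES[template_name]
    role_permissions = ROLE_PERMISSIONS[template["role"]]
    performed = []
    for action in template["actions"]:
        if action not in role_permissions:
            raise PermissionDenied(f"role may not call {action}")
        performed.append(action)
    return performed


print(start_execution("junior-ops", "restart-instance"))
```

In this model the low-privilege user can restart an instance only through the reviewed template. A direct call to the same action is rejected.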
Common OOS scenarios include event-driven O&M, batch operations, image updates, O&M that requires approval, scheduled tasks, and cross-region and multi-region O&M. You can also customize templates based on your own scenarios.
In event-driven O&M, an O&M action is triggered when an event occurs. For example, when the CPU usage of an ECS instance reaches 85%, the instance is restarted automatically to prevent service interruption. Event-driven scenarios enable proactive O&M, remove human factors, and improve O&M efficiency.
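The CPU example can be expressed as a small event handler. This is a minimal sketch under assumed names: the event dictionary layout, `handle_metric_event`, and the `restart` callback are hypothetical and do not reflect the event format that OOS or any monitoring product actually uses.

```python
CPU_THRESHOLD = 85.0  # percent, taken from the example in the text


def handle_metric_event(event, restart):
    """Trigger the restart action when CPU usage reaches the threshold.

    Returns True if the action was triggered, False otherwise.
    """
    if event["metric"] == "cpu_utilization" and event["value"] >= CPU_THRESHOLD:
        restart(event["instance_id"])
        return True
    return False


restarted = []
events = [
    {"instance_id": "i-001", "metric": "cpu_utilization", "value": 42.0},
    {"instance_id": "i-002", "metric": "cpu_utilization", "value": 85.0},
    {"instance_id": "i-003", "metric": "memory_utilization", "value": 99.0},
]
triggered = [handle_metric_event(e, restarted.append) for e in events]
print(triggered, restarted)
```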
Batch operations run routine O&M commands against multiple targets (such as ECS instances) to keep the business running smoothly and in a healthy state.
For example, you can check the remaining disk space on multiple ECS instances. First, select the instances to check (several selection methods are available, such as name matching, tag grouping, and resource group grouping). Then run a disk check through a Cloud Assistant command and view the results.
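Once the command output has been collected from each instance, flagging the instances that are short on space is a simple parsing step. The sketch below assumes the command was `df -P` and that the output has already been gathered into a dictionary. The function name, the 80% limit, the instance IDs, and the sample output are all made up for illustration.

```python
def low_disk_instances(outputs, mount="/", max_used_percent=80):
    """Return the IDs of instances whose mount point exceeds the usage limit.

    outputs maps an instance ID to the text printed by `df -P` on it.
    """
    flagged = []
    for instance_id, text in outputs.items():
        # Skip the header line, then look for the requested mount point.
        for line in text.strip().splitlines()[1:]:
            fields = line.split()
            if len(fields) >= 6 and fields[5] == mount:
                used = int(fields[4].rstrip("%"))
                if used > max_used_percent:
                    flagged.append(instance_id)
    return sorted(flagged)


# Made-up sample output for two hypothetical instances.
HEADER = "Filesystem 1024-blocks Used Available Capacity Mounted on\n"
sample_outputs = {
    "i-001": HEADER + "/dev/vda1 41152736 12000000 29152736 30% /\n",
    "i-002": HEADER + "/dev/vda1 41152736 37000000 4152736 90% /\n",
}
print(low_disk_instances(sample_outputs))
```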
To keep the running environment of ECS instances secure, for example by installing the latest patches or updating dependent components, you can use OOS to update images. Starting from a source image, OOS generates a new image that can be used for testing and production.
O&M scenarios that require approval
In many scenarios, approval is required to ensure that operations are safe and meet expectations. By adding an approval action (ACS::Approve) to the template, you can require manual approval before the O&M actions are actually executed. This confirms that the actions are necessary and avoids waste and misoperations.
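How an approval step gates the tasks that follow it can be sketched like this. Only the action name ACS::Approve comes from the text above. The task list layout, the other action name, the status strings, and `run_with_approval` are assumptions made for this illustration and are not the OOS execution engine.

```python
def run_with_approval(tasks, approver):
    """Run tasks in order, pausing at each approval step.

    approver is a callable that receives the approval task name and
    returns True (approved) or False (rejected). If a step is rejected,
    the remaining tasks are not executed.
    """
    executed = []
    for task in tasks:
        if task["action"] == "ACS::Approve":
            if not approver(task["name"]):
                return executed, "Cancelled"
            continue
        executed.append(task["name"])
    return executed, "Success"


tasks = [
    {"name": "approveReboot", "action": "ACS::Approve"},
    {"name": "rebootInstance", "action": "ACS::ECS::RebootInstance"},
]

approved = run_with_approval(tasks, lambda name: True)
rejected = run_with_approval(tasks, lambda name: False)
print(approved)
print(rejected)
```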
Scheduled tasks perform predefined O&M actions at regular intervals. For example, in a test scenario you may need to clear the OSS files that tests generate under an account. You can create a template and run it every morning so that the test environment is always fresh and leftover files do not interfere with the next test results.
Cross-region O&M scenarios
Multi-region deployment has become a common architecture for high availability (HA). As a result, cross-region O&M is becoming more and more complex. OOS can help you solve cross-region O&M problems: you can define O&M operations for different regions in one template to implement cross-region O&M.
Multi-region O&M scenarios
When you use resources in multiple regions, you often need to apply the same changes to all of them to keep them consistent. For example, you can enable logging on Object Storage Service (OSS) buckets in multiple regions by performing the same operation in each region.
- Both OOS and Resource Orchestration Service (ROS) define templates, that is, orchestrations, as YAML or JSON files. When the orchestrated content is resources, the product is ROS. When the orchestrated content is O&M actions, the product is OOS.
- OOS and ROS have similar configuration syntax.
- OOS and ROS are both free of charge. The fully managed services themselves are free, but resources created through OOS and ROS are billed according to each resource's own pricing.
- OOS and ROS are official Alibaba Cloud products. They provide unified identity authentication and security auditing. Compared with similar third-party products, they offer a more integrated and consistent experience.
ROS focuses on planning and deployment scenarios and is oriented to the final state of cloud resources. Before deployment, you define the shape of the infrastructure and declare it through ROS. ROS is then responsible for creating or updating the infrastructure resources (cloud resources) so that they reach the desired final state. For example, suppose a website requires 10 ECS instances, one SLB instance, and one RDS instance. You declare 10 ECS instances, 1 SLB instance, and 1 RDS instance in ROS, and then create a ROS stack to create these resources. After creation completes, there are 10 ECS instances, 1 SLB instance, and 1 RDS instance.
OOS focuses on actions and O&M scenarios and is oriented to the O&M process. It only cares about the result of the current task. Suppose that two ECS instances need to be added to a running website because traffic has increased. You can create a runbook for adding instances (an OOS template). This runbook focuses only on this task: it creates the instances and attaches them to the SLB instance. OOS does not care how many ECS instances are behind the SLB instance. It only cares that the O&M task of adding two instances is completed.
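The contrast between the two approaches can be shown with two small functions. This is a conceptual sketch only: `reconcile` and `add_instances` are hypothetical names and neither reflects how ROS or OOS works internally. The resource counts come from the examples in the text.

```python
def reconcile(desired, current):
    """Final-state style (as described for ROS): compute the change in
    each resource count needed to move from current to desired."""
    kinds = desired.keys() | current.keys()
    return {
        kind: desired.get(kind, 0) - current.get(kind, 0)
        for kind in kinds
        if desired.get(kind, 0) != current.get(kind, 0)
    }


def add_instances(current, count):
    """Task style (as described for OOS): perform one action, adding
    count ECS instances, regardless of how many already exist."""
    updated = dict(current)
    updated["ECS"] = updated.get("ECS", 0) + count
    return updated


desired = {"ECS": 10, "SLB": 1, "RDS": 1}

# Starting from nothing, everything in the declaration must be created.
print(reconcile(desired, {}))

# If the state already matches, declaring it again changes nothing.
print(reconcile(desired, desired))

# The task-style action always adds two instances to whatever exists.
print(add_instances(desired, 2))
```

Applying the declaration twice yields no further change, while running the task twice adds four instances. That is the practical difference between describing a final state and describing an action.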
The positions of OOS and ROS in DevOps
The picture at the beginning of this article shows the DevOps system, which covers the whole process of planning, coding, building, continuous testing, release, deployment, O&M, and monitoring. Deployment can be divided into environment deployment (that is, infrastructure) and application deployment. ROS focuses on environment deployment, which commonly includes ECS, VPC, SLB, and RDS. OOS focuses on O&M. O&M differs from deployment: deployment has a relatively clear concept and scope, while O&M is a vaguer concept with a larger scope. O&M also includes O&M outside DevOps, and even daily IT O&M. OOS mainly supports cloud products and their resources, including ECS instances and the guest OS inside them. It performs O&M on cloud resources through OpenAPI, and O&M inside the guest OS through Cloud Assistant, such as viewing disks, installing and uninstalling agents, and modifying configuration files.
Links:
- OOS help documents
- OOS customer support DingTalk group: 23330931
- Mastering O&M orchestration permissions: Assume Role + Pass Role
- O&M orchestration series: update ECS images
- O&M orchestration series: automatically tag ECS instances
- O&M orchestration series: copy files from instances to OSS
- O&M orchestration series: add instances to an SLS machine group
- O&M orchestration scenario series: check MFA status
- New feature of Alibaba Cloud O&M orchestration: clone multiple ECS instances with one click