Cost Optimization Practices on the Cloud

01 The need for cost control on the cloud

As shown in the figure above, Flexera's 2022 State of the Cloud Report found that surveyed enterprises estimate 32% of their cloud spending is wasted, up from 30% the previous year.

According to a cloud MSP service development survey report from the China Academy of Information and Communications Technology (CAICT), cost optimization has become the primary demand in enterprise cloud management. Cost control on the cloud is a persistent pain point for many enterprises.

Cost management on and off the cloud differs greatly. Off the cloud, an enterprise purchases IT assets in a single transaction and pays for them up front. On the cloud, IT assets are provisioned on demand and billed on a pay-as-you-go basis.

At present, enterprises face three major challenges in controlling costs on the cloud. First, cloud cost management requires collaboration across departments: finance and procurement, technology and operations, and product and business teams must work together and make decisions in near real time. Because cloud billing methods are diverse, these teams also need a deep understanding of how cloud resources are billed.

Second, enterprises need a timely cost reporting and monitoring system.

Finally, enterprises need to adapt to multi-cloud scenarios. Each cloud has its own billing methods, so cost control must be adapted to each provider.

Cost control on the cloud can be divided into four parts: first, choosing an appropriate payment method; second, selecting appropriate resource specifications; third, improving resource utilization; fourth, cost analysis and monitoring.

02 Payment method and resource specification selection

Take Alibaba Cloud ECS as an example. ECS offers three main billing types: pay-as-you-go, subscription (yearly or monthly prepayment) and preemptible instances.

These three billing types come with different capabilities and are essentially different trade-offs among economy, flexibility and certainty.

Pay-as-you-go is highly flexible: instances can be created, released, upgraded and downgraded at any time. Its disadvantage is that it is relatively expensive.

Subscription is a prepayment mechanism and is cheaper, but it is less flexible because resources are bound to a financial commitment. Preemptible instances are the cheapest, but their certainty is poor.

Preemptible instances have two characteristics.

First, they are cheaper than pay-as-you-go instances. The price can be as low as 10% of the pay-as-you-go price.

Second, their certainty is poor. After running stably for one hour, an instance may be reclaimed by the system at any time. Preemptible instances suit stateless, task-based workloads, where they can significantly reduce costs.
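The trade-off among the three billing types can be made concrete with a little arithmetic. The following is a minimal sketch, not a pricing tool: the function names and all prices are hypothetical, and the only figure taken from the text is that a preemptible price can fall to roughly 10% of the pay-as-you-go price.

```python
def monthly_costs(hours_used, payg_hourly, subscription_monthly, preemptible_ratio):
    """Estimate one month's cost of a single instance under each billing type.

    All prices are hypothetical inputs, not Alibaba Cloud list prices.
    preemptible_ratio is the preemptible price expressed as a fraction of
    the pay-as-you-go price.
    """
    payg = hours_used * payg_hourly
    return {
        "pay_as_you_go": payg,
        "subscription": subscription_monthly,  # paid in full regardless of usage
        "preemptible": payg * preemptible_ratio,
    }


def break_even_hours(payg_hourly, subscription_monthly):
    """Monthly hours of use above which a prepaid subscription is cheaper."""
    return subscription_monthly / payg_hourly


if __name__ == "__main__":
    # Hypothetical example: 200 hours of use in the month.
    print(monthly_costs(200, payg_hourly=0.5,
                        subscription_monthly=180.0, preemptible_ratio=0.2))
    print(break_even_hours(0.5, 180.0))
```

The comparison covers price only. It ignores the fact that a preemptible instance may be reclaimed, which is why it fits only workloads that tolerate interruption.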

Next, consider reserved instances. Subscription resources are bound to a financial commitment, and refunds and downgrades are subject to many restrictions. Combining pay-as-you-go ECS instances with reserved instances mainly addresses this inflexibility of subscription.

When a pay-as-you-go ECS instance matches the specification of a reserved instance, the instance itself is not billed; only the reserved instance is charged.
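The matching rule can be illustrated with a simplified model. The sketch below assumes that a reserved instance offsets exactly one running instance of the identical instance type per hour. Real matching rules may also consider region, zone or instance size, which this sketch ignores. The function name and the instance type strings are used only as illustrative labels.

```python
from collections import Counter


def uncovered_instance_hours(hourly_usage, reserved):
    """Return instance-hours that are still billed as pay-as-you-go.

    hourly_usage: list with one dict per hour, mapping instance type to the
                  number of pay-as-you-go instances running in that hour.
    reserved:     dict mapping instance type to the number of reserved
                  instances held (assumed to match by exact type only).
    """
    billed = Counter()
    for usage in hourly_usage:
        for instance_type, running in usage.items():
            covered = min(running, reserved.get(instance_type, 0))
            billed[instance_type] += running - covered
    return dict(billed)


if __name__ == "__main__":
    usage = [
        {"ecs.g6.large": 3, "ecs.c6.large": 1},  # hour 1
        {"ecs.g6.large": 1},                     # hour 2
    ]
    print(uncovered_instance_hours(usage, {"ecs.g6.large": 2}))
```

In hour 2 only one of the two reserved instances is matched, so half of the reserved capacity goes unused in that hour even though it is still paid for.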

Because a reserved instance involves prepayment or a committed term, it is cheaper than pay-as-you-go and can greatly reduce costs. Reserved instances come with three payment options: no upfront, partial upfront and all upfront.

A pay-as-you-go ECS instance is billed hourly and can be released at any time, so in a sense it also requires no prepayment. The no-upfront option of a reserved instance is different: once a term has been purchased, the user cannot cancel or request a refund. A user who commits to one year and pays hourly is charged every hour for the full year.

Partial upfront means the user pays part of the amount in advance and the system charges the remainder hourly. All upfront works like subscription: the full amount is paid at once.
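The three payment options differ only in when the money is paid. The sketch below shows the resulting cash flow under the simplifying assumption that the total price is the same for every option; in practice the total may differ by option. The function name, the term length and the price are all hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, assuming a one-year term


def payment_schedule(total_price, upfront_fraction, term_hours=HOURS_PER_YEAR):
    """Split a committed total into an upfront payment and an hourly charge.

    upfront_fraction: 0.0 for no upfront, 1.0 for all upfront, anything in
                      between for partial upfront.
    Returns (upfront_payment, hourly_payment). The hourly payment is owed for
    every hour of the term, whether or not any instance is running.
    """
    if not 0.0 <= upfront_fraction <= 1.0:
        raise ValueError("upfront_fraction must be between 0 and 1")
    upfront = total_price * upfront_fraction
    hourly = (total_price - upfront) / term_hours
    return upfront, hourly


if __name__ == "__main__":
    for fraction in (0.0, 0.5, 1.0):
        print(fraction, payment_schedule(8760.0, fraction))
```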

To address the limited flexibility of reserved instances, Alibaba Cloud launched savings plans. Compared with reserved instances, a savings plan can offset pay-as-you-go bills in scenarios such as DevOps, containerization, multiple instance families and multi-region deployment.

Savings plans come in two types: general-purpose and ECS. The general-purpose type has no restrictions and directly offsets pay-as-you-go ECS bills. The ECS type has a few restrictions, namely on region and instance family. Savings plans also support multiple products, such as ECS, ECI and RDS.
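How a savings plan offsets a pay-as-you-go bill can be sketched with a simplified hourly model. The assumptions here are not taken from the text: the plan is modeled as a fixed hourly spending commitment, eligible usage is priced at a discount while it fits within the commitment, and usage beyond the commitment is billed at the normal pay-as-you-go price. The function name and all numbers are hypothetical.

```python
def hourly_charge(payg_cost, commitment, discount_ratio):
    """Total charge for one hour under a simplified savings plan model.

    payg_cost:      what the hour's usage would cost at pay-as-you-go prices.
    commitment:     hourly amount committed to the savings plan (always paid).
    discount_ratio: discounted price as a fraction of the pay-as-you-go price.
    """
    discounted_cost = payg_cost * discount_ratio
    if discounted_cost <= commitment:
        # All usage fits inside the commitment; any unused part is wasted.
        return commitment
    # The commitment absorbs this much usage, measured at pay-as-you-go prices.
    covered_at_list_price = commitment / discount_ratio
    # The rest is billed at the normal pay-as-you-go price.
    return commitment + (payg_cost - covered_at_list_price)


if __name__ == "__main__":
    print(hourly_charge(payg_cost=10.0, commitment=4.0, discount_ratio=0.5))
    print(hourly_charge(payg_cost=6.0, commitment=4.0, discount_ratio=0.5))
```

Because the commitment is expressed as spending rather than as a specific instance, the same plan can follow usage as it moves between instance families or products, which is the flexibility described above.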

An e-commerce user had always used subscription instances. Its resource demand was unstable and its computing needs changed quickly, so upgrades and refunds carried hidden costs. After switching, the overall price remained stable and the total cost fell by 9%.

An online education user had different resource requirements at different times of day: one group of resources in the daytime and another in the evening. Nearly half of the subscription resources' time was wasted. By combining a savings plan with pay-as-you-go instances, the user benefited from discounts shared across instance families, and the total cost fell by 42%.

A game user needed high flexibility and had to build its resource pool on pay-as-you-go instances, which led to very high pay-as-you-go fees. The user purchased a savings plan without modifying anything else, and the total cost fell by 56%. On the resource guarantee side, capacity reservation was added at no extra cost, and the success rate of pay-as-you-go instance creation reached 100%.

To sum up, users are advised to combine multiple billing types, because each one suits different scenarios.

Preemptible instances suit stateless, task-based workloads. Pay-as-you-go instances suit stateful, dynamic workloads. Subscription instances, or pay-as-you-go instances combined with deduction products (reserved instances and savings plans), suit stable workloads.
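The recommendation above can be written down as a small decision rule. This is only an illustrative sketch: the function name and its two boolean inputs are hypothetical, and the order of the checks (interruptible, stateless work first) is an assumption, since the text does not say which rule wins when a workload fits more than one category.

```python
def recommend_billing_type(stateless_task: bool, stable_load: bool) -> str:
    """Map coarse workload traits to the billing type suggested in the text."""
    if stateless_task:
        # Stateless, task-based work can tolerate an instance being reclaimed.
        return "preemptible"
    if stable_load:
        # A stable load can commit in advance to get a lower price.
        return "subscription, or pay-as-you-go plus deduction products"
    # Stateful and dynamic: keep full flexibility.
    return "pay-as-you-go"


if __name__ == "__main__":
    print(recommend_billing_type(stateless_task=True, stable_load=False))
    print(recommend_billing_type(stateless_task=False, stable_load=True))
    print(recommend_billing_type(stateless_task=False, stable_load=False))
```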

The figure above compares the payment methods for computing resources. Here, flexibility refers mainly to whether resources can be created, released and reallocated at will, and to how tightly resources are coupled to financial commitments.

The figure above compares the payment methods for storage resources. Pay-as-you-go storage can be detached and released without restriction, and suits businesses with irregular elasticity.

The figure above compares the payment methods for network resources. Billing by fixed bandwidth suits relatively stable services, while billing by traffic suits spiky workloads, where traffic is occasionally very large but small most of the time.

Shared traffic packages suit scenarios where traffic can be predicted to some degree. Otherwise, if the purchased package is too large or is not used up in time, the excess is wasted.
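The choice between fixed-bandwidth billing and billing by traffic comes down to comparing two products: peak bandwidth times its unit price against transferred volume times its unit price. The sketch below shows that comparison. The function name and every price are hypothetical, and real tariffs may be tiered rather than linear.

```python
def cheaper_network_billing(peak_mbps, monthly_traffic_gb,
                            price_per_mbps_month, price_per_gb):
    """Compare fixed-bandwidth billing with billing by traffic for one month.

    Fixed bandwidth must be sized for the peak, so its cost depends on
    peak_mbps. Billing by traffic depends only on the volume transferred.
    Returns (name_of_cheaper_option, its_cost). Prices are hypothetical.
    """
    fixed_cost = peak_mbps * price_per_mbps_month
    traffic_cost = monthly_traffic_gb * price_per_gb
    if fixed_cost <= traffic_cost:
        return "fixed_bandwidth", fixed_cost
    return "pay_by_traffic", traffic_cost


if __name__ == "__main__":
    # Spiky service: high peak, little total traffic.
    print(cheaper_network_billing(100, 500, 20.0, 0.8))
    # Stable service: modest peak, sustained traffic.
    print(cheaper_network_billing(10, 3000, 20.0, 0.8))
```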

The figure above classifies the scenarios for resource specification selection. Users can choose according to their own scenario; only specifications that fit the business scenario deliver the best price-performance ratio.

For example, the shared burstable instance families t5 and t6 are low-cost and well suited to light workloads such as lightweight web applications and development or test environments. Their price can be 30% to 60% of that of the corresponding dedicated specifications.

As another example, an e-commerce website selected a compute-optimized instance (4 vCPUs) based on its business characteristics, reducing its cost by more than 20% compared with the general-purpose type.

When selecting resource specifications, we recommend choosing the latest generation. A newer generation means the cloud vendor has upgraded its software or hardware, so users can share in the technology dividend of cloud computing. For instance, Alibaba Cloud officially announced on July 6, 2022 that prices for C6/C7, G6/G7 and R6/R7 in some regions had been reduced by 9% to 19%.

03 Improve resource utilization

Improving resource utilization applies first to deduction products. A deduction product does not always match the resources actually in use, so its utilization rate and coverage rate are often insufficient, and the purchased capacity is rarely used at exactly 100%. Users need to watch both the utilization rate and the coverage rate of their deduction products.

Even if the purchased capacity is fully used, some bills may still not be offset. In that case, users need to purchase additional reserved instances or savings plans.
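The two metrics answer different questions, which a short calculation makes clear. The definitions below are the ones commonly used in cloud cost management and are assumed here, since the text does not define them: utilization is the share of the purchased deduction capacity that was actually consumed, and coverage is the share of eligible usage that was offset. Function names are hypothetical.

```python
def utilization_rate(used_capacity, purchased_capacity):
    """Share of the purchased deduction capacity that was actually consumed."""
    if purchased_capacity <= 0:
        raise ValueError("purchased_capacity must be positive")
    return used_capacity / purchased_capacity


def coverage_rate(covered_usage, total_eligible_usage):
    """Share of eligible pay-as-you-go usage that was offset by deductions."""
    if total_eligible_usage <= 0:
        raise ValueError("total_eligible_usage must be positive")
    return covered_usage / total_eligible_usage


if __name__ == "__main__":
    # 100 instance-hours purchased, 80 of them matched, 200 hours run in total.
    print(utilization_rate(80, 100))  # low value: bought too much or mismatched
    print(coverage_rate(80, 200))     # low value: room to buy more
```

Low utilization suggests that too much was purchased or that the purchase does not match actual usage. Low coverage with high utilization suggests that buying more deduction products could lower the bill.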

In addition, elastic scaling can effectively improve resource utilization. It is divided into horizontal and vertical scaling. Horizontal scaling changes the number of instances, for example from 100 ECS instances to 200 or 50. Vertical scaling changes the configuration of a single ECS instance by increasing or decreasing its CPU and memory.

Both kinds of scaling support scheduled, dynamic, predictive, health-check and manual modes, as well as combinations of these modes.
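As an illustration of the dynamic mode applied to the number of instances, the sketch below uses a simple target-tracking rule: scale the group so that average CPU utilization returns to a target value. This proportional rule is a common textbook approach and is an assumption here, not a description of how any specific Alibaba Cloud service computes its scaling decisions. The function name and parameters are hypothetical.

```python
import math


def desired_instances(current, cpu_percent, target_percent, min_size, max_size):
    """Target-tracking rule for scaling the number of instances.

    current:        number of instances running now.
    cpu_percent:    observed average CPU utilization, in percent.
    target_percent: utilization the group should settle at, in percent.
    The result is rounded up and clamped to [min_size, max_size].
    """
    if target_percent <= 0:
        raise ValueError("target_percent must be positive")
    raw = math.ceil(current * cpu_percent / target_percent)
    return max(min_size, min(max_size, raw))


if __name__ == "__main__":
    print(desired_instances(100, 90, 45, min_size=50, max_size=300))  # scale out
    print(desired_instances(100, 20, 40, min_size=50, max_size=300))  # scale in
```

The minimum and maximum sizes keep a noisy metric from shrinking the group to nothing or growing it without bound.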

The figure above shows other ways to reduce costs and improve efficiency. With the help of products such as Auto Provisioning and Auto Scaling, preemptible instances can deliver stable compute capacity for clusters.

If a pay-as-you-go instance is used for a long time, it can be converted to subscription, including weekly subscription. Users can also purchase deduction products such as reserved instances and savings plans to reduce costs. Another option is to enable the economical (no charge when stopped) mode and use the automated O&M tool OOS to stop instances on a periodic schedule.
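The effect of stopping instances on a schedule can be estimated with simple arithmetic. The sketch below assumes that compute charges stop entirely while an instance is stopped and that other charges, such as disks, continue and are therefore outside the calculation. The function name and the schedule are hypothetical.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def compute_saving_ratio(running_hours_per_day, running_days_per_week):
    """Fraction of the weekly compute charge saved by a stop schedule.

    Assumes compute is billed only while the instance is running. Storage
    and other charges that continue while stopped are not included.
    """
    if not 0 <= running_hours_per_day <= 24:
        raise ValueError("running_hours_per_day must be between 0 and 24")
    if not 0 <= running_days_per_week <= 7:
        raise ValueError("running_days_per_week must be between 0 and 7")
    running_hours = running_hours_per_day * running_days_per_week
    return 1 - running_hours / HOURS_PER_WEEK


if __name__ == "__main__":
    # A development environment that runs 12 hours a day on 5 workdays.
    print(round(compute_saving_ratio(12, 5), 3))
```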

Subscription instances can be renewed automatically, and a unified expiration date makes renewal more efficient. If an instance is no longer needed, it can be unsubscribed or converted to pay-as-you-go.

With the automated O&M tool OOS, bandwidth can be raised and lowered on a periodic schedule. Users can also purchase shared bandwidth packages and shared traffic packages so that bandwidth is reused across products and managed uniformly.

In addition, users can authorize the Advisor product to scan their resources regularly and provide cost optimization suggestions, and can use the cost analysis feature of the User Center for cost analysis and optimization.

04 Cost management

From the perspective of finance staff, cost management has four requirements.

First, know clearly how much each department within the enterprise spends each month.

Second, set budgets to manage each department's spending.

Third, have tools to analyze costs along multiple dimensions, so the enterprise can understand its costs and judge whether they are reasonable and whether there is room for optimization.

Fourth, detect cost anomalies.

When resources are created, the system automatically assigns them to departments or teams using tags, according to the allocation policy.

This makes it clear how much each team and department has spent.
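Once every resource carries such a marker, splitting the bill by team is a grouping operation. The sketch below is a minimal illustration: the bill item layout (a dict with "cost" and "tags" keys), the tag key "team" and the function name are all hypothetical and do not reflect any real billing export format.

```python
from collections import defaultdict


def cost_by_tag(bill_items, tag_key, untagged_label="untagged"):
    """Sum bill item costs per value of one tag key.

    bill_items: iterable of dicts such as
                {"resource": "i-001", "cost": 12.5, "tags": {"team": "search"}}
    Items without the tag are grouped under untagged_label so that
    unallocated spending stays visible instead of disappearing.
    """
    totals = defaultdict(float)
    for item in bill_items:
        owner = item.get("tags", {}).get(tag_key, untagged_label)
        totals[owner] += item["cost"]
    return dict(totals)


if __name__ == "__main__":
    items = [
        {"resource": "i-001", "cost": 12.5, "tags": {"team": "search"}},
        {"resource": "i-002", "cost": 7.5, "tags": {"team": "search"}},
        {"resource": "i-003", "cost": 30.0, "tags": {"team": "ads"}},
        {"resource": "d-004", "cost": 4.0},
    ]
    print(cost_by_tag(items, "team"))
```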

Budget management lets users set budgets under specified conditions. Users can configure an alert so that an email or SMS is sent when the budget is exceeded.
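A budget alert is, at its core, a threshold check on spending to date. The sketch below returns the thresholds that have been crossed and leaves the notification itself (email or SMS) to the caller. The function name and the default thresholds of 80% and 100% are hypothetical choices for illustration.

```python
def crossed_thresholds(spent, budget, thresholds=(0.8, 1.0)):
    """Return the budget thresholds that current spending has reached.

    thresholds are fractions of the budget. A caller would send one
    notification for each threshold returned here.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    return [t for t in sorted(thresholds) if spent >= budget * t]


if __name__ == "__main__":
    print(crossed_thresholds(spent=850, budget=1000))   # early warning only
    print(crossed_thresholds(spent=1200, budget=1000))  # budget exceeded
```

Adding a threshold below 100% gives teams time to react before the budget is actually exceeded.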

Cost analysis and optimization examines whether costs are reasonable along multiple dimensions. Users can filter and analyze by tag, product, region, availability zone, instance specification and other conditions.

Cost anomaly detection uses an artificial intelligence algorithm to identify abnormal fluctuations in spending. It supports user feedback on its results, and this feedback is used to train the algorithm. The more feedback there is and the more accurate it is, the better the detection accuracy becomes.
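The idea of flagging abnormal spending fluctuations can be illustrated with a much simpler stand-in than the algorithm the product uses, whose details the text does not give. The sketch below applies a plain z-score test to daily spending: a day is flagged when it lies more than a chosen number of standard deviations from the recent mean. The function name and the threshold of 3 are hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history, today, z_threshold=3.0):
    """Flag today's spending if it deviates strongly from recent history.

    history: list of recent daily costs (at least two values are needed).
    This is a simple z-score test used for illustration only. It has no
    learning step and does not use feedback.
    """
    if len(history) < 2:
        return False  # not enough data to estimate normal variation
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > z_threshold


if __name__ == "__main__":
    recent = [100, 102, 98, 101, 99]
    print(is_anomalous(recent, 150))
    print(is_anomalous(recent, 101))
```

A fixed threshold like this one misfires on workloads with weekly or seasonal patterns, which is one reason a production detector benefits from a trained model and from user feedback.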

Payment methods such as subscription, pay-as-you-go, preemptible instances and reserved instances are trade-offs among economy, certainty and flexibility. Enterprises need to understand the scenarios each one suits; only the right choice delivers cost savings.

Enterprises need to select instance specifications that fit their business scenarios, for example compute-optimized, general-purpose or burstable types.

Horizontal and vertical elastic scaling, the economical (no charge when stopped) mode and the automated O&M tool OOS can all effectively improve resource utilization. Cost analysis and optimization examines, along multiple dimensions, whether the enterprise's costs are reasonable.

