How to quickly build low-cost and highly elastic cloud applications

01 Cloud application resource selection

Three main factors need to be considered when building cloud applications: stability, cost, and elasticity. Stability covers infrastructure stability, application stability, and application observability.

Users need to select a cloud platform with stable infrastructure, stable instance operation, and fast recovery. For application observability, users can rely on the cloud platform's monitoring data and monitoring applications to keep the application stable.

Cost mainly depends on instance specification, payment type, and resource management. The payment type has a large effect on application cost: the same instance can cost very different amounts under different payment types.

Stability and cost pull against each other. To improve service stability, users add more resources and machines, which raises cost. If they do not, stability declines when the application faces a sudden increase in load. Elasticity can effectively resolve this conflict between stability and cost.

02 On-cloud application building considerations



1. Instance selection

Alibaba Cloud mainly provides three types of instances: general-purpose computing, heterogeneous computing, and bare metal and high-performance computing. Users need to select instances according to the characteristics of the application.

If the application is an in-memory database, users can select memory-optimized instances to avoid the waste caused by mismatched resources. Bare metal and heterogeneous computing instances suit applications with high resource requirements, such as machine learning.

2. Payment method

Alibaba Cloud has two payment methods: postpaid and prepaid.

◾ Postpaid means that users use the instance first and are billed afterwards. It is divided into two categories: pay-as-you-go instances and preemptible (Spot) instances. Preemptible instances can effectively reduce the cost of applications.

◾ Prepaid means that users are billed before actual use.

As shown in the figure above, prepaid subscription instances cannot be actively released. Pay-as-you-go and preemptible instances can be actively released by the user. Reserved instances provide a discount by offsetting the bills of pay-as-you-go instances.

3. Spot instance

The price of a Spot instance changes dynamically, and users bid according to these price changes.

As long as the user's bid stays above the current Spot price, the instance can keep running. When the Spot price rises above the user's bid, the system releases the instance. Although Spot instances are cheaper than pay-as-you-go instances, they can be reclaimed automatically by the system, so their stability is relatively poor.
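The bidding behaviour described above can be sketched as a small simulation. This is only an illustration: the function name, the hourly price series, and the bid values are hypothetical, and the real Spot market and release process are more involved.

```python
from typing import List, Optional


def hours_until_release(bid: float, spot_prices: List[float]) -> Optional[int]:
    """Return the index of the first hour in which the Spot price exceeds
    the bid (the instance is released then), or None if it never does."""
    for hour, price in enumerate(spot_prices):
        if price > bid:
            return hour
    return None


# Hypothetical hourly Spot prices and user bids, in the same currency unit.
prices = [0.10, 0.12, 0.11, 0.15, 0.09]
print(hours_until_release(0.13, prices))  # released in hour 3
print(hours_until_release(0.20, prices))  # None: never released in this window
```

A higher bid lowers the chance of release but also raises the maximum price the user may pay, which is the trade-off the text describes.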

4. Low-cost resource management scheme

The application load can be divided into three stages.

◾ The first stage is the base load. Monthly subscription and reserved instances support this stable business load.

◾ The second stage is the daily peak load. Pay-as-you-go instances support this stateful, dynamic business load, which keeps the service stable while saving cost. With dynamic scaling and pay-as-you-go instances, capacity is expanded during peak periods and reduced during off-peak periods.

◾ The third stage is a sudden increase in load. Preemptible instances support stateless, fault-tolerant business loads, so users can keep the service stable while spending less.
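The three stages can be read as a simple capacity split. The sketch below assumes the load is already expressed as a number of instances per stage; the function name and the dictionary keys are hypothetical and not part of any Alibaba Cloud API.

```python
from typing import Dict


def plan_capacity(base: int, daily_peak: int, burst: int) -> Dict[str, int]:
    """Split the instance count needed at burst level across billing types.

    base       -- instances needed for the steady base load
    daily_peak -- instances needed at the regular daily peak
    burst      -- instances needed during a sudden spike
    """
    if not (0 <= base <= daily_peak <= burst):
        raise ValueError("expected 0 <= base <= daily_peak <= burst")
    return {
        "subscription_or_reserved": base,      # stage 1: base load
        "pay_as_you_go": daily_peak - base,    # stage 2: daily peak
        "preemptible": burst - daily_peak,     # stage 3: sudden increase
    }


print(plan_capacity(base=10, daily_peak=16, burst=25))
# {'subscription_or_reserved': 10, 'pay_as_you_go': 6, 'preemptible': 9}
```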

03 Elastic, stable and low-cost cloud solutions

1. Introduction to elastic scaling function

As shown in the figure above, the left figure shows the operation and maintenance of resources in the traditional mode, and the right figure shows the basic function of elastic scaling. In the left figure, the black line represents the business load, which changes dynamically, and the green curve represents the deployed resources. Resources are deployed through manual intervention, and users hold a fixed amount of resources for a long time.

When traffic reaches the first point, the business load rises to its peak. The deployed resources still cover the peak traffic, so the service is not noticeably affected. During the off-peak period, however, capacity is not reduced, and resources remain deployed for the maximum traffic. When the business load then increases suddenly, the existing deployment cannot meet the demand, and business stability is affected.

The traditional manual mode has three disadvantages: wasted resources, reduced service stability, and high manual operation and maintenance costs.

In the elastic scaling mode, the black curve again represents the business load and the green curve represents the resource volume. When the business load increases suddenly, elastic scale-out adds resources and keeps the service stable. When the business load falls, elastic scale-in releases resources while the service stays stable, which saves cost.

Compared with the traditional method, elastic scaling has a lower resource cost, with no obvious waste of resources. Services are also more stable, even when traffic increases sharply. In addition, management is automated and requires no human intervention, which reduces labor costs.
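The difference between the two modes can be made concrete with a toy comparison. The demand series below is invented for illustration, and the elastic case is idealized: it matches demand exactly every hour and ignores the delay with which real scaling reacts.

```python
from typing import List, Tuple


def fixed_provisioning(demand: List[int], fixed: int) -> Tuple[int, int]:
    """Return (instance_hours, hours_with_shortfall) for a fixed fleet."""
    shortfall = sum(1 for d in demand if d > fixed)
    return fixed * len(demand), shortfall


def elastic_provisioning(demand: List[int]) -> Tuple[int, int]:
    """Idealized elastic fleet that follows demand hour by hour."""
    return sum(demand), 0


# Hypothetical demand in instances per hour, with a spike in hour 4.
demand = [4, 4, 8, 8, 14, 6, 4, 4]
print(fixed_provisioning(demand, fixed=8))  # (64, 1)
print(elastic_provisioning(demand))         # (52, 0)
```

In this example the fixed fleet pays for 64 instance-hours and still falls short during the spike, while the elastic fleet uses 52 instance-hours with no shortfall.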

2. Prerequisites for elastic scaling

The prerequisites for elastic scaling are shown in the figure above. Not every application can adopt elastic scaling immediately. Elastic scaling requires three capabilities: monitoring capability, deployment and update capability, and self-service capability.

Monitoring covers three capabilities: metric collection, metric aggregation, and alarming. Metric collection measures the business load through indicators such as CPU usage or QPS. Alarming means that when a threshold is crossed, for example CPU usage above 50%, an event is triggered to scale the capacity out or in.

Metrics need to be aggregated per application. If the private cloud has 100 machines and the application uses only 20 of them, only those 20 machines need to be aggregated. For example, the CPU usage of the 20 machines can be averaged to produce one application-level metric.
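A minimal sketch of per-application aggregation and the 50% alarm follows. The host names, CPU values, and function names are hypothetical; a real setup would read these metrics from the cloud platform's monitoring service.

```python
from statistics import mean
from typing import Dict, Set


def app_average_cpu(cpu_by_host: Dict[str, float], app_hosts: Set[str]) -> float:
    """Average CPU usage over the hosts that belong to the application only."""
    samples = [cpu for host, cpu in cpu_by_host.items() if host in app_hosts]
    if not samples:
        raise ValueError("no metrics for the application's hosts")
    return mean(samples)


def alarm_fired(avg_cpu: float, threshold: float = 50.0) -> bool:
    """True when the aggregated metric is above the alarm threshold."""
    return avg_cpu > threshold


# 100 hypothetical hosts; only host-0 .. host-19 run the application.
cpu_by_host = {f"host-{i}": (70.0 if i < 20 else 5.0) for i in range(100)}
app_hosts = {f"host-{i}" for i in range(20)}

avg = app_average_cpu(cpu_by_host, app_hosts)
print(avg, alarm_fired(avg))  # 70.0 True
```

Averaging over all 100 hosts in this example would give 18%, and the alarm would never fire, which is why aggregation has to be scoped to the application.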

For an application to scale elastically, its deployment and update capability involves three core points.

First, understand the medium used to deploy the application software, for example publishing through an image.

Second, understand the automated deployment method. During elastic scale-out, instances must be added and the application must be deployed on them.

Third, automate application upgrades. Suppose an application runs on ten machines and elastic scaling adds two more. The next deployment must then cover the two new machines in addition to the original ten. Users therefore need to consider how the application is deployed and upgraded.

Self-service capability means that, after elastic scale-out, users need to be able to judge whether the new application instance can provide service normally.

For example, when an instance is added by elastic scale-out, a web service starts on it, but the web service is not yet attached to the corresponding load balancer. Users therefore need to evaluate whether their services can serve traffic on their own: whether providing the service externally depends on registration, and whether application instances can register and deregister automatically.
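Automatic registration and deregistration can be sketched as two hooks around a load balancer. The `LoadBalancer` class here is a hypothetical in-memory stand-in, not a real load balancing API, and the health check is passed in as a plain function.

```python
from typing import Callable, Set


class LoadBalancer:
    """In-memory stand-in for a load balancer's backend list."""

    def __init__(self) -> None:
        self.backends: Set[str] = set()

    def register(self, instance_id: str) -> None:
        self.backends.add(instance_id)

    def deregister(self, instance_id: str) -> None:
        self.backends.discard(instance_id)


def on_scale_out(lb: LoadBalancer, instance_id: str,
                 health_check: Callable[[str], bool]) -> bool:
    """Register a new instance only once its web service passes the
    health check; return whether it is now receiving traffic."""
    if health_check(instance_id):
        lb.register(instance_id)
        return True
    return False


def on_scale_in(lb: LoadBalancer, instance_id: str) -> None:
    """Deregister the instance before it is released."""
    lb.deregister(instance_id)


lb = LoadBalancer()
print(on_scale_out(lb, "i-new", health_check=lambda _id: True))  # True
on_scale_in(lb, "i-new")
print(lb.backends)  # set()
```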

3. Core concepts of elastic scaling

In Alibaba Cloud elastic scaling, a scaling group defines a set of machines. Metric collection and instance deployment are performed on the instances in the scaling group.

A scaling configuration has two main parts. The first is the instance specification: the instance configuration and the additional parameters needed when instances are added. If you need to manage the instances, you can tag them.

The second is the instance image. If a container is used to provide the service, you can specify the application image, so that instances added by elastic scale-out can meet user needs and provide the service.

For scaling rules and notifications, when elastic scaling is triggered, Alibaba Cloud can notify users whether the scaling activity succeeded or failed, and users can also choose not to receive real-time notifications. Elastic scaling is integrated with CloudMonitor system events and with MNS topics and queues.

There are three types of scaling tasks: scheduled tasks, alarm tasks, and automatically or manually triggered tasks.

Scheduled tasks suit loads whose peak and off-peak periods follow an obvious time pattern. Users can scale out on a schedule before the peak period and scale in on a schedule after it.

Alarm tasks scale dynamically based on monitoring metrics such as CPU usage or QPS. Scaling can also be triggered automatically or manually to scale dynamically.

4. Scaling modes

As shown in the figure above, health mode releases or removes unhealthy ECS instances. Scaling groups provide this capability by default in all modes.

Fixed-number mode guarantees a fixed number of ECS instances by specifying MinSize. It suits scenarios with small business fluctuations but high availability requirements, and is generally used together with the monitoring mode.

In manual mode, ECS instances are scaled manually through the API, and scaling rules are executed manually according to monitoring data that users observe themselves or obtain from their own monitoring system. After MinSize or MaxSize is adjusted manually, ECS instances are automatically created or released so that the number of instances stays between MinSize and MaxSize.
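The effect of adjusting MinSize or MaxSize can be sketched as a clamp on the current group size. The function name is hypothetical; only the MinSize and MaxSize bounds come from the text.

```python
def reconcile(current: int, min_size: int, max_size: int) -> int:
    """Return how many instances to create (positive) or release (negative)
    so that the group size falls within [min_size, max_size]."""
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    target = max(min_size, min(current, max_size))
    return target - current


print(reconcile(current=3, min_size=5, max_size=10))   # 2: create two instances
print(reconcile(current=12, min_size=5, max_size=10))  # -2: release two instances
print(reconcile(current=7, min_size=5, max_size=10))   # 0: already within bounds
```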

Scheduled mode adds or removes ECS instances at configured times, such as 13:00:00 on Friday. It suits scenarios with regular business fluctuations.

Dynamic mode automatically creates or releases ECS instances according to the configuration, based on the load shown by monitoring metrics. It suits scenarios where business fluctuations have no obvious pattern. For example, when the CPU usage of a single machine exceeds 50%, additional instances can be added to keep the service stable and reduce the load on each machine.

5. Scaling rules

How does a scaling group scale out or in once an event triggers scaling?

As shown in the figure above, a simple scaling rule takes one fixed action. For example, when CPU usage exceeds 20%, four machines are added. If this still does not meet the business load, the machine load keeps rising and scale-out is triggered again.

A step scaling rule defines several trigger thresholds. Different thresholds trigger different scaling actions.

A target tracking scaling rule keeps a metric at a target value, for example CPU usage at 50%. If the business load increases suddenly and the system calculates that more than 20 machines are needed to return to the target, the rule adds all of them at one time, which allows a fast response to sudden traffic.
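The three rule types can be compared side by side in a short sketch. The thresholds and step sizes are illustrative, and the target tracking formula (scale the group in proportion to metric / target, assuming load spreads evenly) is a common approximation assumed here, not a description of how Alibaba Cloud computes it.

```python
import math
from typing import List, Tuple


def simple_rule(cpu: float, threshold: float = 20.0, add: int = 4) -> int:
    """Simple rule: add a fixed number of instances above one threshold."""
    return add if cpu > threshold else 0


def step_rule(cpu: float, steps: List[Tuple[float, int]]) -> int:
    """Step rule: the highest threshold the metric exceeds decides how many
    instances to add. `steps` is a list of (threshold, instances_to_add)."""
    added = 0
    for threshold, add in sorted(steps):
        if cpu > threshold:
            added = add
    return added


def target_tracking(current: int, cpu: float, target: float = 50.0) -> int:
    """Target tracking: return the group size expected to bring the average
    CPU back to the target (assumed proportional model)."""
    return max(1, math.ceil(current * cpu / target))


steps = [(50.0, 2), (70.0, 4), (80.0, 8)]  # hypothetical step table
print(simple_rule(30.0))            # 4
print(step_rule(85.0, steps))       # 8
print(target_tracking(20, 100.0))   # 40, i.e. 20 instances added at once
```

With the simple rule, a large spike needs several rounds of four instances each, while target tracking sizes the whole adjustment in one step.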

6. Best practices for scaling modes

Next, we introduce how to select scaling modes for different application scenarios.

When the peaks and troughs of the application are relatively fixed, users can use scheduled mode to add or remove ECS instances at configured times, such as 13:00:00 on Friday.

For applications that require high stability and face sudden traffic, users can combine dynamic mode with manually added monthly subscription instances, which guarantee the business baseline. When the business load increases suddenly, dynamic scaling adds resources to keep the service stable, so that the service has sufficient resources during traffic bursts.

If users are highly cost-sensitive, they can configure their own policies with pay-as-you-go instances and dynamic mode, which scales automatically according to the configuration based on CloudMonitor metrics such as CPU utilization.

For scenarios with relatively regular peaks and troughs plus occasional bursts, users can first scale on a schedule and then further adjust the number of ECS instances dynamically based on monitoring metrics.
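Combining scheduled and dynamic scaling can be sketched as a scheduled baseline plus a metric-driven adjustment. All numbers below (peak hours, capacities, threshold, step) are hypothetical defaults chosen for illustration.

```python
def scheduled_capacity(hour: int, peak_hours: range, peak: int, off_peak: int) -> int:
    """Capacity set by the schedule alone for a given hour of the day."""
    return peak if hour in peak_hours else off_peak


def combined_capacity(hour: int, avg_cpu: float,
                      peak_hours: range = range(9, 18),
                      peak: int = 10, off_peak: int = 4,
                      cpu_threshold: float = 50.0, step: int = 2) -> int:
    """Start from the scheduled capacity, then add `step` instances when the
    monitored CPU usage is above the threshold."""
    capacity = scheduled_capacity(hour, peak_hours, peak, off_peak)
    if avg_cpu > cpu_threshold:
        capacity += step
    return capacity


print(combined_capacity(hour=10, avg_cpu=30.0))  # 10: schedule only
print(combined_capacity(hour=10, avg_cpu=80.0))  # 12: schedule plus dynamic step
print(combined_capacity(hour=22, avg_cpu=80.0))  # 6: off-peak burst
```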

7. Cost optimization best practices

Once instance resources can be scaled dynamically, which instances should be added? As shown in the figure above, the ideal approach is to use monthly subscription instances as the base to guarantee service stability. When the service peak arrives, capacity is added on demand; pay-as-you-go instances are more stable than Spot instances. This combination keeps cost relatively low and stability high.

8. Customer case

As a global technology platform, Mobvista is committed to promoting global business growth in the digital era. It focuses on building a "SaaS tool ecosystem" that enables enterprise growth and helps enterprises grow globally.

Because its advertising business is large and requires many resources, Mobvista has high stability requirements and needs to meet its resource demand at low cost.

Alibaba Cloud meets these needs by combining elastic scaling, pay-as-you-go instances, and Spot instances. With the automatic compensation scheme, service stability is guaranteed and resource cost is reduced by 30% to 40%. Automatic compensation means that when a Spot instance is reclaimed, elastic scaling detects the reclamation and automatically launches a pay-as-you-go instance to replace it, further ensuring service stability.
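The automatic compensation idea can be sketched with a toy scaling group. The class, its fields, and the instance IDs are hypothetical and only model the bookkeeping: when a Spot instance is reclaimed, a pay-as-you-go instance takes its place so total capacity is unchanged.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScalingGroup:
    spot: List[str] = field(default_factory=list)
    pay_as_you_go: List[str] = field(default_factory=list)
    _counter: int = 0  # used only to generate replacement IDs

    def size(self) -> int:
        return len(self.spot) + len(self.pay_as_you_go)

    def on_spot_reclaimed(self, instance_id: str) -> str:
        """Remove the reclaimed Spot instance and add a pay-as-you-go
        replacement so the total capacity stays the same."""
        self.spot.remove(instance_id)
        self._counter += 1
        replacement = f"payg-{self._counter}"
        self.pay_as_you_go.append(replacement)
        return replacement


group = ScalingGroup(spot=["spot-1", "spot-2"], pay_as_you_go=["payg-0"])
print(group.on_spot_reclaimed("spot-1"))  # payg-1
print(group.size())                       # 3, unchanged
```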
