By Elastic Compute Service Team
Auto Scaling, a popular cloud service orchestration product on Alibaba Cloud, automatically adjusts elastic computing resources based on your defined policies and business needs. It keeps infrastructure costs reasonable while accommodating changes in business load. Based on your scaling policies and scaling mode, Auto Scaling automatically adds Elastic Compute Service (ECS) instances to maintain computing capacity when your business demand grows, and removes ECS instances to save costs when your business demand declines. In addition, Auto Scaling automatically replaces unhealthy ECS instances so that business loads are handled normally at any time. In this way, Auto Scaling delivers elastic processing capabilities for business loads in complex scenarios without manual intervention.
We have received a lot of valuable feedback from users. Recently, we have fully upgraded Auto Scaling Service to allow you to deal with business changes more flexibly and effectively and ensure fast and stable business development with higher cost-effectiveness. The following part further describes these updates.
In practice, you usually need to add or modify Server Load Balancer (SLB) instances or ApsaraDB for RDS instances that are already bound to a scaling group. However, once a scaling group is created, its configurations of SLB instances or ApsaraDB for RDS instances could not be modified. Therefore, you must create a new scaling group in that case. After the function upgrade, Auto Scaling now supports attaching and detaching SLB instances and ApsaraDB for RDS instances, making it easy to deal with architectural changes or upgrades without creating a new scaling group.
Auto Scaling Service integrates with Server Load Balancer (SLB), which allows you to attach an SLB instance to a scaling group and distribute traffic to each instance in the scaling group using the SLB instance. For a long time, SLB instances could be specified only when a scaling group was created and could not be modified afterward. This meant that you had to carefully consider your business demand and the required number of SLB instances when you created a scaling group. This limitation is now removed, as we have launched two new functions of Auto Scaling Service: AttachLoadBalancers and DetachLoadBalancers.
Attach an SLB instance to a scaling group
You can attach an SLB instance to a scaling group, during which either of the following actions is performed based on the specified value of forceAttach:
To add all instances in a scaling group to the backend of an existing SLB instance, you can attach this SLB instance to the scaling group again and set the value of forceAttach to true.
Note that due to the limitations of SLB instance types, the SLB instance to be attached to a scaling group must meet the following conditions:
Detach an SLB instance from a scaling group
When you detach an SLB instance from a scaling group, either of the following actions is performed based on the specified value of forceDetach:
Before detaching an SLB instance from the scaling group, confirm that the SLB instance no longer distributes requests to the instances in the scaling group, so that no service requests are lost. Furthermore, unlike the attach operation, the detach operation cannot be repeated for the same SLB instance.
ApsaraDB for RDS, a stable and reliable online database service of Alibaba Cloud, supports MySQL, SQL Server, PostgreSQL, and PPAS. It provides a complete set of solutions for disaster recovery, backup, restoration, monitoring, migration, and other features, to free you from database O&M. Auto Scaling Service integrates with ApsaraDB for RDS, which enables the instances in a scaling group to be automatically added to a whitelist, ensuring secure access to the ApsaraDB for RDS instances.
Attach an ApsaraDB for RDS instance to a scaling group
When you attach an ApsaraDB for RDS instance to a scaling group, either of the following actions is performed based on the value of forceAttach:
If you attach an existing ApsaraDB for RDS instance to the scaling group again, the number of ApsaraDB for RDS instances in the group remains unchanged but the private IP addresses of all instances in the group are added to the IP address whitelist.
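The re-attach behavior described above can be modeled in a few lines. This is only a sketch of the logic, not the Auto Scaling API: the `attach_rds` helper, the data structures, and the instance ID are hypothetical.

```python
def attach_rds(attached_ids, whitelist, rds_id, instance_ips):
    """Model attaching an RDS instance to a scaling group (hypothetical helper).

    attached_ids: set of RDS instance IDs already attached to the group
    whitelist:    dict mapping RDS instance ID -> set of whitelisted IPs
    instance_ips: private IPs of the ECS instances in the scaling group
    """
    attached_ids.add(rds_id)  # adding an ID that is already present changes nothing
    whitelist.setdefault(rds_id, set()).update(instance_ips)
    return attached_ids, whitelist


# Hypothetical state: one RDS instance attached, one IP already whitelisted.
attached = {"rm-example1"}
acl = {"rm-example1": {"10.0.0.1"}}

# Attaching the same RDS instance again keeps the count at one
# but adds the private IPs of all group instances to the whitelist.
attach_rds(attached, acl, "rm-example1", ["10.0.0.1", "10.0.0.2"])
print(len(attached), sorted(acl["rm-example1"]))
```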
Note that the ApsaraDB for RDS instance to be attached to a scaling group must meet the following conditions:
Detach an ApsaraDB for RDS instance from a scaling group
When you detach an ApsaraDB for RDS instance from a scaling group, either of the following actions is performed based on the specified value of forceDetach:
You can set the value of forceDetach as needed. Note that you cannot repeat the removal operation for the same ApsaraDB for RDS instance.
As with scaling groups, scaling configurations can now be modified, so you do not need to create a new scaling configuration for each change. Auto Scaling Service also supports more ECS features in scaling configurations, such as the image-based default password.
Auto Scaling Service supports modification of the following parameters:
To ensure higher flexibility and elasticity, Auto Scaling Service supports four additional features, namely, UserData, KeyPair, RamRole, and Tags. UserData allows you to complete the automatic configuration process quickly and securely. When the number of ECS instances changes with the business needs, you can perform application-level scale-up and scale-down quickly and securely. Also, you can configure KeyPair, Tags, and other parameters for more efficient and intelligent ECS instance management.
Custom instance data (UserData)
UserData, custom instance data, is a feature provided by Alibaba Cloud ECS to customize instance startup behaviors and import data. This feature is compatible with Windows instances and Linux instances and is mainly used as:
To automatically scale application-level ECS instances up and down with Auto Scaling Service based on your business needs, you would typically use custom images or open-source IT infrastructure management tools, such as Terraform. Now that the UserData parameter is available in Auto Scaling Service, you only need to prepare custom script data (UserData) and import it into the scaling configuration as Base64-encoded data. When Auto Scaling Service scales ECS instances up or down, it automatically executes the custom instance script (UserData) for application-level scale-up and scale-down. Compared with custom images or other open-source tools, the native UserData feature of Auto Scaling Service makes automatic scale-up and scale-down faster and more secure.
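The Base64 step mentioned above can be sketched with the Python standard library. The script content and the `encode_user_data` helper are hypothetical; only the Base64 encoding itself comes from the text.

```python
import base64

# Hypothetical startup script; replace it with your own application bootstrap.
user_data_script = """#!/bin/sh
echo "instance started by Auto Scaling" >> /tmp/userdata.log
"""


def encode_user_data(script: str) -> str:
    """Return the Base64 text form of a script for use as UserData."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


encoded = encode_user_data(user_data_script)
print(encoded)
```

The resulting string is what you would supply as the UserData value of the scaling configuration.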
Pay attention to the following aspects when you create a scaling configuration and use the UserData parameter:
For more usages of UserData, see Alibaba Cloud Custom Instance Data.
SSH key pair (KeyPairName)
You can log on to a remote Linux server over SSH using a password or an SSH key. When you need to manage multiple server clusters, entering the password frequently is not only time consuming but also prone to logon failures caused by password input errors. In such cases, if you log on to the server using an SSH key, you only need to configure your public key and private key. The keys are valid for a long time once configured.
SSH key pairs created by Alibaba Cloud support only 2048-bit RSA keys. When Alibaba Cloud generates the SSH key pair, it keeps the public key and returns the private key to you.
The KeyPairName parameter in the auto scaling configuration allows you to log on to the server using the SSH key. When you create a scaling configuration, you can select the desired key pair name for the KeyPairName parameter. An ECS instance created using Auto Scaling Service keeps the public key. You only need to configure the private key on your local device, to log on to the server quickly using the SSH key.
Pay attention to the following aspects when you create a scaling configuration and use the KeyPairName parameter:
RAM role name (RamRoleName)
Resource Access Management (RAM) is a service provided by Alibaba Cloud for user identity management and access control. Using RAM, you can create and manage user accounts (for example, employees, systems, and applications) and control the operation permissions for resources under them. If multiple users in your enterprise collaboratively work with resources, RAM allows you to avoid having to share the AccessKey of your Alibaba Cloud account with other users. Instead, you can grant users the minimum permissions necessary for them to complete their work, reducing your enterprise's information security risks.
RAM supports creating roles that have different operation permissions for different cloud products. You can set RamRoleName, a new parameter for Auto Scaling Service, to specify different roles for your ECS instance. Then the ECS instance has different operation permissions for cloud products. Note that when you specify the RamRoleName parameter, you must make sure that the current RamRole policy allows your ECS instance to act as the specified role; otherwise, the scaling configuration cannot effectively make the ECS instance available.
The tag service is available for Alibaba Cloud ECS instances. That is, you can bind different tags to your ECS instance for category management.
You can query tags to obtain a list of matching ECS instances, and in turn, you can query ECS instances to obtain the matching tags. You can set Tags, a new parameter for Auto Scaling Service, to specify the tag pairs of all available ECS instances for category management. Each scaling configuration supports up to five pairs of tags currently. When the specified number of tag pairs exceeds five, the scaling configuration fails.
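Because a scaling configuration with more than five tag pairs fails, it can be convenient to check the count before submitting it. The following is a minimal sketch; the `validate_tags` helper and the tag names are hypothetical and not part of any SDK.

```python
MAX_TAG_PAIRS = 5  # limit stated in the text above


def validate_tags(tags: dict) -> dict:
    """Raise ValueError if more than five tag pairs are given (hypothetical pre-check)."""
    if len(tags) > MAX_TAG_PAIRS:
        raise ValueError(
            f"{len(tags)} tag pairs given; at most {MAX_TAG_PAIRS} are supported"
        )
    return tags


# Hypothetical tag pairs for illustration.
ok_tags = validate_tags({"env": "prod", "team": "web"})
print(ok_tags)
```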
The core of Auto Scaling lies in availability of ECS instances for horizontal scale-up. However, the inventory of cloud computing resources changes dynamically, and inventory shortage is always a problem we face. To maximize the creation success rate, Auto Scaling supports multiple instance types in addition to multi-zone scale-up. This is our edge over other vendors.
Support for multi-zone scale-up
Previously, only one VSwitch could be configured for each VPC scaling group, and a VSwitch belongs to one zone only. If the VSwitch could not create an ECS instance in its zone due to inventory shortage, the scaling configurations, scaling rules, and alarm tasks for the scaling group all became invalid. To solve this problem and improve scaling group availability, we added VSwitchIds.N, a new multi-zone parameter. You can set VSwitchIds.N to specify multiple VSwitches for your scaling group. If one VSwitch cannot create an ECS instance in its zone, Auto Scaling Service automatically switches to another zone. Pay attention to the following aspects when you use this parameter:
When the VSwitch with the highest priority cannot create an ECS instance in its zone, the system automatically selects the VSwitch with the next highest priority to create an ECS instance. When you create a scaling group using the VSwitchIds.N parameter, use the VSwitches in different zones but in the same region if possible. This can effectively reduce occurrence of the problem that one VSwitch cannot create an ECS instance in its zone, improving the scaling group availability.
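The priority-based fallback described above can be sketched as follows. This only models the selection logic: the `pick_vswitch` helper, the VSwitch IDs, and the inventory lookup are hypothetical, and the sketch assumes that list order stands for priority (first entry highest).

```python
def pick_vswitch(vswitch_ids, can_create):
    """Return the first VSwitch, in priority order, whose zone can create an instance.

    vswitch_ids: list ordered by priority (index 0 is assumed to be the highest)
    can_create:  callable taking a VSwitch ID, returning True if the zone has inventory
    """
    for vsw in vswitch_ids:
        if can_create(vsw):
            return vsw
    return None  # no zone currently has inventory


# Hypothetical VSwitch IDs and inventory state for illustration.
inventory = {"vsw-zone-a": False, "vsw-zone-b": True, "vsw-zone-c": True}
chosen = pick_vswitch(["vsw-zone-a", "vsw-zone-b", "vsw-zone-c"], inventory.get)
print(chosen)  # vsw-zone-b
```

Spreading the VSwitches over different zones matters here: if every entry pointed to the same zone, a single inventory shortage would make every lookup fail.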
Support for up to 10 instance types
Previously, only one scaling configuration could be valid in a scaling group at a time, and each scaling configuration supported only one instance type. As a result, only one instance type was valid in a scaling group. If that instance type was unavailable due to inventory shortage, the scaling group could not create an ECS instance, and you had to select another scaling configuration in the scaling group or create a new one to restore the scaling group. To solve this problem and improve scaling configuration availability, we added InstanceTypes.N, a new multi-instance-type parameter. You can set InstanceTypes.N to specify multiple instance types for your scaling configuration. If one instance type is unavailable due to inventory shortage, Auto Scaling Service automatically switches to another available instance type. Pay attention to the following aspects when you create a scaling configuration using the InstanceTypes.N parameter:
To meet high availability and disaster recovery requirements in multi-zone instance scenarios and thus ensure service stability and continuity, we provide the automatic multi-zone instance balancing function for Auto Scaling Service to reduce the impacts of force majeure.
Auto Scaling Service creates ECS instances in multiple zones of a region, which allows you to improve security and reliability by taking advantage of geographic redundancy.
Supported scope of the automatic multi-zone instance balancing function:
How to set the automatic multi-zone instance balancing function:
MultiAZPolicy, a new multi-zone elastic policy parameter, is added for a scaling group. The values are:
When this parameter is set to BALANCE, the instances in all zones are automatically balanced when the scaling group performs a scaling activity.
The default value is PRIORITY. The system scales up/down ECS instances based on the defined VSwitch priority (VSwitchIds.N). When the VSwitch with the highest priority cannot create an ECS instance in its zone, the system automatically uses the VSwitch with the next highest priority to create an ECS instance.
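To make the difference between the two values concrete, the sketch below models one possible BALANCE strategy: each new instance goes to the zone that currently has the fewest instances. The actual balancing algorithm is not described in this article, so the `distribute_balanced` helper, the zone names, and the tie-breaking rule are assumptions for illustration only.

```python
def distribute_balanced(current_counts, new_instances):
    """Assign new instances one at a time to the zone with the fewest instances.

    current_counts: dict mapping zone name -> current instance count
    Returns a new dict with the counts after the scale-out.
    """
    counts = dict(current_counts)
    for _ in range(new_instances):
        # Ties are resolved alphabetically (an assumption of this sketch).
        zone = min(sorted(counts), key=counts.get)
        counts[zone] += 1
    return counts


# Hypothetical starting distribution across three zones.
result = distribute_balanced({"zone-a": 3, "zone-b": 1, "zone-c": 0}, 5)
print(result)  # {'zone-a': 3, 'zone-b': 3, 'zone-c': 3}
```

Under PRIORITY, by contrast, all five instances would go to the highest-priority zone that still has inventory.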
In the following cases, the ECS instances in a scaling group may not be balanced between different zones:
In any of the preceding cases, you can execute RebalanceInstances to rebalance the ECS instances in the scaling group.
We provide three new management functions to allow users to manage their ECS instances more flexibly and meet the requirements in some special scenarios:
Support for standby status operations
You are unable to control the lifecycle of ECS instances managed in scaling groups. The scaling groups release unhealthy ECS instances, which also keeps you from performing halt-related operations on scaled ECS instances. As a result, you are unable to take full advantage of the elasticity provided by ECS.
The standby status operations can meet the requirements in the following scenarios:
We have launched the LifecycleHook feature to allow users to manage ECS instances in scaling groups more flexibly. The LifecycleHook feature suspends the scaling activity that occurs in a scaling group to perform custom operations.
Specifically, the LifecycleHook feature suspends the ECS instance that is scaling or to be released, to perform custom operations. This allows users to manage the lifecycle of ECS instances in scaling groups more flexibly. The following are several simple application scenarios of the LifecycleHook feature:
In the second scenario, if the maximum processing time for each request can be determined, you can call the CreateLifecycleHook API to create a lifecycle hook. Set the value of LifecycleTransition to SCALE_IN and the value of HeartbeatTimeout to the maximum processing time, without setting a notification object. When a scaling activity occurs, the LifecycleHook feature suspends the ECS instance for the specified period (HeartbeatTimeout) after the ECS instance is removed from SLB, to allow all requests to be processed.
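The parameter choices for this scenario can be collected in a small helper. The sketch below only builds a parameter dictionary and makes no API call. `LifecycleTransition` and `HeartbeatTimeout` come from the text above; the `ScalingGroupId` and `LifecycleHookName` keys, the helper name, and the sample values are assumptions.

```python
def build_scale_in_hook(scaling_group_id, hook_name, max_request_seconds):
    """Build request parameters for a scale-in lifecycle hook (no API call is made)."""
    if max_request_seconds <= 0:
        raise ValueError("max_request_seconds must be positive")
    return {
        "ScalingGroupId": scaling_group_id,       # assumed parameter name
        "LifecycleHookName": hook_name,           # assumed parameter name
        "LifecycleTransition": "SCALE_IN",        # value given in the text
        "HeartbeatTimeout": max_request_seconds,  # maximum request processing time
        # No notification object is set, matching the scenario above.
    }


# Hypothetical group ID and hook name; 120 seconds is an example timeout.
params = build_scale_in_hook("asg-example", "drain-requests", 120)
print(params)
```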
To allow users to trigger auto scaling events from more monitoring dimensions, we have increased the number of metrics from 6 to 13. Custom metrics are also supported.
Alarm tasks in Auto Scaling
Alarm tasks in Auto Scaling represent in-depth cooperation with Cloud Monitor Service (CMS), which provides a dynamic scaling group management mode. Similar to scheduled tasks in Auto Scaling, alarm tasks in Auto Scaling enable triggering specified scaling rules to perform scaling activities. In this way, the number of ECS instances in a scaling group can be adjusted.
|Metric|Unit|
|---|---|
|System disk write BPS|Byte/s|
|System disk read BPS|Byte/s|
|System disk write IOPS|Count/s|
|System disk read IOPS|Count/s|
|Number of packets transmitted from Internet NIC (classic network)|Count/s|
|Number of packets received by Internet NIC (classic network)|Count/s|
|Number of packets transmitted from intranet NIC|Count/s|
|Number of packets received by intranet NIC|Count/s|
|Total TCP connections|Count|
|Number of established TCP connections|Count|
Scheduled tasks execute the specified scaling rule at the specified time, which lets you respond in advance when business scenarios are time-predictable. However, they are insufficient when business scenarios are unexpected or not time-predictable. In such cases, alarm tasks provide a more flexible way to trigger scaling rules. Auto Scaling Service adds ECS instances to a scaling group to share business loads during peak traffic periods and releases ECS instances in the scaling group during off-peak periods to reduce production costs.
Alarm tasks monitor specific metrics to collect statistics on data metrics in real time. When the statistics meet the specified alarm condition, this function triggers an alarm for executing the specified scaling rule. By using alarm tasks, you can adjust the number of ECS instances in a scaling group based on business changes in real time, to ensure that the metrics are within the expected range.
System metric-based alarm tasks in Auto Scaling
The metrics for alarm tasks in Auto Scaling are ECS instance data metrics (such as CPU and load) collected by CloudMonitor. The metric-based alarm tasks in Auto Scaling use scaling groups as monitoring granularity. That is, the average metrics of all instances in a scaling group are the metrics of the scaling group. The metrics are updated as the number of instances in the scaling group changes.
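The group-level averaging described above can be illustrated with a short sketch. The sample values, the threshold, and the "at or above" comparison are assumptions; real alarm conditions are configured in CloudMonitor.

```python
from statistics import mean


def group_metric(instance_values):
    """Scaling-group metric: the average of the metric over all instances in the group."""
    return mean(instance_values)


def alarm_triggered(instance_values, threshold):
    """Hypothetical alarm condition: the group average is at or above the threshold."""
    return group_metric(instance_values) >= threshold


cpu = [90.0, 70.0, 80.0]           # hypothetical CPU utilization per instance, in percent
print(group_metric(cpu))           # 80.0
print(alarm_triggered(cpu, 75.0))  # True

cpu_after = cpu + [40.0]           # the group scaled out by one instance
print(group_metric(cpu_after))     # 70.0
```

The last line shows why the group metric changes when the instance count changes: the new instance takes part of the load and lowers the average.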
New system metrics:
Custom metric-based alarm tasks in Auto Scaling
The monitoring objects of custom metric-based alarm tasks in Auto Scaling are the metrics users reported to CloudMonitor independently. In some scenarios, the system metrics may not include your desired metrics, and you may have an internal monitoring system for some metrics related to your specific business. In these scenarios, the custom metric-based alarm tasks provide alarm task access points for your internal monitoring system or business-related metrics.
The custom metric-based alarm tasks in Auto Scaling are set based on the custom metrics in Alibaba Cloud CloudMonitor. Before using Auto Scaling to create custom metric-based alarm tasks, users should report custom monitoring data, that is, custom metrics, to CloudMonitor. CloudMonitor metric customization is a service that enables users to freely customize metrics and alarm rules. By using this service, users can monitor the business indicators they care about, report the collected monitoring data to CloudMonitor for processing, and set alarm rules.
To further improve user experience, we have replaced the previous SMS + email notification mode with a new mode that allows you to choose recipients, select notification tools (DingTalk + SMS + email), and edit the receipt content. The programmable notification modes Topic and Queue are also supported.
The event notification function of Auto Scaling supports event notifications at the scaling group level. It allows you to configure event notifications for your scaling groups and also types of scaling activities to be notified. When a scaling activity occurs, the event notification function pushes the details of the scaling activity to the notification object you configured. Currently, the event notification function supports three types of notification objects and five types of scaling activities. The event notification function allows you to immediately learn the changes of instances in a scaling group, to monitor the scaling group information in real time.
Scaling activity types supported by the event notification function
When you create an event notification, you need to set the scaling activity type that triggers an event notification. When a scaling activity occurs in the scaling group, the event notification is triggered and the details of the scaling activity are sent to the notification object you configured.
The event notification function currently supports five types of scaling activities:
In the preceding scaling activities, AUTOSCALING:SCALE_OUT_SUCCESS and AUTOSCALING:SCALE_IN_SUCCESS include partial success and full success. You can determine whether a scaling activity belongs to partial success or full success based on the activity details from its event notification. Or, you can use the DescribeNotificationTypes API to query the scaling activity types supported by the event notification function.
Notification modes supported by the event notification function
When a scaling activity triggers the event notification function, the function reports the details of the scaling activity to a notification object. Currently, this function supports three notification modes:
Preemptive instances are post-paid instances whose price fluctuates with supply and demand. They are offered at a deeper discount than Pay-As-You-Go instances. To purchase a preemptive ECS instance, you set your highest bid price for the instance. The market price is currently 10% to 100% of the price of a Pay-As-You-Go instance. In this mode, you can purchase a preemptive ECS instance at a price not higher than your highest bid price. When the market price is higher than your highest bid price, Alibaba Cloud does not create a preemptive ECS instance for you, which allows you to keep the production cost within the expected range.
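The bidding rule above reduces to a single comparison. The sketch below is a simplified model: the `can_create_preemptive` helper and the prices are hypothetical, and it ignores the supply-side conditions that also affect whether an instance is created.

```python
def can_create_preemptive(market_price, max_bid):
    """An instance is created only when the market price does not exceed the highest bid."""
    return market_price <= max_bid


# Hypothetical hourly prices for illustration.
print(can_create_preemptive(market_price=0.03, max_bid=0.05))  # True
print(can_create_preemptive(market_price=0.08, max_bid=0.05))  # False
```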
Note that the low price of preemptive instances comes with certain risks. When the market price is higher than your highest bid price or when supply and demand are seriously imbalanced, Alibaba Cloud has the right to release your ECS instance. Alibaba Cloud notifies you through instance metadata five minutes before releasing your ECS instance. You can subscribe to Alibaba Cloud metadata to save and clear data in time.
Compared with Pay-As-You-Go instances of the same type, preemptive instances can significantly reduce your server costs. They are the best choice in the following application scenarios:
The new functions of Auto Scaling Service aim to make it easier for you to deal with business load changes while keeping TCO low as your business grows. To learn more about Auto Scaling, visit www.alibabacloud.com/product/auto-scaling