Function Compute provides two instance usage modes: pay-as-you-go (on-demand) mode and reserved (provisioned) mode. In reserved mode, a fixed number of reserved instances can leave instances underutilized, and auto scaling addresses this problem. This topic describes how the two modes work, their advantages and disadvantages, and their billing methods. It also describes how to configure reserved instances in the Function Compute console and how to configure auto scaling for reserved instances.

On-demand mode

In pay-as-you-go mode, Function Compute is fully responsible for allocating and releasing function instances. When a function invocation request arrives, the system dynamically schedules resources and provides a flexible and reliable execution environment, which greatly reduces the effort of managing application resources.

However, cold starts are unavoidable when resources are scheduled dynamically, which negatively affects online applications that are sensitive to response latency.

Billing method: Instances are allocated only when function invocation requests arrive. If no requests are made, no instances are allocated and no fees are charged.

Provisioned mode

In reserved mode, you manage the allocation and release of function instances yourself. After you reserve function instances, Function Compute preferentially forwards invocation requests to them. When the request peak exceeds the processing capacity of the reserved instances, the excess requests are forwarded to pay-as-you-go instances, for which Function Compute automatically allocates an execution environment.
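The routing behavior described above can be sketched as follows. This is a simplified model and not the actual Function Compute scheduler: the function name split_requests and the assumption that every reserved instance handles a fixed number of concurrent requests are illustrative only.

```python
def split_requests(concurrent_requests: int,
                   reserved_instances: int,
                   per_instance_concurrency: int = 1) -> tuple[int, int]:
    """Return (handled_by_reserved, overflow_to_pay_as_you_go).

    Hypothetical model: reserved instances are filled first, and whatever
    exceeds their capacity spills over to pay-as-you-go instances.
    """
    if concurrent_requests < 0 or reserved_instances < 0:
        raise ValueError("request and instance counts must be non-negative")
    if per_instance_concurrency < 1:
        raise ValueError("per-instance concurrency must be at least 1")
    reserved_capacity = reserved_instances * per_instance_concurrency
    handled_by_reserved = min(concurrent_requests, reserved_capacity)
    overflow = concurrent_requests - handled_by_reserved
    return handled_by_reserved, overflow


if __name__ == "__main__":
    # 10 reserved instances, each serving 10 concurrent requests, under a
    # peak of 120 concurrent requests.
    print(split_requests(120, 10, 10))
```

In this model, only the overflow portion would be exposed to cold starts, because the reserved instances are already running.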

An instance in reserved mode is ready for use after it is created. This eliminates the impacts caused by cold start.

Billing method: An instance in reserved mode is billed for the time that it runs. The execution duration is measured from the time when Function Compute starts the reserved instance until you release it. Therefore, you are charged for an unreleased reserved instance even if it does not process any requests. For more information about pricing and billing, see Billing.

Prerequisites

Configure Reserved Instances

  1. Log on to the Function Compute console.
  2. In the left-side navigation pane, click Services and Functions.
  3. In the top navigation bar, select the region where the service resides.
  4. On the Services page, find the target service and choose More > Auto Scaling (including reserved resources) in the Actions column.
  5. On the Auto Scaling (including reserved resources) page, click Create Rule.
  6. On the Create Auto Scaling Restriction Rule page, set the parameters and click Create.
    Parameter                      Description
    Basic configuration
    Version or alias               Select the version or alias for which to create reserved instances from the list.
                                   Note: You can create reserved instances only for the LATEST version. Other versions are not supported.
    Function                       The function that the reserved instances execute.
    Minimum number of instances    The number of reserved instances to create. Minimum number of instances = Number of reserved instances.
                                   Note: By limiting the minimum number of instances at the function level, you can respond quickly to function invocation requests, reduce the number of cold starts, and provide better response times for latency-sensitive online businesses.
    Maximum number of instances    Enter the maximum number of instances in the text box. Maximum number of instances = Number of reserved instances + Maximum number of pay-as-you-go instances.
                                   Note: By limiting the maximum number of instances at the function level, you can prevent a single function from consuming too many instances because of excessive invocations, protect back-end resources, and avoid unexpected costs.
    (Optional) Timed modification limit: You can configure reserved function instances more flexibly by setting timed scaling. The number of reserved function instances is set to the required value at a specified time, so that the number of function instances better fits the concurrency of the business.
    Policy name                    Enter a custom policy name in the text box.
    Minimum number of instances    Enter the number of reserved instances that you need in the text box.
    Timed expression (UTC)         The schedule expression. Example: cron(0 0 20 * * *). For more information, see Parameter description.
    Effective Time (UTC)           In the text box, set the start time and end time of the period in which the scheduled Auto Scaling policy takes effect.
    (Optional) Modify limits based on metrics: The reserved resources are scaled every minute based on the concurrency utilization of the function instances.
    Policy name                    Enter a custom policy name in the text box.
    Minimum Instance Range         Enter the lower and upper bounds for the minimum number of instances in the text box.
    Utilization Threshold          The utilization threshold that triggers scaling. Scale-in is performed when the utilization is lower than this value. Scale-out is performed when the utilization is higher than this value.
    Effective Time (UTC)           In the text box, set the start time and end time of the period in which the metric-based Auto Scaling policy takes effect.
    You can view the created provisioned instances on the Provisioned Resources tab.

Updating Reserved Instances

  1. Log on to the Function Compute console.
  2. In the left-side navigation pane, click Services and Functions.
  3. In the top navigation bar, select the region where the service resides.
  4. On the Services page, find the target service and choose More > Auto Scaling (including reserved resources) in the Actions column.
  5. On the Auto Scaling (including reserved resources) page, find the target rule and click Edit in the Actions column.
    Note To delete provisioned instances, set the Reserved Instances parameter to 0.
  6. On the Create Auto Scaling Limit Rule page, modify the basic configuration, the timed modification limit, or the metric-based limit, and then click Save.

Configure Auto Scaling for Reserved Instances

You can use scheduled auto-scaling or metric tracking auto-scaling to make better use of reserved instances.

Scheduled Auto Scaling

  • Definition: Scheduled auto-scaling lets you configure reserved instances flexibly. The number of reserved instances is automatically adjusted to a specified value at a specified time so that it matches the concurrency of your business.
  • Scenario: Use scheduled auto-scaling to reserve instances ahead of periodic or predictable traffic peaks. When the concurrency of a function exceeds what the reserved instances can handle, the excess requests are served by pay-as-you-go instances and billed on a pay-as-you-go basis.
  • Example of configuration: Two scheduled operations are configured. The first scales out the reserved instances before the traffic peak, and the second scales in the reserved instances after the traffic peak.
Example of parameter settings:
  • In this example, scheduled auto-scaling is configured for a function named function_1 in a service named service_1. The configuration takes effect from 10:00:00 on November 1, 2020 to 10:00:00 on November 30, 2020 (UTC). The number of reserved instances is scaled out to 50 at 20:00 every day and scaled in to 10 at 22:00 every day.
    {
      "ServiceName": "service_1",
      "FunctionName": "function_1",
      "Qualifier": "alias_1",
      "SchedulerActions": [
        {
          "Name": "action_1",
          "StartTime": "2020-11-01T10:00:00Z",
          "EndTime": "2020-11-30T10:00:00Z",
          "TargetValue": 50,
          "ScheduleExpression": "cron(0 0 20 * * *)"
        },
        {
          "Name": "action_2",
          "StartTime": "2020-11-01T10:00:00Z",
          "EndTime": "2020-11-30T10:00:00Z",
          "TargetValue": 10,
          "ScheduleExpression": "cron(0 0 22 * * *)"
        }
      ]
    }
  • The following table describes the parameters.
    Parameter             Description
    Name                  The name of the scheduled auto-scaling task.
    StartTime             The time when the configuration starts to take effect. Specify the value in UTC.
    EndTime               The time when the configuration expires. Specify the value in UTC.
    TargetValue           The target number of reserved instances.
    ScheduleExpression    The expression that specifies when to run the scheduled task. The following formats are supported:
    • At expressions - "at(yyyy-mm-ddThh:mm:ss)": runs the scheduled task only once. Specify the value in UTC.
    • Cron expressions - "cron(0 0 4 * * *)": runs the scheduled task repeatedly, using a crontab-style format. By default, the expression is evaluated in UTC, that is, Beijing time minus 8 hours. For example, to run a task at 04:00 Beijing time every day, convert the time to 20:00 UTC and use cron(0 0 20 * * *). To schedule in a specific time zone instead, prefix the expression with CRON_TZ. For example, to run a task at 04:00 Beijing time on the 1st of each month, use CRON_TZ=Asia/Shanghai 0 0 4 1 * *.
    The following tables describe the fields and special characters of the CRON expression (Seconds Minutes Hours Day-of-month Month Day-of-week).
    Table 1. Field description
    Field           Value range                Allowed special characters
    Seconds         0 to 59                    N/A
    Minutes         0 to 59                    , - * /
    Hours           0 to 23                    , - * /
    Day-of-month    1 to 31                    , - * ? /
    Month           1 to 12 or JAN to DEC      , - * /
    Day-of-week     1 to 7 or MON to SUN       , - * ?
    Table 2. Special characters
    Character    Definition                                                            Example
    *            Indicates any or each value.                                          In the Minutes field, * indicates that the task runs every minute.
    ,            Indicates a list of values.                                           In the Day-of-week field, MON,WED,FRI indicates Monday, Wednesday, and Friday.
    -            Indicates a range.                                                    In the Hours field, 10-12 indicates the range from 10:00 to 12:00 in UTC.
    ?            Indicates an unspecified value, used with other specified values.    If you specify a day of the month and do not care which day of the week it falls on, use ? in the Day-of-week field.
    /            Indicates an increment: n/m starts at n and adds m each time.         In the Minutes field, 3/5 indicates that the task runs every 5 minutes starting from minute 3.
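The UTC conversion and the special characters described above can be illustrated with a small sketch. Both helper names (daily_cron_in_utc and expand_field) are hypothetical and are not part of any Function Compute SDK. The parser covers only numeric fields with the *, comma, hyphen, and slash characters, and the converter assumes a time zone with a fixed UTC offset, such as Beijing time (UTC+8).

```python
from datetime import datetime, timedelta, timezone


def daily_cron_in_utc(hour: int, minute: int, utc_offset_hours: int) -> str:
    """Build a daily cron(...) expression in UTC from a local wall-clock time."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    # The date is arbitrary; only the time-of-day conversion matters here.
    local = datetime(2020, 11, 1, hour, minute, tzinfo=local_tz)
    utc = local.astimezone(timezone.utc)
    return f"cron(0 {utc.minute} {utc.hour} * * *)"


def expand_field(expr: str, low: int, high: int) -> list[int]:
    """Expand one numeric cron field into the sorted values it matches.

    Handles "*", lists ("a,b"), ranges ("a-b"), and increments ("n/m").
    Names such as MON or JAN and the "?" character are not handled.
    """
    values = set()
    for part in expr.split(","):
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, end = low, high
        elif "-" in base:
            start_text, end_text = base.split("-")
            start, end = int(start_text), int(end_text)
        else:
            start = int(base)
            # "n/m" runs from n to the end of the range; a bare "n" is just n.
            end = high if step_text else start
        if not (low <= start <= end <= high) or step < 1:
            raise ValueError(f"invalid cron field: {expr!r}")
        values.update(range(start, end + 1, step))
    return sorted(values)


if __name__ == "__main__":
    print(daily_cron_in_utc(4, 0, 8))    # 04:00 Beijing time, expressed in UTC
    print(expand_field("3/5", 0, 59))    # Minutes field: every 5 minutes from minute 3
    print(expand_field("10-12", 0, 23))  # Hours field: a range
```

Expanding a field into explicit values is a convenient way to check an expression before you save it in a scaling rule.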

Metric tracking Auto Scaling

  • Definition: Metric tracking auto-scaling dynamically scales reserved instances based on a tracked metric.
  • Scenario: Function Compute periodically collects the concurrency usage rate of reserved instances and compares it with the scale-out and scale-in thresholds that you configure to control the scaling of reserved instances. In this way, the number of reserved instances follows your business needs.
  • Principle: Reserved instances are scaled in or out every minute based on the metric.
    • When the metric exceeds the scale-out threshold, the system scales out the number of instances to the target value as quickly as possible.
    • When the metric is lower than the scale-in threshold, the system scales in the number of instances gradually toward the target value.
    If the maximum and minimum numbers of reserved instances are configured, the system keeps the number of reserved instances between them. Scaling stops when the number of instances reaches either bound.
  • Example of configuration:
    • When traffic increases and the concurrency usage rate reaches the scale-out threshold (80% in this example), the number of reserved instances starts to scale out until it reaches the configured maximum. Requests that the reserved instances cannot process are sent to pay-as-you-go instances.
    • When traffic decreases and the concurrency usage rate falls to the scale-in threshold (60% in this example), the number of reserved instances starts to scale in.
Only reserved instances are counted when the concurrency usage rate is calculated. Pay-as-you-go instances are excluded. The metric is calculated based on the following formula: Concurrency usage rate = Number of concurrent requests that reserved instances are processing / Maximum number of concurrent requests that all reserved instances can process. The metric value ranges from 0 to 1. The maximum number of concurrent requests depends on the instance concurrency setting:
  • Single request processed by one instance: Maximum concurrency = Number of instances.
  • Multiple requests processed by one instance: Maximum concurrency = Number of instances × Number of requests concurrently processed by one instance.
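As a minimal sketch of the formula above, the following function computes the concurrency usage rate of reserved instances. The function and parameter names are illustrative assumptions. The input is the number of requests currently being processed by reserved instances only, which is why the function rejects values above the maximum concurrency.

```python
def provisioned_concurrency_utilization(in_flight_requests: int,
                                        reserved_instances: int,
                                        per_instance_concurrency: int = 1) -> float:
    """Concurrency usage rate of reserved instances, a value from 0 to 1.

    in_flight_requests counts only requests handled by reserved instances;
    requests served by pay-as-you-go instances are excluded.
    """
    max_concurrency = reserved_instances * per_instance_concurrency
    if max_concurrency <= 0:
        raise ValueError("at least one reserved instance is required")
    if not 0 <= in_flight_requests <= max_concurrency:
        raise ValueError("in-flight requests must be within the reserved capacity")
    return in_flight_requests / max_concurrency


if __name__ == "__main__":
    # One request per instance: 8 busy instances out of 10.
    print(provisioned_concurrency_utilization(8, 10))
    # Ten requests per instance: 45 in-flight requests across 10 instances.
    print(provisioned_concurrency_utilization(45, 10, 10))
```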
Scale-in and scale-out values:
  • The values are determined by the current metric value, scaling threshold, number of reserved instances, and scaling factor.
  • Calculation principle: The system scales in based on a scale-in factor. Value range: (0,1]. You do not need to set this factor. The scale-in and scale-out values are rounded-up integers of the following calculation results:
    • Scale-out value = (Current metric value/Scale-out threshold) × Number of reserved instances.
    • Scale-in ratio = (1 - Current metric value/Scale-out threshold) × Scale-in factor.
    • Scale-in value = Current number of instances × (1 - Scale-in ratio).
  • Example: The current metric value is 90%, the scale-out threshold is 80%, and the number of reserved instances is 100. The scale-out value = (90%/80%) × 100 = 112.5, which is rounded up to 113. The number of reserved instances is increased to 113.
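The calculation above can be reproduced with the following sketch. It applies the formulas exactly as they are stated in this topic and uses exact fractions so that rounding up is not distorted by floating-point error. The scale-in factor is chosen by the system and is not configurable, so the value 1/2 used here is purely hypothetical, as are the function names and the clamp helper.

```python
import math
from fractions import Fraction


def scale_out_value(metric_pct: int, threshold_pct: int, reserved_instances: int) -> int:
    """Scale-out value = (metric / threshold) x reserved instances, rounded up."""
    return math.ceil(Fraction(metric_pct, threshold_pct) * reserved_instances)


def scale_in_value(metric_pct: int, threshold_pct: int, current_instances: int,
                   scale_in_factor: Fraction) -> int:
    """Scale-in value = current instances x (1 - scale-in ratio), rounded up."""
    if not 0 < scale_in_factor <= 1:
        raise ValueError("the scale-in factor must be in (0, 1]")
    ratio = (1 - Fraction(metric_pct, threshold_pct)) * scale_in_factor
    return math.ceil(current_instances * (1 - ratio))


def clamp(value: int, min_capacity: int, max_capacity: int) -> int:
    """Keep a computed value between the configured minimum and maximum."""
    return max(min_capacity, min(value, max_capacity))


if __name__ == "__main__":
    # The example from this topic: metric 90%, threshold 80%, 100 instances.
    print(scale_out_value(90, 80, 100))
    # Hypothetical scale-in: metric 40%, threshold 80%, factor 1/2.
    print(scale_in_value(40, 80, 100, Fraction(1, 2)))
    # If a maximum of 100 is configured, the scale-out result is capped.
    print(clamp(scale_out_value(90, 80, 100), 10, 100))
```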
Example of parameter settings:
  • In this example, a function named function_1 in a service named service_1 is configured to automatically scale in and out based on the ProvisionedConcurrencyUtilization metric. The configuration takes effect from 10:00:00 on November 1, 2020 to 10:00:00 on November 30, 2020 (UTC). When the concurrency usage rate exceeds 60%, the number of reserved instances can scale out to at most 100. When the concurrency usage rate is lower than 60%, the number of reserved instances can scale in to no fewer than 10.
    {
      "ServiceName": "service_1",
      "FunctionName": "function_1",
      "Qualifier": "alias_1",
      "TargetTrackingPolicies": [
        {
          "Name": "action_1",
          "StartTime": "2020-11-01T10:00:00Z",
          "EndTime": "2020-11-30T10:00:00Z",
          "MetricType": "ProvisionedConcurrencyUtilization",
          "MetricTarget": 0.6,
          "MinCapacity": 10,
          "MaxCapacity": 100
        }
      ]
    }
  • The following table describes the parameters.
    Parameter             Description
    Name                  The name of the metric tracking auto-scaling policy.
    StartTime             The time when the configuration starts to take effect. Specify the value in UTC.
    EndTime               The time when the configuration expires. Specify the value in UTC.
    MetricType            The tracked metric: ProvisionedConcurrencyUtilization.
    MetricTarget          The target value of the tracked metric.
    MinCapacity           The minimum number of reserved instances, which is the lower bound for scale-in.
    MaxCapacity           The maximum number of reserved instances, which is the upper bound for scale-out.
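Before you submit a metric tracking configuration, it can help to check it locally. The following sketch builds one policy entry with the field names used in the example above and rejects obviously inconsistent values. The helper name and the validation rules (a target in the range (0, 1], MinCapacity not greater than MaxCapacity, StartTime before EndTime) are assumptions made for illustration and are not documented service-side checks.

```python
import json
from datetime import datetime

UTC_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # for example 2020-11-01T10:00:00Z


def build_target_tracking_policy(name: str, start_time: str, end_time: str,
                                 metric_target: float,
                                 min_capacity: int, max_capacity: int) -> dict:
    """Return one TargetTrackingPolicies entry after basic sanity checks."""
    if not 0 < metric_target <= 1:
        raise ValueError("MetricTarget must be in the range (0, 1]")
    if not 0 <= min_capacity <= max_capacity:
        raise ValueError("MinCapacity must not exceed MaxCapacity")
    # strptime raises ValueError if a timestamp does not match the format.
    if datetime.strptime(start_time, UTC_FORMAT) >= datetime.strptime(end_time, UTC_FORMAT):
        raise ValueError("StartTime must be earlier than EndTime")
    return {
        "Name": name,
        "StartTime": start_time,
        "EndTime": end_time,
        "MetricType": "ProvisionedConcurrencyUtilization",
        "MetricTarget": metric_target,
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }


if __name__ == "__main__":
    config = {
        "ServiceName": "service_1",
        "FunctionName": "function_1",
        "Qualifier": "alias_1",
        "TargetTrackingPolicies": [
            build_target_tracking_policy(
                "action_1", "2020-11-01T10:00:00Z", "2020-11-30T10:00:00Z",
                0.6, 10, 100),
        ],
    }
    # json.dumps always emits valid JSON, so no trailing commas can slip in.
    print(json.dumps(config, indent=2))
```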