Function Compute: Configure provisioned instances

Last Updated: Apr 10, 2024

Provisioned instances reduce the request latency caused by instance cold starts during peak hours. You can also configure an auto scaling policy, such as a scheduled scaling policy or a water-level scaling policy, for provisioned instances to improve resource utilization and avoid resource waste.

Limits

The following table shows the limits on the scale-out rate of provisioned instances in different regions.

Region | Upper limit of burst instances | Upper limit of instance growth rate
China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), and China (Shenzhen) | 300 | 300 per minute
Other regions | 100 | 100 per minute

Note

To request a higher upper limit on the scale-out rate, join the DingTalk user group 11721331 for technical support.
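The limits in the table can be turned into a rough estimate of how long a large scale-out takes. The following is a minimal sketch, not an official formula. It assumes that up to the burst limit of instances can be created right away and that the remainder arrives at the growth rate per minute, which this topic does not state explicitly. The names `LIMITS` and `estimate_scale_out_minutes` are hypothetical.

```python
import math

# Limits copied from the table above: (burst instances, growth per minute).
LIMITS = {
    "main": (300, 300),   # Hangzhou, Shanghai, Beijing, Zhangjiakou, Shenzhen
    "other": (100, 100),  # all other regions
}


def estimate_scale_out_minutes(target: int, region_group: str = "other") -> int:
    """Estimate the minutes needed to reach `target` provisioned instances.

    Assumption (not confirmed by this topic): up to `burst` instances are
    created immediately, and the rest arrive at `growth` instances per minute.
    """
    burst, growth = LIMITS[region_group]
    if target <= burst:
        return 0
    return math.ceil((target - burst) / growth)


if __name__ == "__main__":
    for group in ("main", "other"):
        print(group, estimate_scale_out_minutes(1000, group))
```

Under these assumptions, a target well above the burst limit takes noticeably longer outside the five listed regions, which is worth considering when you choose the trigger time of a scheduled policy.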

Configure provisioned instances

Step 1: Create a provisioned instance policy

You can create a provisioned instance policy by using one of the following methods:

  • Configure a provisioned instance policy on the Function Details tab of the function details page. This method is used in this topic.

  • Choose Advanced Features > Auto Scaling and create a provisioned instance policy on the Provisioned Instance Policy tab.

  1. Log on to the Function Compute console. In the left-side navigation pane, click Function.

  2. In the top navigation bar, select a region. On the Function page, click the name of the function that you want to manage.

  3. On the Function Details tab, click the Configuration tab.

  4. In the left-side navigation tree, click the Provisioned Instances tab and click Create Provisioned Instance Policy.

  5. In the Create Provisioned Instance Policy panel, configure the parameters and click OK.

    Parameter

    Description

    Version or Alias

    Select the version or alias for which you want to create a provisioned instance policy.

    Note

    You can create a provisioned instance policy only for the LATEST version.

    Provisioned Instances

    Specify the number of provisioned instances.

    Note

    Provisioned instances help respond quickly to function invocation requests, reduce cold starts, and improve service performance for online applications that are sensitive to response latency. Note that you are charged for these instances until you release them, even if they do not process any requests.

    (Optional) Scheduled Scaling: Select this option to configure a scheduled scaling policy that scales the number of function instances at specified points in time. For more information about the scenario and a configuration example, see Scheduled scaling.

    Policy Name

    Enter a custom policy name.

    Provisioned Instances

    Specify the number of provisioned instances to scale to when the scheduled policy is triggered.

    Note

    After you configure this parameter, its value overrides the Provisioned Instances value configured earlier in this section.

    Trigger Mode

    You can select At Time Points or Custom CRON expression.

    • At Time Points: Specify the Time (UTC), Date (UTC), and Week Day (UTC) parameters as prompted.

    • Custom CRON expression: Specify the Schedule Expression (UTC) parameter. In this example, cron(0 0 4 * * *) is configured to trigger scaling at 12:00 (UTC+8) every day.

    Note

    The schedule must be specified in UTC, so 12:00 (UTC+8) is written as 4:00 in this example.

    Effective Time (UTC)

    Set the time when the scaling configurations start to take effect and the time when the scaling configurations expire.

    (Optional) Water-level Scaling: You can select this option to scale function instances every minute based on the concurrency usage metric or another usage metric. For more information about the scenario and configuration example of water-level scaling, see Water-level scaling.

    Policy Name

    Enter a custom policy name.

    Minimum Number of Provisioned Instances

    Specify the minimum number of provisioned instances to keep when the policy scales in.

    Maximum Number of Provisioned Instances

    Specify the maximum number of provisioned instances that the policy can scale out to.

    Utilization Type

    Select the metric based on which instances are scaled.

    Note

    This parameter is displayed only when GPU-accelerated instances are configured.

    Concurrency Usage Threshold/Usage Threshold

    Configure a usage threshold.

    • If the usage of a metric for an instance or the concurrency usage falls below the configured threshold, a scale-in is triggered.

    • If the usage of a metric for an instance or the concurrency usage reaches or exceeds the configured threshold, a scale-out is triggered.

    Effective Time (UTC)

    Set the time when the scaling configurations start to take effect and the time when the scaling configurations expire.

    After the policy is created, you can view the policy in the policy list of the function.
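Because the Custom CRON expression in a scheduled scaling policy must be written in UTC, converting a local trigger time by hand is an easy place to make a mistake. The following is a minimal sketch of that conversion for a daily schedule. The six-field order (seconds, minutes, hours, day, month, day of week) is inferred from the `cron(0 0 4 * * *)` example in this topic and is an assumption, and the helper name `daily_cron_in_utc` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def daily_cron_in_utc(local_hour: int, local_minute: int,
                      utc_offset_hours: float) -> str:
    """Build a daily cron expression in UTC from a local time of day.

    Field order is assumed from the example in this topic:
    seconds, minutes, hours, day of month, month, day of week.
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    # The date is arbitrary; only the time of day matters for a daily schedule.
    local_time = datetime(2024, 1, 1, local_hour, local_minute, tzinfo=local_tz)
    utc_time = local_time.astimezone(timezone.utc)
    return f"cron(0 {utc_time.minute} {utc_time.hour} * * *)"


if __name__ == "__main__":
    # 12:00 in UTC+8 corresponds to 4:00 UTC, as in the example above.
    print(daily_cron_in_utc(12, 0, 8))
```

This sketch only covers schedules that run every day. If a schedule is restricted to specific dates or days of the week, the conversion to UTC can also shift the day, which this helper does not handle.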

Step 2: Verify the policy

You can verify that the policy takes effect by checking the number of provisioned instances in the monitoring data after the trigger condition of the policy is met.

  1. On the Function Details tab, click the Monitoring tab.

  2. On the Function Metrics tab, view the data in the Function Provisioned Instances (count) card to check whether the policy takes effect.

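When you verify a water-level scaling policy, it helps to know in which direction the instance count should move. The following is a minimal sketch of the threshold rule described in Step 1, together with a check that observed counts stay between the configured minimum and maximum. The function names and sample values are hypothetical, and the bounds check reflects an assumption about how the minimum and maximum are applied.

```python
from typing import Iterable


def expected_scaling_action(usage: float, threshold: float) -> str:
    """Return the direction that the threshold rule in Step 1 describes.

    Usage at or above the threshold triggers a scale-out;
    usage below the threshold triggers a scale-in.
    """
    if usage >= threshold:
        return "scale-out"
    return "scale-in"


def counts_within_bounds(counts: Iterable[int], minimum: int,
                         maximum: int) -> bool:
    """Check that every observed count lies in [minimum, maximum].

    Assumption: the policy never scales below the configured minimum
    or above the configured maximum number of provisioned instances.
    """
    return all(minimum <= count <= maximum for count in counts)


if __name__ == "__main__":
    # Hypothetical values read manually from the monitoring card.
    observed = [2, 4, 7, 10, 6]
    print(expected_scaling_action(usage=0.75, threshold=0.6))
    print(counts_within_bounds(observed, minimum=2, maximum=10))
```

The values would come from reading the Function Provisioned Instances (count) card manually. The sketch does not call any monitoring API.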

Modify or delete a provisioned instance policy

On the Function Details tab, click the Configuration tab, and then click the Provisioned Instances tab in the left-side navigation tree to view the policies that you created. In the Actions column, click Modify or Delete to modify or delete the corresponding policy.

References

  • For more information about the basic concepts and billing methods of the on-demand mode and provisioned mode, see Instance types and usage modes.

  • For more information about the limits, behaviors, and scaling rules of function instances in on-demand mode and provisioned mode, see Instance scaling limits and rules.

  • By default, all functions within an Alibaba Cloud account in the same region share the preceding scaling limits. To limit the number of instances for a function, you can configure the maximum number of concurrent instances. For more information, see Configure an upper limit for concurrent instances. After you configure the maximum number of concurrent instances, Function Compute returns a throttling error when the total number of running instances for the function exceeds the specified limit.