Prevent excessive invocations by specifying the maximum number of function instances - Function Compute

To prevent excessive costs from runaway function invocations, each Alibaba Cloud account is limited by default to 100 concurrent instances per region. Function Compute also provides function-level instance limits, which prevent a single function from consuming too many instances. This protects your backend resources and helps you avoid unexpected costs.

Scenarios

Protect the normal concurrency of a function.
For example, assume that function-a and function-b share the account-level instance limit, and that function-a is a critical business function that requires protection. If function-b is invoked excessively, it can consume instances that function-a needs to process its requests. In this scenario, you can set an instance limit for function-b so that it cannot consume too many instances, which leaves a sufficient number of instances for function-a. You can also set a reserved concurrency for function-a to guarantee that a specific number of instances are allocated to it.
Protect downstream services.
For example, a function needs to frequently access an RDS database that has limited processing capacity. To prevent the RDS database from being overwhelmed, you can set an instance limit for the function.
Block abnormal function invocations.
For example, if you detect abnormal invocations for a function, you can set its maximum number of instances to 0 to block all subsequent invocations.
Prevent excessive function invocations.
For example, user actions in a browser or client can be unpredictable. You can set a function-level instance limit to prevent uncontrolled invocations and avoid unexpected costs.

Limits

By default, a single Alibaba Cloud account is limited to 100 concurrent instances per region. The actual value is displayed in Quota Center. To increase this limit, you can submit a request in Quota Center.
The instance limit that you set for a single function cannot exceed 90% of the region-level limit. The sum of the instance limits for all functions in a region also cannot exceed 90% of the region-level limit.
For example, if the region-level limit is 300 instances, the maximum number of instances you can set for a single function is 270. If you set the instance limit for one function to 100, the sum of the instance limits for all other functions cannot exceed 170.
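The arithmetic behind the 90% rule can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical and not part of any Function Compute SDK, and the sketch assumes the cap is rounded down when 90% of the region-level limit is not a whole number (the source gives only the 300-instance example).

```python
# Hypothetical helper names for illustration; not a Function Compute API.
FUNCTION_QUOTA_RATIO_PERCENT = 90  # share of the region-level limit usable by function-level limits


def max_function_limit(region_limit: int) -> int:
    """Largest instance limit that can be set on one function (assumes rounding down)."""
    return region_limit * FUNCTION_QUOTA_RATIO_PERCENT // 100


def remaining_function_quota(region_limit: int, existing_limits: list[int]) -> int:
    """Quota still available for the instance limits of other functions in the region."""
    return max_function_limit(region_limit) - sum(existing_limits)


print(max_function_limit(300))               # 270
print(remaining_function_quota(300, [100]))  # 170
```

With a region-level limit of 300 and one function already capped at 100, the sketch reproduces the values in the example above: 270 for a single function and 170 left for all other functions combined.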

Set the maximum number of elastic instances for a function

Manage a single function

Log on to the Function Compute console. In the navigation pane on the left, choose Function Management > Functions.
In the top menu bar, select a region. On the Functions page, click the target function.
Click the Scaling configuration tab. In the Instance Limit section, click Modify.
In the Edit Quota panel, enter a value for Elastic instance quota, and then click OK.

Manage multiple functions

Log on to the Function Compute console. In the navigation pane on the left, choose Elastic strategy > Function Quota.
On the Function Quota tab, click Create Quotas. In the Create Quotas panel, select the target functions, enter a value for Elastic instance quota, and then click OK.

Resource quota locking for GPU functions

After you set the maximum number of elastic instances for a function, the system locks a portion of the total resource quota for that function. These locked resources can be used only by that function. For example, consider the GPU card resources for a GPU function. If you set the Elastic instance quota to 2, the following table describes the GPU card resource quota that is locked for the function in different scenarios:

Note

For more information about the total quotas for different series of GPU cards, see Limits.

Scenario Example	Quota Locking Logic	Actual Settings	Function Quota Locked
Single version, single card type	Elastic Instance Quota × Cards per instance	Single Tesla series card	2 Tesla cards locked
Multiple versions, same card type	For each version, calculate `Elastic Instance Quota × Cards for that version`. The final value is the maximum of the results.	Version 1: single Tesla series card; Version 2: 8 Tesla series cards	16 Tesla cards locked
Multiple versions, different card types	For each card type, independently calculate `Elastic Instance Quota × Maximum cards among all versions of that type`. The quotas for different card types are locked separately.	Version 1: single Tesla series card; Version 2: single Ada.1 series card; Version 3: 8 Ada.1 series cards	2 Tesla cards + 16 Ada.1 cards locked
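
The locking logic in the table can be expressed as a short calculation. The following Python sketch is an illustration under assumptions: the function name and the `(card_type, cards_per_instance)` input shape are hypothetical, not a Function Compute API, and the sketch models only the three scenarios listed above.

```python
from collections import defaultdict


def locked_gpu_quota(elastic_instance_quota: int,
                     versions: list[tuple[str, int]]) -> dict[str, int]:
    """Return the number of GPU cards locked per card type.

    versions holds one (card_type, cards_per_instance) pair per function version.
    """
    # For each card type, keep the largest cards-per-instance value among all versions.
    max_cards = defaultdict(int)
    for card_type, cards in versions:
        max_cards[card_type] = max(max_cards[card_type], cards)
    # Each card type is locked separately: quota x maximum cards for that type.
    return {card_type: elastic_instance_quota * cards
            for card_type, cards in max_cards.items()}


# The three scenarios from the table, with Elastic instance quota set to 2.
print(locked_gpu_quota(2, [("Tesla", 1)]))                              # {'Tesla': 2}
print(locked_gpu_quota(2, [("Tesla", 1), ("Tesla", 8)]))                # {'Tesla': 16}
print(locked_gpu_quota(2, [("Tesla", 1), ("Ada.1", 1), ("Ada.1", 8)]))  # {'Tesla': 2, 'Ada.1': 16}
```

Taking the maximum cards per instance first and then multiplying by the quota gives the same result as computing the product for each version and taking the maximum, because the quota is the same for every version.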