Use the delayed release feature for background asynchronous tasks, such as uploading logs or synchronizing data after a request is processed. This feature keeps an instance alive to complete these tasks, which prevents interruptions and data loss from premature instance termination.
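The kind of background task described above can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes the `handler(event, context)` entry-point convention for a Python event function, and `upload_logs`, `uploaded`, and `background_tasks` are hypothetical names that stand in for a real log service client.

```python
import threading
import time

uploaded = []          # hypothetical stand-in for a remote log store
background_tasks = []  # keeps references to the started worker threads


def upload_logs(batch):
    """Hypothetical background task; a real one would call a log service."""
    time.sleep(0.1)  # simulate network latency
    uploaded.extend(batch)


def handler(event, context):
    """Return a response immediately and finish the upload in the background."""
    logs = [f"processed {event!r}"]
    worker = threading.Thread(target=upload_logs, args=(logs,), daemon=True)
    worker.start()
    background_tasks.append(worker)
    # The response is returned before the upload has finished.
    return "ok"
```

In this sketch the upload is still running when the handler returns, which is the situation in which an instance that is released too early would lose data.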
After you enable this feature, Function Compute keeps an instance running for the delayed release duration that you set after the instance processes its last request. During this period, the system automatically transitions the instance between different states based on its resource utilization to balance performance and cost:
Remain active: The instance remains active if its vCPU utilization or GPU utilization (stream processors and decoders) is higher than the system threshold. This allows background tasks to continue processing.
Switch to idle: The instance automatically switches to an idle state to reduce costs when both its vCPU and GPU utilization are lower than the system threshold.
Quick wakeup: An idle instance quickly wakes up when it receives a new request. This provides a hot start in milliseconds and avoids cold start latency.
If no requests arrive during the delayed release duration, the instance is automatically destroyed, and billing stops. After the instance is destroyed, the next request triggers a cold start if the configured minimum number of instances is 0. If the configured minimum number of instances is greater than 0, cold starts can be eliminated.
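The state rules above can be summarized in a short sketch. The threshold value and all names here are assumptions made for illustration: this topic does not state the actual system threshold, and it does not say how utilization that is exactly equal to the threshold is treated.

```python
from enum import Enum


class InstanceState(Enum):
    ACTIVE = "active"
    IDLE = "idle"
    DESTROYED = "destroyed"


# Assumed value: the real system threshold is not stated in this topic.
ASSUMED_THRESHOLD = 0.05


def instance_state(vcpu_util, gpu_util, minutes_since_last_request,
                   delayed_release_minutes, threshold=ASSUMED_THRESHOLD):
    """Return the state of an elastic instance after its last request."""
    # No request arrived within the delayed release duration.
    if minutes_since_last_request >= delayed_release_minutes:
        return InstanceState.DESTROYED
    # Either resource above the threshold keeps the instance active.
    if vcpu_util > threshold or gpu_util > threshold:
        return InstanceState.ACTIVE
    # Both resources at or below the threshold (equality is an assumption).
    return InstanceState.IDLE
```

For example, with a 9-minute delayed release duration, `instance_state(0.40, 0.0, 2, 9)` evaluates to `InstanceState.ACTIVE`, and `instance_state(0.01, 0.0, 6, 9)` evaluates to `InstanceState.IDLE`.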
Comparison of delayed release solutions
Item | Delayed release | Session affinity | Delayed release + Session affinity |
Scenarios | Background services | Services with long-lifecycle sessions | Background services and services with persistent session connections |
Instance keepalive duration limit | 5 minutes ≤ Configured delayed release duration ≤ 60 minutes | The latest session expiration time on a single instance | The greater of the following two durations: the configured delayed release duration and the latest session expiration time on a single instance |
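As a rough illustration of the keepalive limits in the comparison above, the following sketch validates a delayed release duration and picks the effective keepalive duration. The function and parameter names are hypothetical, and the session expiration is simplified to a number of minutes after the last request.

```python
MIN_DELAYED_RELEASE = 5   # minutes, lower bound from the comparison table
MAX_DELAYED_RELEASE = 60  # minutes, upper bound from the comparison table


def keepalive_minutes(delayed_release=None, session_expiry=None):
    """Return the effective keepalive duration in minutes.

    delayed_release: configured delayed release duration, or None if disabled.
    session_expiry: minutes until the latest session on the instance expires,
                    or None if session affinity is not configured.
    """
    if delayed_release is not None and not (
        MIN_DELAYED_RELEASE <= delayed_release <= MAX_DELAYED_RELEASE
    ):
        raise ValueError("delayed release duration must be 5 to 60 minutes")
    candidates = [d for d in (delayed_release, session_expiry) if d is not None]
    if not candidates:
        raise ValueError("configure delayed release, session affinity, or both")
    # With both features configured, the greater duration applies.
    return max(candidates)
```

With a 9-minute delayed release duration and a session that expires after 15 minutes, `keepalive_minutes(9, 15)` returns 15.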
How to configure this feature in Function Compute?
Usage notes
The delayed release feature applies only to elastic instances that are dynamically created in response to requests.
This feature does not affect your configured minimum number of instances. Minimum instances have their own lifecycle management and billing rules and are not affected by the delayed release configuration.
Step 1: Configure delayed release for elastic instances
You can configure the delayed release feature when you create a function. To configure the feature for an existing function, follow these steps.
Log on to the Function Compute console. In the navigation pane on the left, choose .
In the top navigation bar, select a region. On the Functions page, click the name of the target function.
Click the tab, and then click Edit next to Advanced Configuration.
In the Advanced Configuration panel, expand the Delayed Release for Elastic Instances section, turn on the Delayed Release for Elastic Instances switch, set Delayed Release Time, and then click Deploy.
(Optional) Step 2: Configure session affinity
This topic uses the HeaderField affinity feature as an example. For more information, see Session affinity configuration procedures.
Step 3: Verify the result
On the details page of the target function, click the Code tab, and then click Test Function.
After the function is executed, click the Instances tab. Check the Lifecycle of the instance to verify that the delayed release feature is active.
Billing
After you enable the delayed release feature, billing applies to the instances that are dynamically created in response to requests. The billing rules described below apply only to these instances and are independent of the billing for your configured minimum number of instances.
Minimum number of instances: These instances persist regardless of whether there are requests. They are billed at the rate for idle elastic instances when there are no requests and at the rate for active elastic instances when processing requests.
Instances with delayed release: After requests are processed, these instances enter an active, idle, or destroyed state according to the rules described in this topic.
Scenario 1: Only the delayed release feature is configured
Example
The instance delayed release duration is set to 9 minutes.
Billing periods
As shown in the following figure, billing is divided into three periods based on request execution. At 00:14, the instance is destroyed because no new requests have arrived in the 9 minutes since request processing ended at 00:05.
00:00 to 00:05: While processing requests, the instance is billed at the rate for active elastic instances.
00:05 to 00:09: After the requests are processed, the system detects that the vCPU or GPU utilization is greater than the system threshold. The instance is billed at the rate for active elastic instances.
00:09 to 00:14: After the requests are processed, the system detects that both the vCPU and GPU utilization are less than the system threshold. The instance is billed at the rate for idle elastic instances.
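The three billing periods above can be totaled with a short calculation. The rates below are placeholders in arbitrary units, not actual Function Compute prices, and the `Period` type is a hypothetical helper introduced only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Period:
    start_min: int
    end_min: int
    state: str  # "active" or "idle"


# Placeholder per-minute rates in arbitrary units (not real pricing).
ASSUMED_RATES = {"active": 1.0, "idle": 0.25}

# The three periods from Scenario 1.
periods = [
    Period(0, 5, "active"),   # processing requests
    Period(5, 9, "active"),   # utilization above the system threshold
    Period(9, 14, "idle"),    # utilization below the system threshold
]


def billed_minutes(periods):
    """Sum the minutes billed at each rate."""
    totals = {"active": 0, "idle": 0}
    for p in periods:
        totals[p.state] += p.end_min - p.start_min
    return totals


def total_cost(periods, rates=ASSUMED_RATES):
    """Multiply the minutes in each state by the matching placeholder rate."""
    minutes = billed_minutes(periods)
    return sum(minutes[state] * rates[state] for state in minutes)
```

For this scenario, `billed_minutes(periods)` gives 9 minutes at the active rate and 5 minutes at the idle rate.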
Scenario 2: Both delayed release and session affinity features are configured
Example
The instance delayed release duration is set to 9 minutes.
HeaderField affinity is enabled and Session Idle Duration is set to 15 minutes.
This topic uses HeaderField affinity as an example. For more information about the types, features, and billing of session affinity, see Configure session affinity.
Billing periods
The system uses a 15-minute period, which is the greater of the instance delayed release duration (9 minutes) and the Session Idle Duration (15 minutes) of the HeaderField affinity feature. As in Scenario 1, billing is divided into three periods based on request execution. After 20 minutes, the instance is destroyed if no new requests are received.