
Best practice for reducing the cold start latency

Last Updated: Mar 06, 2020

Reduce the cold start latency

Function Compute supports two types of instances: pay-as-you-go instances and reserved instances. Pay-as-you-go instances are automatically allocated and released by Function Compute, and are billed based on the time they spend processing requests. They save you the trouble of managing and allocating resources, but they cannot avoid cold starts. A cold start adds latency to function invocations, which degrades the performance of your applications.

Cold start procedure of pay-as-you-go instances

As shown in this figure, a cold start consists of multiple steps. Code download, container startup, runtime initialization, and user code initialization can each be time-consuming. After a cold start completes, the instance is ready to process requests. Optimizing cold start latency is a shared responsibility between the platform and developers. Function Compute has made many platform-side optimizations to reduce the cold start latency. On the user side, we recommend that developers make the following optimizations:

  1. Reduce the size of code packages: Remove extraneous dependencies to shorten the time it takes to download and extract the code package. For example, run the npm prune command for Node.js functions, or run the autoflake and autoremove commands for Python functions. Some third-party libraries also contain test case source code, unneeded binary files, or other data. Deleting these files further speeds up downloading and extraction.
  2. Select an appropriate programming language: In most cases, the Java runtime takes longer to launch on a cold start than the runtimes of other programming languages. In a scenario that requires low response latency and where warm-start latency is similar across languages, you can choose a lightweight programming language such as Python. This can significantly reduce the long-tail latency, that is, the latency of the slowest requests compared with the average latency.
  3. Allocate more memory resources: Allocating more memory to a function can speed up its cold start.
  4. Reduce the number of cold starts:
    • Use function triggers to invoke the function at a specified time so that compute resources are ready for incoming requests.
    • Use initializers to define the logic for user code initialization. Function Compute runs this logic when it launches an instance, before the instance serves requests. This makes initializers well suited to system upgrades and function updates, where they keep the cold start transparent to users.
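The initializer and scheduled warm-up recommendations above can be sketched for a Python function as follows. This is a minimal illustration, not a reference implementation: the `_load_model` helper, the `"warmup"` payload value, and the shape of the warm-up event are assumptions made for this example. Check the entry-point signatures against the Function Compute runtime documentation before relying on them.

```python
import json
import time

# Module-level state survives across invocations on the same instance.
_model = None


def _load_model():
    """Hypothetical stand-in for expensive setup (loading a model, opening connections)."""
    time.sleep(0.05)  # simulate slow initialization
    return {"ready": True}


def initializer(context):
    """Intended to run once per instance, before the first request is served."""
    global _model
    _model = _load_model()


def handler(event, context):
    """Per-request entry point; relies on state prepared by the initializer."""
    payload = json.loads(event) if event else {}
    # Assumed convention: scheduled warm-up calls carry {"payload": "warmup"}.
    if payload.get("payload") == "warmup":
        return "warmed"
    if _model is None:
        # Fallback in case no initializer was configured for the function.
        initializer(context)
    return json.dumps({"model_ready": _model["ready"], "input": payload})
```

Keeping the expensive work out of `handler` means that a request arriving on an already initialized instance pays only for its own processing, and a scheduled warm-up call returns immediately after the instance is ready.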

Use reserved instances to optimize function invocation

In many cases, you cannot eliminate cold start latency on the user side. For example, you may need to load gigabytes of model files for deep learning inference. Your functions may also need to interact with legacy software through a client whose initialization is time-consuming. To reduce the function invocation latency in these scenarios, you can use reserved instances, either alone or together with pay-as-you-go instances. Reserved instances are allocated and released by users, and are billed based on the uptime of the instances. They are ready-to-use compute resources. When the workload requires more resources than the reserved instances can provide, Function Compute automatically adds pay-as-you-go instances. In this way, you reduce the cold start latency that you would incur by using only pay-as-you-go instances, while balancing performance and resource utilization. The following figure shows a scenario where both reserved and pay-as-you-go instances are used to reduce the cold start latency while still maintaining high resource utilization.

By default, reserved instances are prioritized over pay-as-you-go instances when both types of instances are used for function invocation. For example, suppose that the load requires 10 capacity units (10 instances) per second. Function Compute first uses reserved instances to handle the load. When more instances are required, it automatically adds pay-as-you-go instances.

The maximum load of an instance is determined by the concurrent request limit configured for the instance. Function Compute tracks the number of requests being processed on each instance. When the number of requests being concurrently processed on an instance reaches the limit, the system sends subsequent requests to another instance. When all instances reach the limit, Function Compute starts adding pay-as-you-go instances.

Reserved instances are managed by users and are charged continuously, even when they are idle. Pay-as-you-go instances are managed by Function Compute, released when they are idle, and charged based on the time they spend processing requests. For more information about pricing, see Pricing. To cap the cost of compute resources, you can specify the maximum number of pay-as-you-go instances that can be created.
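The scheduling behavior described above can be illustrated with a small simulation. This is only a sketch of the described policy, not Function Compute's actual scheduler: the `Pool` and `Instance` classes and the parameter names `concurrency_limit` and `max_on_demand` are hypothetical, and request completion and the release of idle instances are not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Instance:
    kind: str           # "reserved" or "pay-as-you-go"
    in_flight: int = 0  # requests currently being processed


@dataclass
class Pool:
    reserved: int           # number of reserved instances
    concurrency_limit: int  # max concurrent requests per instance
    max_on_demand: int      # cap on pay-as-you-go instances
    instances: List[Instance] = field(init=False, default_factory=list)

    def __post_init__(self):
        self.instances = [Instance("reserved") for _ in range(self.reserved)]

    def route(self) -> Optional[Instance]:
        """Assign one request; return the chosen instance, or None if throttled."""
        # Reserved instances sit first in the list, so they are preferred.
        for inst in self.instances:
            if inst.in_flight < self.concurrency_limit:
                inst.in_flight += 1
                return inst
        # Every existing instance is at its limit: try to add a pay-as-you-go one.
        on_demand = sum(1 for i in self.instances if i.kind == "pay-as-you-go")
        if on_demand >= self.max_on_demand:
            return None
        inst = Instance("pay-as-you-go", in_flight=1)
        self.instances.append(inst)
        return inst
```

For example, with `Pool(reserved=2, concurrency_limit=1, max_on_demand=1)`, the first two concurrent requests land on reserved instances, the third triggers a pay-as-you-go instance, and the fourth is rejected because the pay-as-you-go cap has been reached.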

You can use the Active Instance Count indicator to monitor the amount of resources used for function invocation. This indicator represents the total number of reserved and pay-as-you-go instances that are processing requests. For more information, see Indicators. As shown in the following figure, by comparing the number of active instances with the number of reserved instances, you can determine whether the reserved instances are sufficient to handle the load.

Active and reserved instance comparison
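The comparison can also be done programmatically once you have a series of Active Instance Count samples. The sketch below assumes the samples have already been exported as a list of integers. The function name and the returned fields are hypothetical and are not part of any Function Compute API.

```python
from typing import Dict, Sequence, Union


def reserved_shortfall(active_counts: Sequence[int], reserved: int) -> Dict[str, Union[int, float]]:
    """Summarize how often, and by how much, active instances exceed reserved ones."""
    # Each entry is the number of pay-as-you-go instances needed at that sample.
    over = [count - reserved for count in active_counts if count > reserved]
    total = len(active_counts)
    return {
        "samples_over_reserved": len(over),
        "share_over_reserved": len(over) / total if total else 0.0,
        "peak_shortfall": max(over, default=0),
    }
```

A consistently high share of samples above the reserved count suggests that the reserved instances are insufficient for the load, while a share near zero with a low peak suggests that the reservation may be larger than needed.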

Summary

Function Compute provides a wide array of approaches and indicators to help you reduce the cold start latency and improve the stability and overall performance of your applications.