To meet user needs in different scenarios, Function Compute provides four types of functions: event functions, Web functions, task functions, and GPU functions. To accommodate different development workflows, Function Compute offers three types of runtime: built-in runtimes, custom runtimes, and custom images. To account for different resource utilization rates and billing preferences, Function Compute provides two instance types: elastic instances and provisioned instances. This topic describes the features of Function Compute and their applicable scenarios to help you select the appropriate technology.
Selection overview
When you use Function Compute, you can select the appropriate function type and runtime based on your business scenario and technology stack, and select an instance type to optimize performance and cost.
Web applications and API services: Use Web functions with a custom runtime. This function type supports many popular web application frameworks and can be accessed from a browser or invoked by a URL.
File processing and data stream processing: Use event functions with a built-in runtime. You can configure event triggers to integrate with various Alibaba Cloud services, such as Object Storage Service (OSS), ApsaraMQ for RocketMQ, and Simple Log Service (SLS).
AI inference scenarios such as chatbots and text-to-image: Use GPU functions with a custom image. You can quickly build AI model inference services based on container images from popular AI projects, such as ComfyUI, RAG, and TensorRT.
Asynchronous tasks: Use task functions with a built-in runtime for scenarios such as scheduled tasks and video transcoding.
Both built-in runtimes and custom runtimes are deployed to functions as code packages and are suitable for lightweight applications.
For containerized deployments, use a custom image. GPU functions support only custom images.
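For event functions on a built-in runtime, you write a handler against the interface Function Compute defines for that language. The sketch below shows a minimal Python handler for an OSS-style trigger; the event field layout shown is illustrative and simplified, not the full trigger schema.

```python
import json

def handler(event, context):
    # Built-in runtimes pass the trigger payload as the first argument;
    # for OSS triggers it is a JSON document describing the object events.
    payload = json.loads(event)
    for record in payload.get("events", []):
        # These field names are illustrative; check the trigger's actual
        # event format in your own function logs.
        bucket = record["oss"]["bucket"]["name"]
        key = record["oss"]["object"]["key"]
        print(f"object event: {bucket}/{key}")
    return "done"
```

The return value is passed back to the caller for synchronous invocations; for event-driven invocations it is typically ignored.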
Function type selection
| Comparison item | Event function | Web function | Task function | GPU function |
| --- | --- | --- | --- | --- |
| Features | Processes files and data streams. Can be triggered by events from various cloud services, such as OSS triggers, Kafka triggers, and SLS triggers. | Supports popular web application frameworks. Can be accessed from a browser or invoked by a URL. | Processes asynchronous requests. Can track and save the state of each stage of an asynchronous invocation. | Supports container images from popular AI projects, such as Stable Diffusion WebUI, ComfyUI, RAG, and TensorRT, to quickly build AI model inference services. |
| Scenarios | File processing and data stream processing | Web applications and API services | Asynchronous tasks, such as scheduled tasks and video transcoding | AI inference, such as chatbots and text-to-image |
| Runtime | Recommended: built-in runtime | Recommended: custom runtime | Recommended: built-in runtime | Supports only custom images |
| Asynchronous invocation | Disabled by default | Disabled by default | Enabled by default | Disabled by default |
Runtime selection
| Comparison item | Built-in runtime | Custom runtime | Custom image |
| --- | --- | --- | --- |
| Development workflow | Write a handler based on the interfaces defined by Function Compute. | Develop a web application based on a framework template and view the result immediately at a public endpoint. | Upload a custom image to Container Registry (ACR) and then use it, or use an existing image in ACR. |
| Supported instance types | CPU instances | CPU instances | CPU instances and GPU-accelerated instances |
| | Not supported | Supported | Supported |
| Cold start speed | Fastest. The code package does not include the runtime, resulting in the fastest cold start. | Fast. The code package is an HTTP server. It is larger but does not require pulling an image, resulting in a fast cold start. | Slow. Pulling an image is required, which results in a slow cold start. |
| Code package format | ZIP, JAR (Java), or folder | ZIP, JAR (Java), or folder | Container image |
| Code package size | The maximum size is 500 MB in some regions (such as Hangzhou) and 100 MB in other regions. Note: You can configure layers to add dependencies and reduce the code package size. | Same as the built-in runtime. | Note: For AI inference applications, you can store large models in NAS or OSS to reduce the image size. |
| Supported programming languages | Node.js, Python, PHP, Java, C#, Go | No limit | No limit |
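Because a custom runtime is simply an HTTP server that Function Compute forwards requests to, a minimal one needs no framework at all. The sketch below uses Python's standard library; port 9000 is assumed here as the listening port, so match it to the port configured for your function.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumed listening port; configure your function to route to the same port.
PORT = 9000

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Any framework (Flask, Express, Spring Boot, ...) can stand in here;
        # Function Compute only requires that the process answer HTTP requests.
        body = b"hello from a custom runtime"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet

def serve(port: int = PORT) -> None:
    # Function Compute keeps this process alive and routes requests to it.
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

This also explains the cold start ordering in the table above: the package ships its own server, so startup is the time to launch the process, with no image pull involved.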
Instance type selection
CPU functions support only elastic instances. For GPU functions, you can choose between elastic instances and provisioned instances based on your resource utilization, latency requirements, and need for cost stability. For a detailed selection guide, see the following flowchart.
You can bind provisioned instances only to GPU functions that belong to the Ada, Ada.2, Ada.3, Hopper, or Xpu.1 series.
Elastic instances
If you set the minimum number of instances for a function to 0, instances automatically scale based on the request volume and are released when there are no requests. This means you are billed based on usage and pay nothing when the function is not in use, which maximizes cost savings. The more frequent the business requests, the higher the resource utilization and the greater the cost savings compared to using elastic virtual machines.
Are there cold starts?
Yes. For latency-sensitive businesses, you can set the minimum number of instances to 1 or more to mitigate cold starts. This method pre-allocates elastic resources. When a request arrives, the instance is quickly activated to execute the request.
Billing (Pay-as-you-go)
The usage cost of a function is the sum of the fees for active elastic instances and shallow hibernation (formerly idle) elastic instances. If you set the minimum number of instances to 1 or more, you can enable the shallow hibernation mode. In the shallow hibernation state, vCPU usage is free, and GPU usage is billed at only 20% of the regular rate. This cost is much lower than that of active elastic instances.
For more information about the scenarios for active and shallow hibernation elastic instances, see Elastic instances.
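The shallow hibernation pricing rule above can be made concrete with a small cost model. The unit prices below are placeholders for illustration only, not real Function Compute rates; the model assumes one GPU elastic instance kept warm for an hour with the minimum-instances setting.

```python
# Hypothetical per-second unit prices (placeholders, not real rates).
GPU_PRICE_PER_SEC = 0.0008
CPU_PRICE_PER_SEC = 0.0001

def hourly_cost(active_seconds: float) -> float:
    """Cost of one instance over an hour, split between active time and
    shallow hibernation time."""
    idle_seconds = 3600 - active_seconds
    active = active_seconds * (GPU_PRICE_PER_SEC + CPU_PRICE_PER_SEC)
    # In shallow hibernation, vCPU usage is free and GPU usage is billed
    # at 20% of the regular rate.
    hibernating = idle_seconds * GPU_PRICE_PER_SEC * 0.2
    return active + hibernating
```

With these placeholder rates, an instance that hibernates the full hour costs 20% of the GPU-only rate, while a fully active instance pays both GPU and vCPU at full price, which is why low-traffic, latency-sensitive GPU workloads benefit most from this mode.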
Provisioned instances
This instance type applies only to GPU functions. You can purchase a provisioned resource pool in advance and then allocate a specific number and type of provisioned instances to a function from the resource pool. This method provides predictable and fixed usage costs and is suitable for scenarios with high resource utilization, strict latency requirements, or a need for stable costs.
Are there cold starts?
No. When you use provisioned instances, the maximum number of requests that a function can process simultaneously is determined by the following formula: Maximum concurrent requests = Number of allocated provisioned instances × Instance concurrency. Requests that exceed this limit are throttled. Requests within the limit receive a real-time response, which completely eliminates cold starts.
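The throttling rule above is a simple product. A short sketch, using hypothetical helper names, shows how to size a provisioned pool against an expected request load:

```python
def max_concurrency(provisioned_instances: int, per_instance_concurrency: int) -> int:
    # Maximum concurrent requests = provisioned instances x instance concurrency.
    return provisioned_instances * per_instance_concurrency

def is_throttled(in_flight_requests: int, provisioned_instances: int,
                 per_instance_concurrency: int) -> bool:
    # Requests beyond the limit are throttled; requests within it are
    # served immediately with no cold start.
    return in_flight_requests > max_concurrency(provisioned_instances,
                                                per_instance_concurrency)
```

For example, 4 provisioned instances with an instance concurrency of 10 serve up to 40 simultaneous requests; the 41st concurrent request is throttled.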
Billing (Subscription)
The function cost is the total subscription fee for all purchased provisioned resource pools.