Function Compute offers four function types, three runtime environments, and multiple instance types. This topic covers the key differences between each option to help you choose the right combination for your workload.
Quick selection guide
| Workload | Function type | Runtime |
|---|---|---|
| Web apps and REST APIs | Web Function | Custom Runtime |
| Event-driven file and stream processing | Event Function | Built-in Runtime |
| AI inference (computer vision, AIGC) | GPU Function | Custom Container |
| Long-running and scheduled tasks | Task Function | Built-in Runtime |
Built-in Runtime and Custom Runtime are both deployed as code packages and are best suited for lightweight applications. GPU Functions support only Custom Container.
Function type selection
| | Event Function | Web Function | Task Function | GPU Function |
|---|---|---|---|---|
| Description | Processes files and data streams triggered by cloud service events, such as OSS triggers, Kafka triggers, and SLS triggers. | Supports popular web frameworks. Accessible from a browser or directly via URL. | Processes asynchronous requests. Tracks and saves the state of each stage of an asynchronous invocation. | Runs container images from popular AI projects such as Stable Diffusion WebUI, ComfyUI, RAG, and TensorRT. |
| Use cases | Cloud service integration: Real-time file processing with OSS, log processing with Simple Log Service (SLS). ETL data processing: Database cleaning, message queue processing. | Popular web frameworks: SpringBoot, Express, Flask, and more. Migrate existing apps: HTML5 websites, REST APIs, Backend for Frontend (BFF), mobile apps, mini programs, game settlements, and more. | General-purpose tasks: Scheduled, periodic, and scripted tasks. Multimedia processing: Video transcoding, live recording, image processing. | Traditional inference: Computer vision (CV) and natural language processing (NLP). AIGC model inference: Text-to-text, text-to-image, and text-to-audio generation. |
| Recommended runtime | Built-in Runtime | Custom Runtime | Built-in Runtime | Custom Container only |
| Asynchronous Task | Disabled by default | Disabled by default | Enabled by default | Disabled by default |
| Best suited for | Integrating with Alibaba Cloud services via event triggers | Building and migrating web applications and APIs | Long-running jobs that need state tracking | AI/ML inference workloads requiring GPU acceleration |
Runtime environment selection
| | Built-in Runtime | Custom Runtime | Custom Container |
|---|---|---|---|
| Development workflow | Write a request handler using the interfaces that Function Compute defines. | Develop with a web framework template and view the result at a public endpoint instantly. | Upload a custom image to Alibaba Container Registry (ACR) and deploy, or use an existing image in ACR. |
| Supported instance types | CPU instances | CPU instances | CPU instances and GPU instances |
| Single-instance concurrency | Not supported | Supported | Supported |
| Cold start | Fastest — the code package excludes the runtime. | Fast — the code package (an HTTP server) is larger, but requires no image pull. | Slower — requires pulling an image on cold start. |
| Code package format | ZIP, JAR (Java), or a folder | — | Container Image |
| Code package size limit | 500 MB in select regions (such as Hangzhou); 100 MB in other regions. Use Layers to add dependencies and reduce package size. | — | CPU instance image: 10 GB uncompressed. GPU instance image: 15 GB uncompressed. For AI inference, store large models in NAS or OSS to reduce image size. |
| Supported languages | Node.js, Python, PHP, Java, C#, Go | No restrictions | No restrictions |
| Best suited for | Lightweight functions using supported languages with the fastest cold starts | Web apps and APIs using any framework | GPU workloads and containerized deployments |
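With a Built-in Runtime, you implement a request handler with the signature that Function Compute defines; for the Python runtime that is `handler(event, context)`, where `event` carries the raw trigger payload. A minimal sketch for an OSS-triggered function follows; the exact event layout shown is an assumption for illustration, so check the trigger documentation for the authoritative schema:

```python
import json

def handler(event, context):
    """Entry point invoked by Function Compute's built-in Python runtime.

    `event` is the raw trigger payload (bytes). For an OSS trigger it is a
    JSON document describing the object that changed; the field names used
    below are illustrative, not authoritative.
    """
    evt = json.loads(event)
    # Pull the bucket name and object key out of the (assumed) event shape.
    record = evt["events"][0]
    bucket = record["oss"]["bucket"]["name"]
    key = record["oss"]["object"]["key"]
    # Real code would fetch and process the object here.
    return json.dumps({"processed": f"{bucket}/{key}"})
```

Because the code package excludes the runtime itself, this style of function gets the fastest cold starts of the three runtime environments.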
Instance type selection
CPU functions support only Elastic Instances. GPU Functions support three instance types, which you can switch between at any time without service interruption.
Decision guide
Use the following questions to find the right instance type:
- Is your workload latency-sensitive and interactive (for example, a real-time chatbot or an image generation API)? If yes, use Provisioned Instances to eliminate cold starts and guarantee response times.
- Does your traffic follow a predictable baseline with occasional spikes? If yes, use Mixed Mode (Provisioned + Elastic Instances) to maintain stable baseline capacity while absorbing traffic bursts.
- Is your traffic variable, bursty, or low-frequency? If yes, use Elastic Instances and pay only for active usage.
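The decision questions above can be sketched as a small lookup; the boolean parameter names are mine, not Function Compute terminology:

```python
def pick_instance_type(latency_sensitive: bool,
                       predictable_baseline: bool) -> str:
    """Map the two decision-guide questions to an instance type.

    The inputs mirror the questions above; the returned labels match the
    instance types described in this topic.
    """
    if latency_sensitive:
        # Interactive workloads cannot tolerate cold starts.
        return "Provisioned Instance"
    if predictable_baseline:
        # Steady baseline traffic plus occasional spikes.
        return "Provisioned + Elastic (Mixed Mode)"
    # Variable, bursty, or low-frequency traffic.
    return "Elastic Instance"
```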
Instance type comparison
| | Elastic Instance | Provisioned Instance | Provisioned + Elastic (Mixed Mode) |
|---|---|---|---|
| Applies to | CPU functions (only option); GPU Functions | GPU Functions only | GPU Functions only |
| Cold start | Yes, if minimum instances = 0. Set minimum instances to 1 or more to pre-allocate resources and reduce cold starts. | None. All requests within allocated capacity get a real-time response. | Partial. Requests within the provisioned pool have no cold start; elastic scale-out instances do. |
| Billing model | Pay-as-you-go | Subscription | Subscription (provisioned portion) + pay-as-you-go (elastic portion) |
| Best suited for | Variable or low-frequency traffic; cost-sensitive workloads | Latency-sensitive or stable traffic workloads | Workloads with a predictable baseline and unpredictable traffic bursts |
Elastic Instance
Elastic Instances scale automatically with request volume and are released when idle. Setting the minimum number of instances to 0 gives you a pure pay-as-you-go model — you pay only for active usage.
Cold start behavior: Cold starts occur when instances scale from zero. To reduce cold start latency, set the minimum number of instances to 1 or more. This pre-allocates elastic resources so instances are ready to handle incoming requests quickly.
Billing: Costs include charges for instances in both the active and Shallow Hibernation states. In Shallow Hibernation, vCPU resources are not charged and GPU resources are billed at one-fifth of the active rate. If you set the minimum number of instances to 1 or more, enable Shallow Hibernation to reduce idle costs.
Use Elastic Instances when:
- Your traffic is variable, bursty, or low-frequency
- You want to pay only for actual usage
- Your workload can tolerate occasional cold start latency, or you mitigate it with a minimum instance count
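As a rough illustration of the Shallow Hibernation rule above (idle vCPU time is free, idle GPU time is billed at one-fifth of the active rate), here is a cost sketch; the per-second rate is a placeholder, not a real price:

```python
def gpu_cost(active_seconds: float, hibernated_seconds: float,
             active_rate_per_second: float) -> float:
    """Estimate GPU charges for an elastic instance with Shallow Hibernation.

    Per the billing rule in this topic, hibernated GPU time is billed at
    one-fifth of the active rate. The rate itself is a placeholder; see the
    official pricing page for real numbers.
    """
    return (active_seconds * active_rate_per_second
            + hibernated_seconds * active_rate_per_second / 5)

# Example: 600 s active plus 3000 s hibernated at a placeholder rate of
# 0.01/s costs 600 * 0.01 + 3000 * 0.01 / 5 = 6.0 + 6.0 = 12.0,
# versus 36.0 if the hibernated time were billed at the full active rate.
```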
Provisioned Instance
Provisioned Instances apply only to GPU Functions. Purchase a Provisioned Resource Pool in advance, then allocate a specific number and type of instances to your function. This eliminates cold starts within your allocated capacity and gives you predictable, fixed costs.
After you purchase a monthly Provisioned Resource Pool, the platform allocates a quota of boost instances in addition to your subscription-based Provisioned Instances. This boost instance quota is not billed.
Cold start behavior: None. All requests within your allocated capacity receive a real-time response. Maximum concurrent requests = (Number of allocated Provisioned Instances) × (Instance concurrency) + the boost instance quota. Requests that exceed this limit are throttled.
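Applying the throttling formula above to a hypothetical pool:

```python
def max_concurrent_requests(provisioned_instances: int,
                            instance_concurrency: int,
                            boost_quota: int) -> int:
    """Throttling threshold for a provisioned GPU function, per the formula
    in this topic: instances x per-instance concurrency + boost quota."""
    return provisioned_instances * instance_concurrency + boost_quota

# A hypothetical pool of 4 Provisioned Instances with an instance
# concurrency of 2 and a boost quota of 2 can serve up to
# 4 * 2 + 2 = 10 concurrent requests; the 11th is throttled.
```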
Billing: The total subscription fee for all purchased Provisioned Resource Pools. The boost instance quota is not billed.
Provisioned Instances are available only for GPU Functions in the Ada, Ada.2, Ada.3, Hopper, or Xpu.1 series.
Use Provisioned Instances when:
- Your workload is latency-sensitive and interactive (for example, a real-time chatbot or image generation API)
- Your traffic is steady and predictable
- You need guaranteed capacity and consistent response times
Provisioned + Elastic Instances (Mixed Mode)
Mixed Mode applies only to GPU Functions. It combines Provisioned and Elastic Instances: the provisioned pool handles steady-state traffic first, and elastic instances automatically scale out when requests exceed the provisioned capacity. This gives you a guaranteed baseline with the flexibility to absorb sudden traffic bursts.
Cold start behavior: Partial. Requests handled within the provisioned pool have no cold start. Requests that trigger auto-scaling to new elastic instances experience a cold start.
Billing: The provisioned portion is billed against your purchased Provisioned Resource Pool quota. Elastic instances launched beyond the provisioned quota are billed on a pay-as-you-go basis, at the same rates as active and Shallow Hibernation elastic instances.
Use Mixed Mode when:
- Your traffic has a predictable baseline but occasional spikes
- You want stable performance for normal load with the ability to handle burst traffic
- You need a balance between cost predictability and scaling flexibility