
Function Compute: Technology selection

Last Updated: Apr 01, 2026

Function Compute offers four function types, three runtime environments, and multiple instance types. This topic describes the key differences among these options to help you choose the right combination for your workload.

Quick selection guide

| Workload | Function type | Runtime |
| --- | --- | --- |
| Web apps and REST APIs | Web Function | Custom Runtime |
| Event-driven file and stream processing | Event Function | Built-in Runtime |
| AI inference (computer vision, AIGC) | GPU Function | Custom Container |
| Long-running and scheduled tasks | Task Function | Built-in Runtime |
Built-in Runtime and Custom Runtime are both deployed as code packages and are best suited for lightweight applications. GPU Functions support only Custom Container.

Function type selection

| | Event Function | Web Function | Task Function | GPU Function |
| --- | --- | --- | --- | --- |
| Description | Processes files and data streams triggered by cloud service events, such as OSS triggers, Kafka triggers, and SLS triggers. | Supports popular web frameworks. Accessible from a browser or directly via URL. | Processes asynchronous requests. Tracks and saves the state of each stage of an asynchronous invocation. | Runs container images from popular AI projects such as Stable Diffusion WebUI, ComfyUI, RAG, and TensorRT. |
| Use cases | Cloud service integration: real-time file processing with OSS, log processing with Simple Log Service (SLS). ETL data processing: database cleaning, message queue processing. | Popular web frameworks: Spring Boot, Express, Flask, and more. Migrating existing apps: HTML5 websites, REST APIs, Backend for Frontend (BFF), mobile apps, mini programs, game settlements, and more. | General-purpose tasks: scheduled, periodic, and scripted tasks. Multimedia processing: video transcoding, live recording, image processing. | Traditional inference: computer vision (CV) and natural language processing (NLP). AIGC model inference: text-to-text, text-to-image, and text-to-audio generation. |
| Recommended runtime | Built-in Runtime | Custom Runtime | Built-in Runtime | Custom Container only |
| Asynchronous Task | Disabled by default | Disabled by default | Enabled by default | Disabled by default |
| Best suited for | Integrating with Alibaba Cloud services via event triggers | Building and migrating web applications and APIs | Long-running jobs that need state tracking | AI/ML inference workloads requiring GPU acceleration |
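
To make the Event Function row concrete, here is a minimal handler sketch for the Built-in Python runtime. It assumes the built-in runtime invokes `handler(event, context)` with the raw trigger payload as bytes, and that an OSS trigger delivers a JSON document with `events[*].oss.bucket.name` and `events[*].oss.object.key` fields; verify both assumptions against the trigger documentation for your runtime version.

```python
import json

# Hypothetical Event Function handler: extract the bucket and object key
# from each record of an OSS trigger payload. The field layout below is an
# assumption based on the typical OSS event format.
def handler(event, context):
    payload = json.loads(event)          # OSS triggers deliver a JSON document
    results = []
    for record in payload.get("events", []):
        oss = record.get("oss", {})
        results.append({
            "bucket": oss.get("bucket", {}).get("name"),
            "key": oss.get("object", {}).get("key"),
        })
    return results                       # hand the objects to your processing step
```

In a real function, each extracted object would then be fetched and processed with the OSS SDK inside the loop.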

Runtime environment selection

| | Built-in Runtime | Custom Runtime | Custom Container |
| --- | --- | --- | --- |
| Development workflow | Write a request handler using the interfaces that Function Compute defines. | Develop with a web framework template and view the result at a public endpoint instantly. | Upload a custom image to Alibaba Cloud Container Registry (ACR) and deploy, or use an existing image in ACR. |
| Supported instance types | CPU instances | CPU instances | CPU instances and GPU instances |
| Single-instance concurrency | Not supported | Supported | Supported |
| Cold start | Fastest: the code package excludes the runtime. | Fast: the code package (an HTTP server) is larger, but requires no image pull. | Slower: requires pulling an image on cold start. |
| Code package format | ZIP, JAR (Java), or a folder | ZIP, JAR (Java), or a folder | Container image |
| Code package size limit | 500 MB in select regions (such as Hangzhou); 100 MB in other regions. Use Layers to add dependencies and reduce package size. | 500 MB in select regions (such as Hangzhou); 100 MB in other regions. Use Layers to add dependencies and reduce package size. | CPU instance image: 10 GB uncompressed. GPU instance image: 15 GB uncompressed. For AI inference, store large models in NAS or OSS to reduce image size. |
| Supported languages | Node.js, Python, PHP, Java, C#, Go | No restrictions | No restrictions |
| Best suited for | Lightweight functions using supported languages with the fastest cold starts | Web apps and APIs using any framework | GPU workloads and containerized deployments |
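
The "code package (an HTTP server)" row is the essence of a Custom Runtime: your deployment is an HTTP server that Function Compute starts and forwards invocations to. The sketch below uses only the Python standard library; the listen port of 9000 is an assumption (the documented default at the time of writing), and any framework that can bind an HTTP port (Flask, Express, Spring Boot) fills the same role.

```python
# Minimal Custom Runtime sketch: an HTTP server that answers every invocation.
# Assumption: Function Compute routes requests to the port your server binds
# (default 9000); check the Custom Runtime contract for the exact port.
from http.server import BaseHTTPRequestHandler, HTTPServer

RESPONSE_BODY = b"Hello from Custom Runtime"

class InvocationHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(RESPONSE_BODY)))
        self.end_headers()
        self.wfile.write(RESPONSE_BODY)

    def log_message(self, fmt, *args):
        pass  # keep instance logs quiet in this sketch

def serve(port: int = 9000) -> None:
    # Call this as the container's entrypoint / startup command.
    HTTPServer(("0.0.0.0", port), InvocationHandler).serve_forever()
```

Because the runtime only sees an HTTP endpoint, migrating an existing web app usually means pointing its startup command at the expected port, with no handler-interface changes.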

Instance type selection

CPU functions support only Elastic Instances. GPU Functions support three instance types, which you can switch between at any time without service interruption.

Decision guide

Use the following questions to find the right instance type:

  • Is your workload latency-sensitive and interactive? For example, a real-time chatbot or image generation API. If yes, use Provisioned Instances to eliminate cold starts and guarantee response times.

  • Does your traffic follow a predictable baseline with occasional spikes? If yes, use Mixed Mode (Provisioned + Elastic Instances) to maintain stable baseline capacity while absorbing traffic bursts.

  • Is your traffic variable, bursty, or low-frequency? If yes, use Elastic Instances and pay only for active usage.
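
The three questions above can be folded into a single decision helper. This function is purely illustrative (not a Function Compute API); the returned labels mirror the instance types described in this topic.

```python
# Hypothetical helper encoding the decision guide above.
def choose_instance_type(latency_sensitive: bool,
                         predictable_baseline: bool) -> str:
    if latency_sensitive:
        # Interactive workloads: eliminate cold starts entirely.
        return "Provisioned Instances"
    if predictable_baseline:
        # Steady baseline plus spikes: provisioned pool + elastic overflow.
        return "Mixed Mode (Provisioned + Elastic)"
    # Variable, bursty, or low-frequency traffic: pay only for active usage.
    return "Elastic Instances"
```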

Instance type comparison

| | Elastic Instance | Provisioned Instance | Provisioned + Elastic (Mixed Mode) |
| --- | --- | --- | --- |
| Applies to | CPU functions (only option) and GPU Functions | GPU Functions only | GPU Functions only |
| Cold start | Yes, if minimum instances = 0. Set minimum instances to 1 or more to pre-allocate resources and reduce cold starts. | None. All requests within allocated capacity get a real-time response. | Partial. Requests within the provisioned pool have no cold start; elastic scale-out instances do. |
| Billing model | Pay-as-you-go | Subscription | Subscription (provisioned portion) + pay-as-you-go (elastic portion) |
| Best suited for | Variable or low-frequency traffic; cost-sensitive workloads | Latency-sensitive or stable-traffic workloads | Workloads with a predictable baseline and unpredictable traffic bursts |

Elastic Instance

Elastic Instances scale automatically with request volume and are released when idle. Setting the minimum number of instances to 0 gives you a pure pay-as-you-go model — you pay only for active usage.

Cold start behavior: Cold starts occur when instances scale from zero. To reduce cold start latency, set the minimum number of instances to 1 or more. This pre-allocates elastic resources so instances are ready to handle incoming requests quickly.

Billing: Costs include charges for instances in both the active and Shallow Hibernation states. In Shallow Hibernation, vCPU resources are not charged and GPU resources are billed at one-fifth of the active rate. If you set the minimum number of instances to 1 or more, enable Shallow Hibernation to reduce idle costs.
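
A back-of-the-envelope estimate shows why Shallow Hibernation matters for a GPU instance kept warm with minimum instances = 1. The unit prices below are placeholders, not real rates; only the billing rule from the paragraph above (idle vCPU free, idle GPU at one-fifth of the active rate) is taken from this topic.

```python
# Hypothetical daily cost for one warm GPU elastic instance.
# vcpu_rate and gpu_rate are made-up $/hour figures for illustration.
def daily_cost(active_hours: float, idle_hours: float,
               vcpu_rate: float = 0.10, gpu_rate: float = 1.00) -> float:
    active = active_hours * (vcpu_rate + gpu_rate)
    # Shallow Hibernation: vCPU is not charged, GPU is billed at 1/5 rate.
    idle = idle_hours * (gpu_rate / 5)
    return active + idle

# 4 active hours plus 20 hibernated hours:
cost = daily_cost(active_hours=4, idle_hours=20)
```

With these placeholder rates, the 20 hibernated hours cost the same as roughly 3.6 fully active hours would, rather than 20.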

Use Elastic Instances when:

  • Your traffic is variable, bursty, or low-frequency

  • You want to pay only for actual usage

  • Your workload can tolerate occasional cold start latency (or you mitigate it with a minimum instance count)

Provisioned Instance

Provisioned Instances apply only to GPU Functions. Purchase a Provisioned Resource Pool in advance, then allocate a specific number and type of instances to your function. This eliminates cold starts within your allocated capacity and gives you predictable, fixed costs.

After purchasing a monthly provisioned resource pool, the platform allocates a quota of boost instances in addition to your subscription-based provisioned instances, at no extra charge.

Cold start behavior: None. All requests within your allocated capacity receive a real-time response. Maximum concurrent requests = (Number of allocated Provisioned Instances) × (Instance concurrency) + (Boost instance quota). Requests that exceed this limit are throttled.
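
A quick sanity check of the throttling formula, with made-up numbers (the function below is illustrative, not an API):

```python
# Maximum concurrent requests before throttling, per the formula above.
def max_concurrent_requests(provisioned_instances: int,
                            instance_concurrency: int,
                            boost_quota: int) -> int:
    return provisioned_instances * instance_concurrency + boost_quota

# 4 provisioned instances, concurrency 8 each, plus a boost quota of 2:
limit = max_concurrent_requests(4, 8, 2)  # request 35 would be throttled
```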

Billing: The total subscription fee for all purchased Provisioned Resource Pools. The boost instance quota is not billed.

Provisioned Instances are available only for GPU Functions in the Ada, Ada.2, Ada.3, Hopper, or Xpu.1 series.

Use Provisioned Instances when:

  • Your workload is latency-sensitive and interactive (for example, a real-time chatbot or image generation API)

  • Your traffic is steady and predictable

  • You need guaranteed capacity and consistent response times

Provisioned + Elastic Instances (Mixed Mode)

Mixed Mode applies only to GPU Functions. It combines Provisioned and Elastic Instances: the provisioned pool handles steady-state traffic first, and elastic instances automatically scale out when requests exceed the provisioned capacity. This gives you a guaranteed baseline with the flexibility to absorb sudden traffic bursts.

Cold start behavior: Partial. Requests handled within the provisioned pool have no cold start. Requests that trigger auto-scaling to new elastic instances experience a cold start.

Billing: The provisioned portion is billed against your purchased Provisioned Resource Pool quota. Elastic instances launched beyond the provisioned quota are billed on a pay-as-you-go basis, at the same rates as active and Shallow Hibernation elastic instances.
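
The billing split can be sketched as a simple overflow calculation. All figures below are hypothetical; only the rule that demand within the provisioned pool is covered by the subscription, with the excess billed pay-as-you-go, comes from this topic.

```python
# Illustrative split of GPU instance-hours between the subscription pool and
# pay-as-you-go elastic overflow in Mixed Mode.
def mixed_mode_split(demand_hours: float,
                     provisioned_capacity_hours: float) -> tuple:
    covered = min(demand_hours, provisioned_capacity_hours)   # subscription quota
    elastic = max(0.0, demand_hours - provisioned_capacity_hours)  # pay-as-you-go
    return covered, elastic

# A burst day: 30 instance-hours of demand against 24 provisioned hours.
covered, elastic = mixed_mode_split(demand_hours=30, provisioned_capacity_hours=24)
```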

Use Mixed Mode when:

  • Your traffic has a predictable baseline but occasional spikes

  • You want stable performance for normal load with the ability to handle burst traffic

  • You need a balance between cost predictability and scaling flexibility