This page covers the core concepts in Function Compute (FC), organized around the three stages of working with an FC function: creating, configuring, and invoking it.
An FC function is the basic unit for resource scheduling and execution in Function Compute. Each function consists of function code and a function configuration.
Create a function
When you create a function, select a function type and a runtime environment, then upload your function code. You can adjust other configuration parameters after creation.
For guidance on which approach fits your use case, see Technical selection guide.
Function types
Event function
An event function responds to triggers from Alibaba Cloud services such as Object Storage Service (OSS), Kafka, and Simple Log Service (SLS). Write your handler based on the interfaces defined by Function Compute. See Create an event-triggered function.
Web function
A web function uses a popular web framework — Flask, ThinkPHP, Express, or SpringBoot — and exposes an HTTP interface. Write your handler based on the framework's interfaces. See Create a web function.
Task function
A task function processes asynchronous requests in task mode. The system tracks the status of each task, and you can manually start or stop tasks. This function type suits offline workloads such as scheduled jobs, audio and video processing, and batch data processing. See Create a task function.
GPU function
A GPU function runs GPU-accelerated workloads such as Stable Diffusion WebUI, ComfyUI, RAG, or TensorRT. Deploy these projects as container images. See Create a GPU-accelerated function.
Runtime environment
The runtime environment determines how Function Compute executes your function code.
Built-in runtime
A built-in runtime (also called a predefined runtime) is preconfigured by the Function Compute platform. Write your function handler based on the interfaces defined by Function Compute. Built-in runtimes offer the fastest cold starts but do not support multiple concurrent requests per instance. They are best suited for event-triggered scenarios using OSS, Kafka, or SLS triggers. See Function Compute runtimes.
Event (`event`): Event data is passed to your function as a JSON document. The runtime converts it into an object and passes it to the `event` parameter of your function handler. If the event originates from another Alibaba Cloud service, its format follows that service's specification. See Trigger event formats.
Context (`context`): When FC runs your function, it passes a context object to the `context` parameter of your function handler. This object contains information about the invocation, service, function, tracing, and execution environment. See Context.
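As a rough illustration of how the two parameters reach a handler, here is a minimal Python sketch. The function name `handler`, the sample payload, and the `request_id` attribute on the stand-in context are assumptions made for this example; check the runtime documentation for the exact fields available.

```python
import json
from types import SimpleNamespace


def handler(event, context):
    """Decode the JSON event and echo one field with the request ID."""
    # The event may arrive as bytes or str depending on the runtime; handle both.
    if isinstance(event, (bytes, bytearray)):
        event = event.decode("utf-8")
    payload = json.loads(event)
    # request_id is an assumed attribute; fall back if it is absent.
    request_id = getattr(context, "request_id", "unknown")
    return json.dumps({"request_id": request_id, "name": payload.get("name")})


if __name__ == "__main__":
    # Local stand-in for the context object that FC would supply.
    fake_context = SimpleNamespace(request_id="req-123")
    print(handler(b'{"name": "demo"}', fake_context))
```

Running the file locally with a stand-in context is a convenient way to exercise the handler logic before deploying it.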
Custom runtime
A custom runtime supports mainstream web frameworks such as Flask, ThinkPHP, Express, and SpringBoot. Your deployment package is a ZIP file containing an HTTP server program. Set the Start Command and Start Arguments in the function configuration to launch the HTTP server. See How it works.
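The HTTP server program in the deployment package can be very small. The sketch below uses only the Python standard library instead of a web framework. The listening port 9000, the file name `server.py`, and a Start Command such as `python3 server.py` are assumptions for illustration; use the port and command that your function configuration specifies.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Return a fixed plain-text body for every GET request.
        body = b"hello from a custom runtime"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress the default per-request logging to stderr.
        pass


def make_server(port=9000):
    """Create the HTTP server; port 9000 is an assumed default."""
    return HTTPServer(("0.0.0.0", port), Handler)


if __name__ == "__main__":
    make_server().serve_forever()
```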
Custom image
The custom image feature lets you use a container image as the delivery artifact for your function. Upload a custom image to Container Registry (ACR) or use a sample image. Custom images are the only runtime type that supports GPU functions. See Create a function that uses a custom image.
Instance types
Provisioned instance
A provisioned instance is bound to a GPU function when you purchase a provisioned resource pool. Provisioned resource pools are available only for GPU functions and billed by monthly subscription. This billing model lets you reserve GPU resources in advance to ensure stable operations. After binding, allocate provisioned instances to handle incoming requests.
Elastic instance
An elastic instance scales automatically based on invocation volume. FC creates instances as traffic increases and destroys them as it drops. The first invocation after a scale-to-zero event must wait for a cold start.
Elastic instance (light hibernation, formerly idle)
When the minimum instance count is set to 1 or higher and the light hibernation switch is enabled, idle instances automatically enter light hibernation. In this state, the system freezes some instance resources and charges only a minimal keep-alive fee. When a new request arrives, the frozen resources are restored and the instance becomes active. This hot start typically takes more than 2 seconds, depending on model size.
Elastic instances enter the light hibernation billing state in these scenarios:
The instance is idle after you set the minimum instance count to 1 or higher and enable the light hibernation switch.
The instance is in its keep-alive period in a session affinity scenario.
The instance is not processing background tasks in a background task scenario.
Elastic instance (active)
Elastic instances enter the active billing state in these scenarios:
The instance is started by a request when the minimum instance count is not set (default behavior: scale to zero when idle).
The instance becomes active to handle requests when the minimum instance count is set to 1 or higher, regardless of whether the light hibernation switch is enabled.
The instance is processing requests in a session affinity scenario.
The instance is processing background tasks in a background task scenario.
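The minimum-instance rules above can be summarized as a small decision function. This is a simplified model for reasoning only: the function name, its parameters, and the `"released"` label are hypothetical, and the session affinity and background task scenarios are not modeled.

```python
def billing_state(min_instances: int, light_hibernation: bool, handling_request: bool) -> str:
    """Simplified model of the billing state of one elastic instance."""
    if handling_request:
        # An instance that is processing a request is always active.
        return "active"
    if min_instances >= 1:
        # Idle instances hibernate only when the switch is enabled.
        return "light_hibernation" if light_hibernation else "active"
    # With no minimum instance count, idle instances scale to zero.
    return "released"


if __name__ == "__main__":
    print(billing_state(1, True, False))
    print(billing_state(1, False, False))
    print(billing_state(0, False, True))
```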
Minimum instance count
Setting the minimum instance count to 1 or higher eliminates cold starts for the first request on an elastic instance and guarantees reserved compute capacity. The default value is 0, meaning instances scale to zero when idle.
For services that depend on session affinity — such as WebSocket and gRPC — a minimum instance count of 1 or higher enables session affinity scheduling and persistent connections, which keeps real-time interactions stable.
Configure scheduled or metric-based scaling to dynamically adjust the minimum instance count: raise it during peak periods or when a metric threshold is reached, and lower it when the load drops to maximize resource efficiency. See Instance scaling limits and elastic policies.
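To make the idea of scheduled scaling concrete, the sketch below picks a minimum instance count from a time-of-day schedule. The schedule values, the `PEAK_WINDOWS` name, and the function itself are hypothetical; in practice the policy is configured in FC rather than computed in your code.

```python
from datetime import time

# Hypothetical schedule: (window start, window end, minimum instance count).
PEAK_WINDOWS = [
    (time(9, 0), time(12, 0), 5),
    (time(18, 0), time(22, 0), 10),
]


def minimum_instances_at(now, windows=PEAK_WINDOWS, default=0):
    """Return the minimum instance count that applies at a given time of day."""
    for start, end, count in windows:
        # Each window includes its start time and excludes its end time.
        if start <= now < end:
            return count
    # Outside every window, fall back to the default (scale to zero).
    return default


if __name__ == "__main__":
    print(minimum_instances_at(time(10, 30)))
    print(minimum_instances_at(time(23, 0)))
```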
Cold start reference
The following table summarizes how each instance model affects cold start behavior, to help you choose the right approach:
| Instance model | Cold start behavior | Best for |
|---|---|---|
| Elastic instance (minimum count = 0) | Cold start on every scale-from-zero event | Cost-sensitive, latency-tolerant workloads |
| Elastic instance (minimum count >= 1, light hibernation enabled) | Hot start (typically > 2 s) when restoring from hibernation | GPU inference, large model loading |
| Elastic instance (minimum count >= 1, light hibernation disabled) | No cold start; instance stays active | Low-latency services, WebSocket, gRPC |
| Provisioned instance | No cold start; GPU resources reserved for the subscription term | Production GPU workloads requiring consistent availability |
Configure a function
After creating a function, adjust the following configuration items as needed.
Basic configurations
Instance type
FC provides multiple instance types with different resource specifications. See Instance types.
Temporary disk
Each function instance has a temporary storage disk mounted to the instance's root directory. The disk contents are cleared when the instance is reclaimed. For persistent file storage, mount a NAS file system or an OSS file system instead.
Full-card GPU instances (Tesla series with 16 GB, Ada series with 48 GB) support disk sizes of 30 GB or 60 GB. Other instance types support 512 MB or 10 GB. The 512 MB disk size is free of charge.
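The size rules above can be expressed as a lookup table. In this sketch the category keys `"full_card_gpu"` and `"other"` and both function names are hypothetical labels chosen for the example; sizes are in MB.

```python
# Allowed temporary disk sizes in MB, keyed by a hypothetical instance category.
ALLOWED_DISK_MB = {
    "full_card_gpu": (30 * 1024, 60 * 1024),  # 30 GB or 60 GB
    "other": (512, 10 * 1024),                # 512 MB or 10 GB
}


def is_valid_disk_size(instance_kind: str, size_mb: int) -> bool:
    """Check whether a disk size is allowed for the given instance category."""
    return size_mb in ALLOWED_DISK_MB.get(instance_kind, ())


def is_free_disk_size(size_mb: int) -> bool:
    """Only the 512 MB size is free of charge."""
    return size_mb == 512


if __name__ == "__main__":
    print(is_valid_disk_size("other", 512), is_free_disk_size(512))
    print(is_valid_disk_size("full_card_gpu", 512))
```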
Trigger
Some Alibaba Cloud services can invoke FC functions directly through triggers. When a specific event occurs, the service pushes the event to FC and the function is invoked immediately. A single function can have multiple triggers, each acting as an independent client. Each event FC passes to your function contains data from exactly one trigger. See Introduction to triggers.
Runtime
FC supports multiple programming languages through runtimes. A runtime provides a language-specific execution environment that relays invocation events, context information, and responses between FC and your function. See Introduction to runtimes.
Environment variable
Environment variables are stored as string key-value pairs in the function configuration. Each function has its own independent set of environment variables. Use environment variables to change function behavior without modifying code. See Configure environment variables.
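For example, a function can read a setting from its environment at run time, so changing the variable in the function configuration changes behavior without a code change. The variable name `LOG_LEVEL` and the helper below are assumptions chosen for illustration.

```python
import os


def get_log_level(default: str = "INFO") -> str:
    """Read the hypothetical LOG_LEVEL variable, falling back to a default."""
    # Environment variable values are always strings.
    return os.environ.get("LOG_LEVEL", default).upper()


def handler(event, context):
    # Behavior depends on configuration, not on the code itself.
    return "log level is " + get_log_level()


if __name__ == "__main__":
    print(handler(None, None))
```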
Layer
A layer is a .zip file that contains additional code or other content — typically libraries, a custom runtime, data files, or configuration files. FC provides official public layers and supports creating custom layers.
Using layers provides three benefits:
Smaller deployment packages: Moving dependencies into a layer reduces the size of your function's deployment package and speeds up code deployment.
Separation of concerns: Update function dependencies without touching function code, and vice versa.
Shared dependencies: Add a single layer to any number of functions in your account instead of bundling the same dependencies into every deployment package.
Functions using a custom image (Custom Container) do not support layers. Package your runtime, libraries, and other dependencies directly into the container image when you create a function that uses a custom image.
Permissions
Grant a function access to other Alibaba Cloud services by assigning a Resource Access Management (RAM) role. FC uses the RAM role to generate temporary credentials and passes them to your function code. See Use a function role to grant a function permissions to access other Alibaba Cloud services.
Logs
FC integrates with Simple Log Service (SLS). After configuring logging, FC automatically collects function logs and delivers them to the specified Logstore. See Configure the logging feature.
Network
By default, a function can access the internet but cannot access resources within a Virtual Private Cloud (VPC). To access VPC resources or allow a specific VPC to invoke the function, configure network settings and permissions for that function. See Configure network settings.
Storage
FC supports mounting Apsara File Storage NAS file systems and OSS buckets. See Configure a NAS file system and Configure OSS access.
Asynchronous configuration
FC executes asynchronous requests in task mode. In task mode, the system records the execution status of each task at every stage and provides task status queries, task queue metrics, task deduplication, and proactive task termination.
Asynchronous task mode suits long-running workloads. It is not suitable for latency-sensitive processing (response times under 100 ms) or workloads that continuously submit thousands or more tasks per second. See Asynchronous invocation.
Lifecycle
Function instances are dynamically created and destroyed based on real-time request volume. Each function instance goes through three lifecycle stages: Creating, Invoke, and Destroy. See Configure an instance lifecycle hook.
Health check
FC supports periodic health checks for web function and GPU function instances. Health checks prevent requests from being routed to unhealthy instances, reducing request failures. See Configure a health check for an instance.
DNS
The custom DNS feature accelerates site access. It supports only built-in runtimes and custom runtimes. See Configure custom DNS.
Custom domain name
Bind a custom domain name to a function or application to access it through a stable URL. The custom domain name can also serve as an origin for a CDN-accelerated domain, reducing access latency for your users. See Configure a custom domain name.
Invoke a function
After you deploy an FC function, you can invoke it in several ways: test it with a sample event in the Function Compute console, call it through an SDK or the API, send requests to its function URL (an HTTP or HTTPS endpoint), or trigger it from an event source. The following sections describe concepts related to function invocation.
Synchronous invocation
FC processes the event and returns the result directly in the response. The caller waits until the function finishes executing. See Synchronous invocations.
Asynchronous invocation
FC accepts the event and returns a response immediately, without waiting for the background task to finish. The system reliably processes the event but does not return specific invocation details or function execution status. To retrieve the result of an asynchronous invocation, configure an asynchronous invocation destination. See Feature overview.
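The difference between the two invocation modes can be modeled locally with a thread pool. This is a conceptual sketch, not the FC API: `invoke_sync`, `invoke_async`, and the `destination` callback are hypothetical names, with the callback standing in for a configured asynchronous invocation destination.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=2)


def invoke_sync(fn, event):
    """The caller blocks and receives the function result directly."""
    return fn(event)


def invoke_async(fn, event, destination):
    """The caller gets a request ID at once; the result goes to the destination."""
    request_id = str(uuid.uuid4())

    def run():
        # The background task delivers its result to the destination callback.
        destination(request_id, fn(event))

    # The future is returned only so a local demo can wait for completion.
    future = _pool.submit(run)
    return request_id, future


if __name__ == "__main__":
    results = {}
    rid, fut = invoke_async(lambda e: e * 2, 21, lambda r, v: results.update({r: v}))
    fut.result(timeout=5)
    print(invoke_sync(lambda e: e + 1, 1), results[rid])
```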
Invocation analysis
The invocation analysis feature summarizes execution status at the request level. When enabled, FC collects metric information for each function execution. See Request-level metric logs.
Maximum instance count
The maximum instance count limits how many instances the function can run simultaneously. The default maximum is 100 instances per Alibaba Cloud account per region. The actual limit is shown in Quota Center. To raise the limit, submit a request through Quota Center.
Concurrency per instance
Concurrency per instance is the number of requests a single instance can process simultaneously. When creating a function with a custom runtime or a container image, configure multiple concurrent requests per instance to reduce execution time, lower the total number of instances, and improve resource utilization. See Configure concurrency.
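A quick calculation shows why higher concurrency per instance lowers the instance count. The sketch below is a back-of-the-envelope estimate that ignores scaling delays; the function name is hypothetical, and the cap of 100 is used only as an example value for the maximum instance count.

```python
import math


def instances_needed(concurrent_requests: int, concurrency_per_instance: int,
                     max_instances: int = 100) -> int:
    """Estimate how many instances serve a load, capped at the maximum count."""
    if concurrency_per_instance < 1:
        raise ValueError("concurrency_per_instance must be at least 1")
    needed = math.ceil(concurrent_requests / concurrency_per_instance)
    # The function cannot run more instances than its maximum instance count.
    return min(needed, max_instances)


if __name__ == "__main__":
    print(instances_needed(250, 1))   # one request per instance
    print(instances_needed(250, 10))  # ten requests per instance
```

With 250 simultaneous requests, a concurrency of 1 hits the example cap of 100 instances, while a concurrency of 10 needs only 25.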
Other concepts
Version
Publishing a version saves the current function code and configuration as an immutable baseline. This baseline excludes resource properties such as triggers, asynchronous task configurations, and elastic policies. A version is analogous to a git commit: it captures a snapshot of your code and configuration at a point in time. See Version management.
Alias
An alias is a pointer to a specific function version. When a function is invoked through an alias, FC resolves it to the target version transparently — callers do not need to know which version the alias points to. Use aliases to implement canary releases, rollbacks, and staged deployments. An alias is analogous to a git tag that marks a commit for release. See Alias management.
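A canary release through an alias can be pictured as weighted routing between two versions. The sketch below is a conceptual model, not the FC implementation: the dictionary keys `primary`, `canary`, and `canary_weight` and the function name are hypothetical.

```python
import random


def resolve_alias(alias: dict, rng=random.random) -> str:
    """Pick the version an invocation is routed to for a given alias."""
    canary = alias.get("canary")
    weight = alias.get("canary_weight", 0.0)
    # A share of traffic equal to canary_weight goes to the canary version.
    if canary is not None and rng() < weight:
        return canary
    return alias["primary"]


if __name__ == "__main__":
    prod = {"primary": "1", "canary": "2", "canary_weight": 0.1}
    picks = [resolve_alias(prod) for _ in range(1000)]
    print(picks.count("2"), "of 1000 invocations went to the canary version")
```

Rolling back in this model means setting `canary_weight` to 0 or pointing `primary` back at the earlier version, while callers keep using the same alias name.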
Tag
Tags categorize function resources for easier searching and aggregation. Use tags to group functions and assign different operational permissions to different roles for each group. See Configure tags.