Function Compute: Create a GPU function

Last Updated: Dec 07, 2025

You can deploy applications that require GPU-accelerated instances as functions using container images. This approach is ideal for popular AI projects, such as Stable Diffusion WebUI, ComfyUI, retrieval-augmented generation (RAG), and TensorRT. Using container images to deliver functions improves development and delivery efficiency.

Create a function

  1. Log on to the Function Compute console. In the navigation pane on the left, choose Function Management > Function List.

  2. In the top menu bar, select a region. On the Function List page, click Create Function.

  3. In the dialog box that appears, select GPU Function and then click Create GPU Function.

  4. On the Create GPU Function page, set the following parameters and then click Create.

    • Basic Configurations: Enter a Function Name. The name must be unique within the same Alibaba Cloud account and region, and must follow the naming conventions.

    • Elastic Configurations: Select an instance type. You cannot use provisioned instances and on-demand instances at the same time. After the function is created, you cannot change the instance type.

      • On-demand instances

        | Configuration Item | Description | Example |
        | --- | --- | --- |
        | Instance Type | Select On-demand Instance. Instances scale automatically based on request volume and are released when there are no requests. You are billed for what you use. | On-demand Instance |
        | GPU Card Type | Select a GPU card type. For more information about the specifications supported by different card types, see Instance types and specifications. | Ada series |
        | Specifications | Set the GPU Memory, vCPU, Memory, and Disk specifications for the function based on your business needs. The usage of each resource is calculated by multiplying its specification by the duration of use. For more information, see Billing overview. | GPU Memory: 48 GB; vCPU: 8; Memory: 64 GB; Disk: 512 MB (not billed; Function Compute provides a free quota of 10 GB of disk space) |
        | Minimum Instances | If your business is latency-sensitive, we recommend that you set the minimum number of instances to 1 or greater after you select On-demand Instance. This reserves resources in advance and reduces cold start latency. | 1 |
        | Concurrency Per Instance | A single GPU function instance can process multiple requests simultaneously. Set the number of concurrent requests per instance. For more information, see Configure concurrency per instance. | |

        Note (Specifications)
        • All directories on the disk are writable. The disk space is shared.

        • The disk shares the lifetime of the underlying function instance. When the system reclaims the instance, data on the disk is lost. If you need persistent storage, you can mount a NAS file system or an OSS bucket. For more information, see Configure a NAS file system and Configure Object Storage Service.

        Note (Minimum Instances)
        • After you set Minimum Instances to 1 or more, that value is the current minimum number of instances whenever no elastic policy for the minimum number of instances is configured or active.

        • If multiple elastic policies are configured, the system calculates the minimum number of instances required when each policy is triggered. The system then uses the highest value among the active policies as the current minimum number of instances.

        • For more information, see How is the current minimum number of instances calculated?.
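The rule for the current minimum number of instances described in the Minimum Instances note can be sketched roughly as follows. This is an illustration only, not the platform's implementation: the function and parameter names are hypothetical, and it follows the text literally (the configured value applies only when no policy is active; otherwise the highest value among active policies applies). Whether the configured value also acts as a floor while policies are active is not stated here, so the sketch does not assume it.

```python
from typing import Iterable


def current_minimum_instances(configured_min: int,
                              active_policy_mins: Iterable[int]) -> int:
    """Return the minimum instance count currently in effect.

    configured_min: the Minimum Instances value set on the function.
    active_policy_mins: the minimum required by each elastic policy that is
        active right now (empty when no policy is configured or active).
    """
    policy_values = list(active_policy_mins)
    if not policy_values:
        # No active policy: the value configured on the function applies.
        return configured_min
    # One or more active policies: the highest requested value wins.
    return max(policy_values)
```

For example, with Minimum Instances set to 1 and two active policies that require 2 and 5 instances, the sketch returns 5. With no active policy it returns 1.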

      • Provisioned instances

        | Configuration Item | Description | Example |
        | --- | --- | --- |
        | Instance Type | Select Provisioned Instance. Instances are allocated to the function from a pre-purchased provisioned resource pool. Provisioned instances are recommended when predictable costs, low latency, and high resource utilization are important for business stability. | Provisioned Instance |
        | Provisioned Resource Pool | A provisioned resource pool is a pool of provisioned instances that can be allocated to the target function. If your provisioned resource pool has insufficient capacity, click Scale-out in the Actions column and follow the on-screen instructions to expand it. For more information, see Provisioned resource pools (subscription). | Provisioned Resource Pool: fc-pool-****; GPU Card Type: Ada |
        | Specifications | Set the GPU Memory, vCPU, Memory, and Disk specifications for the function based on your business needs. The usage of each resource is calculated by multiplying its specification by the duration of use. For more information, see Billing overview. | GPU Memory: 48 GB; vCPU: 8; Memory: 64 GB; Disk: 512 MB (not billed; Function Compute provides a free quota of 10 GB of disk space) |
        | Number Of Provisioned Instances | Allocate a number of provisioned instances to the target function based on the resources available in the provisioned resource pool. | 1 |
        | Concurrency Per Instance | A single GPU function instance can process multiple requests simultaneously. Set the number of concurrent requests per instance. For more information, see Configure concurrency per instance. | 20 |

        Note (Specifications)
        • All directories on the disk are writable. The disk space is shared.

        • The disk shares the lifetime of the underlying function instance. When the system reclaims the instance, data on the disk is lost. If you need persistent storage, you can mount a NAS file system or an OSS bucket. For more information, see Configure a NAS file system and Configure Object Storage Service.
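The Specifications rows state that the usage of each resource is the specification multiplied by the duration of use. A minimal sketch of that arithmetic, using the example specifications from the tables, might look like this. The class and field names, and the choice of seconds as the time unit, are assumptions made for illustration. No prices are included because none are given here; see Billing overview for the actual billing rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Specs:
    """Per-instance specifications (hypothetical container type)."""
    gpu_memory_gb: float
    vcpu: float
    memory_gb: float


def resource_usage(specs: Specs, duration_seconds: float) -> dict:
    """Multiply each specification by the duration of use."""
    return {
        "gpu_memory_gb_seconds": specs.gpu_memory_gb * duration_seconds,
        "vcpu_seconds": specs.vcpu * duration_seconds,
        "memory_gb_seconds": specs.memory_gb * duration_seconds,
    }


# Example values from the tables: 48 GB GPU memory, 8 vCPUs, 64 GB memory,
# used for an assumed 120 seconds.
example = resource_usage(Specs(gpu_memory_gb=48, vcpu=8, memory_gb=64),
                         duration_seconds=120)
print(example)
```

Disk is left out of the sketch because the example disk size is listed as not billed.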

    • Function Code: Configure the function's runtime environment and code.

      | Configuration Item | Description | Example |
      | --- | --- | --- |
      | Runtime Environment | Choose one of the following options. Use Sample Image: Select a sample image provided by Function Compute to quickly deploy an image-based function. Select the target image from the image list under the Container Image configuration item. Use Image from ACR: Under the Container Image configuration item, click Select Image From ACR. In the Select Container Image panel, select the created Container Registry instance and ACR image repository. Then, find the target image in the image area below and click Select in the Actions column. For more information, see Create a function that uses a custom image. | Custom Image > Use Sample Image |
      | Container Image | Select the target image. | SpringBoot Web Application Sample Image |
      | Startup Command | The startup command for the program. If you do not configure a startup command, the Entrypoint/CMD from the image is used by default. | None |
      | Listener Port | The port that the HTTP server in your code listens on. | 9000 |
      | Execution Timeout | Set the timeout period. The default Execution Timeout is 60 seconds, and the maximum is 86400 seconds. | 60 |
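The Listener Port setting must match the port that the HTTP server inside the image actually binds to. As a rough illustration, a minimal server built on Python's standard library could look like the sketch below. The handler, the `make_server` and `main` helpers, and the `LISTEN_PORT` variable name are all hypothetical. The request paths and methods that Function Compute sends are not described here, so the sketch simply answers every GET and POST with the same body.

```python
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    def _respond(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        self._respond()

    def do_POST(self):
        # Drain the request body before answering.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        self._respond()

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real service would log requests


def make_server(port: int) -> ThreadingHTTPServer:
    # Bind to all interfaces so the server is reachable from outside the container.
    # ThreadingHTTPServer handles each request in its own thread.
    return ThreadingHTTPServer(("0.0.0.0", port), Handler)


def main():
    # LISTEN_PORT is a hypothetical variable; 9000 matches the Listener Port example.
    make_server(int(os.environ.get("LISTEN_PORT", "9000"))).serve_forever()
```

Calling `main()` from the image's Entrypoint/CMD, or from the Startup Command, would start the server on port 9000. Because the server is threaded, one instance can also serve several requests at once, which is what the Concurrency Per Instance setting relies on.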

    • Instance Prefetch: In AI inference scenarios, you can configure instance prefetch to pre-warm the model. This eliminates the cold start latency for the first request.

      | Configuration Item | Description | Example |
      | --- | --- | --- |
      | Instance Prefetch | Configure an Initializer hook to pre-warm the instance and optimize cold starts. The hook runs a specified script or calls an interface to load the model after the function instance starts but before it processes requests. For more information about Initializer hooks, see Configure the instance lifecycle. | Enabled |
      | Timeout | Set the timeout period for the Initializer hook. | 60 |
      | Prefetch Program Type | You can configure two types of Initializer hooks to pre-warm the model: Execute Instruction and Invoke Code. | Execute Instruction |
      | Instruction Content | Configure the content of the instruction to execute. You can use custom shell implementations, such as /bin/bash, /bin/sh, /bin/csh, and /bin/zsh. Make sure the function's runtime environment supports the selected shell. | See Callback method implementation |

    • Permissions, Network, and Storage: Configure the function's access role, network settings, and storage mounts.

      | Parameter | Description | Example |
      | --- | --- | --- |
      | Function Role | Function Compute uses this RAM role to generate temporary keys for accessing Alibaba Cloud resources and passes the keys to your code. For more information, see Use a function role to grant Function Compute permissions to access other Alibaba Cloud services. | mytestrole |
      | Allow Access To VPC | Enable this option to allow the function to access resources in a VPC. For more information, see Configure network settings. | Enabled |
      | VPC | Required if Allow Access To VPC is enabled. Create a new VPC or select a VPC ID from the drop-down list. | fc.auto.create.vpc.1632317**** |
      | VSwitch | Required if Allow Access To VPC is enabled. Create a new vSwitch or select a vSwitch ID from the drop-down list. | fc.auto.create.vswitch.vpc-bp1p8248**** |
      | Security Group | Required if Allow Access To VPC is enabled. Create a new security group or select a security group from the drop-down list. | fc.auto.create.SecurityGroup.vsw-bp15ftbbbbd**** |
      | Allow Default NIC To Access Public Network | Allow the function to access the public network through the default network interface card. | Enabled |
      | Mount NAS File System | Mount a NAS file system to the function for persistent storage of shared data, such as models shared by multiple inference functions. If you select automatic configuration, the system uses an existing General-purpose NAS file system named Alibaba-Fc-V3-Component-Generated. If a qualifying NAS file system does not exist in your account, the system creates one. For more information, see Configure a NAS file system. | Enabled |
      | Mount OSS Object Storage | Mount an OSS bucket to the function for persistent storage of logs, business files, and other data. For more information, see Configure Object Storage Service (OSS). | Enabled |

      Important

      When you use a static public IP address, you must disable Allow Default NIC To Access Public Network. Otherwise, the configured static public IP address does not take effect. For more information, see Configure a static public IP address.

    • Logs And Tracing Analysis

      | Parameter | Description | Example |
      | --- | --- | --- |
      | Log Feature | Persistently save the function's execution logs to Simple Log Service. This helps with code debugging, troubleshooting, and data analytics. For more information, see Configure the logging feature. Automatic Configuration: Automatically selects a log project that starts with serverless-<region_id>. Only one such log project is created in each region. If the system finds that this log project already exists in the current region, it uses the existing project. Custom Configuration: Manually specify the destination Log Project and Logstore. | Enabled |

    • More Configurations

      | Parameter | Description | Example |
      | --- | --- | --- |
      | Time Zone | Select the time zone for the function. This automatically adds the TZ environment variable to the function with the selected time zone as its value. | UTC |
      | Tags | Set tags for the function to group and manage functions. You must set both a tag key and a tag value. | key : value |
      | Resource Group | Select the resource group for the function. Use resource groups to manage your functions in groups. | Default Resource Group |
      | Environment Variables | Use environment variables to flexibly adjust the function's behavior without changing the code. For more information, see Configure environment variables. | {"BUCKET_NAME": "MY_BUCKET", "TABLE_NAME": "MY_TABLE"} |
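Function code can read the configured environment variables, including the TZ variable added by the Time Zone setting, at startup. The sketch below shows one possible way to do this with the example keys BUCKET_NAME and TABLE_NAME. The `Settings` class and `load_settings` helper are hypothetical, and the fallback to UTC when TZ is missing is an assumption of this sketch.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    bucket_name: str
    table_name: str
    time_zone: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read function settings from environment variables."""
    return Settings(
        bucket_name=env["BUCKET_NAME"],  # raises KeyError if the variable is not set
        table_name=env["TABLE_NAME"],
        time_zone=env.get("TZ", "UTC"),  # fallback value is an assumption
    )
```

Passing the mapping as a parameter keeps the helper easy to test: `load_settings({"BUCKET_NAME": "MY_BUCKET", "TABLE_NAME": "MY_TABLE"})` works without touching the real environment.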

Edit a function

After a function is created, you can change its image by editing the runtime on the Configuration tab of the function details page.

For information about other modifications, such as changing environment variables or log storage settings, see Configure a function.

Delete a function

Log on to the Function Compute console. On the Function List page, find the function you want to delete and click Delete in the Actions column. In the dialog box that appears, confirm that the function has no attached resources, such as triggers or elastic policies for minimum instances. Then, confirm the deletion.

Get the function ARN

An Alibaba Cloud Resource Name (ARN) is used to identify an Alibaba Cloud resource in your code. You can obtain the ARN of a function to reference it.

  1. Log on to the Function Compute console. In the navigation pane on the left, choose Function Management > Function List.

  2. In the top menu bar, select a region. Then, on the Function List page, click the name of the function.

  3. On the Function Details page, click Copy ARN on the right to obtain the ARN of the target function.
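Once copied, the ARN can be split into its parts in code. The sketch below assumes the colon-separated layout `acs:<service>:<region>:<account-id>:<resource>`. That layout is not stated on this page, so compare it with the value that Copy ARN returns before relying on it. The `parse_arn` helper and the sample ARN in the usage note are hypothetical.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    prefix: str
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(arn: str) -> Arn:
    """Split an ARN into five fields; the layout is an assumption of this sketch."""
    # Split at most 4 times so that any ":" inside the resource part is preserved.
    parts = arn.split(":", 4)
    if len(parts) != 5 or parts[0] != "acs":
        raise ValueError(f"unexpected ARN format: {arn!r}")
    return Arn(*parts)
```

For a made-up value such as `acs:fc:cn-hangzhou:123456789:functions/my-gpu-func`, `parse_arn(...).region` would be `cn-hangzhou` and `.resource` would be `functions/my-gpu-func`.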

