Function Compute provides elastic instances and GPU-accelerated instances. This topic compares the two instance types and describes their specifications, usage notes, and modes.

Instance types

  • Elastic instances: the basic instance type of Function Compute. Elastic instances are suitable for scenarios with traffic bursts and for compute-intensive scenarios.
  • GPU-accelerated instances: Powered by the Turing architecture, GPU-accelerated instances are suitable for audio and video processing, AI, and image processing scenarios. GPU hardware accelerates the workloads in these scenarios to improve processing efficiency.
    Important
    • GPU-accelerated instances can be deployed only by using container images.
    • To ensure that your business runs properly, join the DingTalk group 11721331 to apply for permissions on GPU-accelerated instances. You must provide the following information:
      • The name of the organization, such as the name of your company.
      • The ID of your Alibaba Cloud account.
      • The region where you want to use GPU-accelerated instances, such as China (Shenzhen).
      • The contact information, such as your mobile number, email address, or DingTalk account.
      • The size of your image.

Instance specifications

  • Elastic instances

    The following table describes the specifications of elastic instances. You can select instance specifications based on your business requirements.

    | vCPU | Memory size (MB) | Maximum code package size (GB) | Maximum function execution duration (s) | Maximum disk size | Maximum bandwidth (Gbit/s) |
    |---|---|---|---|---|---|
    | 0.05 to 16, in multiples of 0.05 | 128 to 32768, in multiples of 64 | 10 | 86400 | 512 MB (default) or 10 GB | 5 |

    Note The ratio of vCPUs to memory size (in GB) must be in the range of 1:1 to 1:4.
  • GPU-accelerated instances

    The following table describes the specifications of GPU-accelerated instances. You can select instance specifications based on your business requirements.

    | vCPU | Memory size (GB) | vGPU upper limit (computing power) | GPU memory range (GB) | Image size (GB) | Maximum function execution duration (s) | Maximum disk size (GB) | Maximum bandwidth (Gbit/s) |
    |---|---|---|---|---|---|---|---|
    | 1 | 4 | GPU/8 | 1~2 | 10 | 86400 | 10 | 5 |
    | 2 | 8 | GPU/4 | 3~4 | 10 | 86400 | 10 | 5 |
    | 4 | 16 | GPU/2 | 5~8 | 10 | 86400 | 10 | 5 |
    | 8 | 32 | GPU/1 | 9~16 | 10 | 86400 | 10 | 5 |

    The image size limit is 10 GB for Container Registry Enterprise Edition (Standard Edition, Advanced Edition, and Basic Edition) and for Container Registry Personal Edition (free).
    Note
    • The value of GPU memory must be an integer. Unit: GB.
    • GPU-accelerated instances are supported in the following regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Chengdu), Japan (Tokyo), and US (Virginia).
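The elastic-instance constraints above (vCPU in multiples of 0.05, memory in multiples of 64 MB, and a vCPU-to-memory ratio between 1:1 and 1:4) can be checked before you create a function. The following Python sketch is illustrative only; the function name and interface are not part of Function Compute.

```python
def validate_elastic_spec(vcpu: float, memory_mb: int) -> bool:
    """Check an elastic-instance specification against the documented limits.

    Rules from the specification table:
      - vCPU: 0.05 to 16, in multiples of 0.05
      - memory: 128 to 32768 MB, in multiples of 64
      - vCPU-to-memory (GB) ratio in the range of 1:1 to 1:4
    """
    if not 0.05 <= vcpu <= 16:
        return False
    # Compare in integer units of 0.01 vCPU to avoid floating-point rounding.
    if round(vcpu * 100) % 5 != 0:
        return False
    if not 128 <= memory_mb <= 32768 or memory_mb % 64 != 0:
        return False
    ratio = (memory_mb / 1024) / vcpu  # memory (GB) per vCPU
    return 1 <= ratio <= 4

# 1 vCPU with 2048 MB (2 GB) gives a 1:2 ratio, which is valid;
# 1 vCPU with 512 MB (0.5 GB) gives a 1:0.5 ratio, which is not.
```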

For more information about the limits of Function Compute, see Limits.

Usage notes

If you want to reduce the cold start duration or improve resource utilization, you can use the following solutions.

  • Provisioned mode: the ideal solution to resolve the cold start issue. You can reserve a fixed number of instances based on your resource budget, reserve resources for a specified period of time based on business fluctuations, or select an auto scaling policy based on usage thresholds. The average cold start latency of instances is significantly reduced when the provisioned mode is used.
  • High concurrency for a single instance: the ideal solution to improve resource utilization of instances. We recommend that you configure high concurrency for instances based on the resource demands of your business. If you use this solution, the CPU and memory are preemptively shared when multiple tasks are executed on one instance at the same time. This way, resource utilization is improved.

Instance modes

GPU-accelerated instances and elastic instances support the following usage modes.

On-demand mode

In on-demand mode, Function Compute automatically allocates and releases instances for functions. In this mode, the billed execution duration starts when a request is sent to execute a function and ends when the request is completely executed. An on-demand instance can process one or more requests at a time. For more information, see Specify the instance concurrency.

  • Execution duration when a single instance processes one request at a time (instance concurrency = 1)

    When an on-demand instance processes a single request, the billed execution duration starts when the request arrives at the instance and ends when the request is completely executed.
  • Execution duration when a single instance processes multiple requests concurrently (instance concurrency > 1)

    When an on-demand instance processes multiple requests concurrently, the billed execution duration starts when the first request arrives at the instance and ends when the last request is completely executed. Because the requests reuse the resources of one instance, resource costs can be reduced.
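The billing rule for concurrent requests can be expressed as simple arithmetic: one instance is billed from the arrival of the first request to the completion of the last, rather than for the sum of the individual request durations. The following sketch illustrates this rule in simplified form (it ignores any minimum billing granularity; the function name is illustrative, not a Function Compute API).

```python
def billed_duration(requests):
    """Billed execution window for one on-demand instance that processes
    overlapping requests, where `requests` is a list of (start, end)
    timestamps in seconds: first arrival to last completion.
    """
    starts = [start for start, _ in requests]
    ends = [end for _, end in requests]
    return max(ends) - min(starts)

# Three overlapping requests handled by one instance:
reqs = [(0.0, 3.0), (1.0, 4.0), (2.0, 5.0)]
window = billed_duration(reqs)            # 5.0 seconds billed
per_request = sum(e - s for s, e in reqs)  # 9.0 seconds if billed separately
```

In this example, concurrency reduces the billed duration from 9 seconds (three separate 3-second executions) to the 5-second window in which the instance was actually busy.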

Provisioned mode

In provisioned mode, you allocate, manage, and release function instances yourself. For more information, see Configure provisioned instances and auto scaling rules. In provisioned mode, the billed execution duration starts when Function Compute starts a provisioned instance and ends when you release the instance. You are charged for the instance until you release it, regardless of whether the instance executes requests.
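In contrast to on-demand billing, the provisioned-mode rule depends only on the instance's lifetime, not on the requests it serves. The following simplified sketch illustrates this (it ignores pricing tiers and billing granularity; the function name is illustrative, not a Function Compute API).

```python
def provisioned_billed_seconds(start_time, release_time):
    """Billed duration for a provisioned instance: from the moment Function
    Compute starts the instance until you release it, whether or not the
    instance serves any requests. Times are in seconds.
    """
    return release_time - start_time

# An instance kept for one hour is billed for the full hour,
# even if it served zero requests during that time.
billed = provisioned_billed_seconds(0, 3600)  # 3600 seconds
```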
