
Function Compute: Instance types and usage modes

Last Updated: Feb 02, 2024

Function Compute provides elastic instances and GPU-accelerated instances. This topic describes the types, specifications, usage notes, and usage modes of the instances.

Instance types

  • Elastic instance: the basic instance type of Function Compute. Elastic instances are suitable for scenarios with bursty traffic or compute-intensive workloads.

  • GPU-accelerated instance: provides GPU acceleration based on the Ampere and Turing architectures. GPU-accelerated instances are typically used in scenarios such as audio and video processing, AI, and image processing. They accelerate workloads by offloading processing to GPU hardware, which improves processing efficiency.

    Important
    • GPU-accelerated instances can be deployed only by using container images.

    • For optimal user experience, join the DingTalk group 11721331 and provide the following information:

      • The organization name, such as the name of your company.

      • The ID of your Alibaba Cloud account.

      • The region in which you want to use GPU-accelerated instances. Example: China (Shenzhen).

      • The contact information, such as your mobile number, email address, or DingTalk account.

Instance specifications

  • Elastic instances

    The following table describes the specifications of elastic instances. You can select instance specifications based on your business requirements.

    | vCPU (cores) | Memory size (MB) | Maximum code package size (GB) | Maximum function execution duration (seconds) | Maximum disk size (GB) | Maximum bandwidth (Gbit/s) |
    | --- | --- | --- | --- | --- | --- |
    | 0.05~16. The value must be a multiple of 0.05. | 128~32768. The value must be a multiple of 64. | 10 | 86400 | 512 MB (default) or 10 GB | 5 |

    Note

    The ratio of vCPU to memory capacity (in GB) must range from 1:1 to 1:4. A sketch that checks these constraints is provided after this list.

  • GPU-accelerated instances

    The following table describes the specifications of GPU-accelerated instances. You can select instance specifications based on your business requirements.

    | Instance specification | Card type | vGPU memory (MB) | vGPU computing power (cards) | vCPU (cores) | Memory size (MB) |
    | --- | --- | --- | --- | --- | --- |
    | fc.gpu.tesla.1 | Tesla T4 | 1024~16384 (1 GB to 16 GB). The value must be a multiple of 1024. | vGPU computing power = vGPU memory (GB)/16. For example, a vGPU memory of 5 GB provides at most 5/16 of a card. Computing power is allocated automatically by Function Compute; you do not need to allocate it manually. | 0.05 to [vGPU memory (GB)/2]. The value must be a multiple of 0.05. | 128 to [vGPU memory (GB) × 2048]. The value must be a multiple of 64. |
    | fc.gpu.ampere.1 | Ampere A10 | 1024~24576 (1 GB to 24 GB). The value must be a multiple of 1024. | vGPU computing power = vGPU memory (GB)/24. For example, a vGPU memory of 5 GB provides at most 5/24 of a card. Computing power is allocated automatically by Function Compute; you do not need to allocate it manually. | 0.05 to [vGPU memory (GB)/3]. The value must be a multiple of 0.05. | 128 to [(vGPU memory (GB) × 4096)/3]. The value must be a multiple of 64. |

    For the valid vCPU and memory combinations of each vGPU memory value, see the GPU specifications section of this topic.

    The GPU-accelerated instances of Function Compute also support the following resource specifications.

    | Image size (GB) | Maximum function execution duration (seconds) | Maximum disk size (GB) | Maximum bandwidth (Gbit/s) |
    | --- | --- | --- | --- |
    | 10, for Container Registry Enterprise Edition (Standard Edition, Advanced Edition, and Basic Edition) and Container Registry Personal Edition (free of charge) | 86400 | 10 | 5 |

    Note
    • Setting the instance type to g1 achieves the same effect as setting the instance type to fc.gpu.tesla.1.

    • GPU-accelerated instances of the T4 type are supported in the following regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), Japan (Tokyo), and US (Virginia).

    • GPU-accelerated instances of the A10 type are supported in the following regions: China (Hangzhou), China (Shanghai), and Japan (Tokyo).
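The limits above can be checked locally. The following Python sketch validates a candidate configuration against the constraints listed in the two tables: the step sizes (0.05 vCPU, 64 MB of memory, 1024 MB of vGPU memory), the 1:1 to 1:4 vCPU-to-memory ratio for elastic instances, and the vGPU-derived maxima for GPU-accelerated instances. The helper names are hypothetical and are not part of any Function Compute SDK; this is only an illustrative sanity check.

```python
# Illustrative sanity check of the limits in the tables above.
# The helper names are hypothetical; this is not part of any Function Compute SDK.

def is_multiple(value, step, eps=1e-9):
    """Return True if value is a whole multiple of step (tolerant of float error)."""
    ratio = value / step
    return abs(ratio - round(ratio)) < eps

def check_elastic(vcpu, memory_mb):
    """Validate an elastic instance configuration."""
    assert 0.05 <= vcpu <= 16 and is_multiple(vcpu, 0.05), "vCPU: 0.05-16 in steps of 0.05"
    assert 128 <= memory_mb <= 32768 and is_multiple(memory_mb, 64), "memory: 128-32768 MB in steps of 64"
    gb_per_vcpu = (memory_mb / 1024) / vcpu
    assert 1 <= gb_per_vcpu <= 4, "vCPU-to-memory (GB) ratio must be between 1:1 and 1:4"

def check_gpu(spec, vgpu_mb, vcpu, memory_mb):
    """Validate a GPU-accelerated instance configuration."""
    limits = {
        "fc.gpu.tesla.1":  {"max_vgpu_mb": 16384, "vcpu_divisor": 2, "mem_mb_per_vgpu_gb": 2048},
        "fc.gpu.ampere.1": {"max_vgpu_mb": 24576, "vcpu_divisor": 3, "mem_mb_per_vgpu_gb": 4096 / 3},
    }[spec]
    assert 1024 <= vgpu_mb <= limits["max_vgpu_mb"] and is_multiple(vgpu_mb, 1024), "invalid vGPU memory"
    vgpu_gb = vgpu_mb / 1024
    assert 0.05 <= vcpu <= vgpu_gb / limits["vcpu_divisor"] and is_multiple(vcpu, 0.05), "invalid vCPU"
    assert 128 <= memory_mb <= vgpu_gb * limits["mem_mb_per_vgpu_gb"] and is_multiple(memory_mb, 64), "invalid memory"

check_elastic(vcpu=2, memory_mb=4096)                                # 2 vCPUs, 4 GB (ratio 1:2)
check_gpu("fc.gpu.tesla.1", vgpu_mb=8192, vcpu=2, memory_mb=8192)    # matches the 8192 MB row below
```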

GPU specifications

The following table describes the details of the fc.gpu.tesla.1 instance specification.

| vGPU memory (MB) | vCPU (cores) | Maximum memory size (GB) | Memory size (MB) |
| --- | --- | --- | --- |
| 1024 | 0.05~0.5 | 2 | 128~2048 |
| 2048 | 0.05~1 | 4 | 128~4096 |
| 3072 | 0.05~1.5 | 6 | 128~6144 |
| 4096 | 0.05~2 | 8 | 128~8192 |
| 5120 | 0.05~2.5 | 10 | 128~10240 |
| 6144 | 0.05~3 | 12 | 128~12288 |
| 7168 | 0.05~3.5 | 14 | 128~14336 |
| 8192 | 0.05~4 | 16 | 128~16384 |
| 9216 | 0.05~4.5 | 18 | 128~18432 |
| 10240 | 0.05~5 | 20 | 128~20480 |
| 11264 | 0.05~5.5 | 22 | 128~22528 |
| 12288 | 0.05~6 | 24 | 128~24576 |
| 13312 | 0.05~6.5 | 26 | 128~26624 |
| 14336 | 0.05~7 | 28 | 128~28672 |
| 15360 | 0.05~7.5 | 30 | 128~30720 |
| 16384 | 0.05~8 | 32 | 128~32768 |

The following table describes the details of the fc.gpu.ampere.1 instance specification.

| vGPU memory (MB) | vCPU (cores) | Maximum memory size (GB) | Memory size (MB) |
| --- | --- | --- | --- |
| 1024 | 0.05~0.3 | 1.3125 | 128~1344 |
| 2048 | 0.05~0.65 | 2.625 | 128~2688 |
| 3072 | 0.05~1 | 4 | 128~4096 |
| 4096 | 0.05~1.3 | 5.3125 | 128~5440 |
| 5120 | 0.05~1.65 | 6.625 | 128~6784 |
| 6144 | 0.05~2 | 8 | 128~8192 |
| 7168 | 0.05~2.3 | 9.3125 | 128~9536 |
| 8192 | 0.05~2.65 | 10.625 | 128~10880 |
| 9216 | 0.05~3 | 12 | 128~12288 |
| 10240 | 0.05~3.3 | 13.3125 | 128~13632 |
| 11264 | 0.05~3.65 | 14.625 | 128~14976 |
| 12288 | 0.05~4 | 16 | 128~16384 |
| 13312 | 0.05~4.3 | 17.3125 | 128~17728 |
| 14336 | 0.05~4.65 | 18.625 | 128~19072 |
| 15360 | 0.05~5 | 20 | 128~20480 |
| 16384 | 0.05~5.3 | 21.3125 | 128~21824 |
| 17408 | 0.05~5.65 | 22.625 | 128~23168 |
| 18432 | 0.05~6 | 24 | 128~24576 |
| 19456 | 0.05~6.3 | 25.3125 | 128~25920 |
| 20480 | 0.05~6.65 | 26.625 | 128~27264 |
| 21504 | 0.05~7 | 28 | 128~28672 |
| 22528 | 0.05~7.3 | 29.3125 | 128~30016 |
| 23552 | 0.05~7.65 | 30.625 | 128~31360 |
| 24576 | 0.05~8 | 32 | 128~32768 |
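The rows in the two tables above follow from the formulas in the GPU instance specification table. The following Python sketch, which is illustrative only, reproduces the per-row maxima; it assumes (consistent with every listed row) that the formula results are rounded down to the nearest allowed step of 0.05 vCPU and 64 MB of memory.

```python
from fractions import Fraction
from math import floor

# Parameters from the GPU instance specification table above:
# max vCPU = vGPU memory (GB) / divisor, max memory (MB) = vGPU memory (GB) * factor.
CARDS = {
    "fc.gpu.tesla.1":  {"max_vgpu_gb": 16, "vcpu_divisor": 2, "mem_factor": Fraction(2048)},
    "fc.gpu.ampere.1": {"max_vgpu_gb": 24, "vcpu_divisor": 3, "mem_factor": Fraction(4096, 3)},
}

def spec_row(spec, vgpu_gb):
    """Compute one row of the tables above from the formulas.

    Assumption: the listed maxima are the raw formula results rounded down to
    the nearest allowed step (0.05 vCPU, 64 MB of memory).
    """
    card = CARDS[spec]
    max_vcpu = Fraction(1, 20) * floor(Fraction(vgpu_gb, card["vcpu_divisor"]) * 20)
    max_mem_mb = 64 * floor(vgpu_gb * card["mem_factor"] / 64)
    return vgpu_gb * 1024, float(max_vcpu), float(Fraction(max_mem_mb, 1024)), max_mem_mb

for spec, card in CARDS.items():
    print(spec)
    for vgpu_gb in range(1, card["max_vgpu_gb"] + 1):
        vgpu_mb, vcpu, mem_gb, mem_mb = spec_row(spec, vgpu_gb)
        print(f"  vGPU {vgpu_mb} MB: vCPU 0.05~{vcpu}, memory 128~{mem_mb} MB (max {mem_gb} GB)")
```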

Usage notes

If you want to reduce the cold start duration or improve resource utilization, you can use the following solutions:

  • Provisioned mode: the ideal solution to resolve the cold start issue. In this mode, you can reserve a fixed number of instances based on your resource budget, reserve resources for specific periods of time based on business fluctuations, or use an auto scaling policy based on usage thresholds. Using provisioned mode reduces the average cold start latency of your instances.

  • High concurrency on a single instance: the ideal solution to improve instance resource utilization. We recommend that you configure instance concurrency based on the resource demands of your business. With this solution, CPU and memory resources are shared among the requests that an instance processes at the same time, which improves resource utilization.

Usage modes

Elastic instances and GPU-accelerated instances support two usage modes: on-demand mode and provisioned mode.

On-demand mode

In on-demand mode, Function Compute automatically allocates and releases instances for functions. In this mode, the execution duration is billed from the time when a request is sent to execute a function to the time when the request is completely executed. An on-demand instance can process one or more requests at a time. For more information, see Configure instance concurrency.

  • Execution duration when a single instance processes one request at a time

    When an on-demand instance processes a single request, the billed execution duration starts from the time when the request arrives at the instance and ends when the request is completely executed.

  • Execution duration when a single instance concurrently processes multiple requests

    If you use an on-demand instance to concurrently process multiple requests, the execution duration is billed from the time when the first request arrives at the instance to the time when the last request is completely executed. You can reuse resources to concurrently process multiple requests. This reduces resource costs.

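The two billing windows described above reduce to a simple calculation. The following Python sketch is illustrative only (the helper name and the timings are made up): given the arrival and completion times of the requests handled by one on-demand instance, the billed duration runs from the earliest arrival to the latest completion.

```python
def billed_duration(requests):
    """Billed execution duration (seconds) for one on-demand instance.

    `requests` is a list of (arrival_time, completion_time) pairs, in seconds.
    For a single request the window is its own arrival-to-completion time;
    for concurrent requests it runs from the first arrival to the last completion.
    """
    start = min(arrival for arrival, _ in requests)
    end = max(completion for _, completion in requests)
    return end - start

# One request handled alone: billed for its full 3-second execution.
print(billed_duration([(0.0, 3.0)]))                            # 3.0

# Three overlapping requests on one instance: billed once for the
# 0.0-4.5 s window instead of 3.0 + 3.5 + 3.0 = 9.5 s separately.
print(billed_duration([(0.0, 3.0), (0.5, 4.0), (1.5, 4.5)]))    # 4.5
```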

Provisioned mode

In provisioned mode, you can manually allocate, release, and manage function instances based on your business requirements. For more information, see Configure provisioned instances and auto scaling rules. In provisioned mode, the execution duration is billed from the time when Function Compute starts a provisioned instance to the time when you release the provisioned instance. You are charged for the instance until you release the instance, regardless of whether the provisioned instance executes requests.

