Function Compute (2.0): Invoke GPU functions based on asynchronous tasks

Last Updated: Dec 20, 2023

This topic describes how to use Serverless Devs to invoke a GPU function based on asynchronous tasks and pass the invocation results to the configured asynchronous destination functions.

Background

GPU-accelerated Instance

With the widespread adoption of machine learning, especially deep learning, CPUs can no longer meet the computing power required by large numbers of vector, matrix, and tensor operations. These requirements range from high-precision calculations in training scenarios to low-precision calculations in inference scenarios. In 2007, NVIDIA launched the Compute Unified Device Architecture (CUDA), a programmable general-purpose computing platform. Researchers and developers ported numerous algorithms to GPUs and improved performance by dozens or even thousands of times. Since machine learning became popular, GPUs have become part of the basic infrastructure for a wide range of tools, algorithms, and frameworks.

During Apsara Conference 2021, Alibaba Cloud Function Compute officially launched GPU-accelerated instances that use the Turing architecture. Serverless developers can use GPU hardware to accelerate AI training and inference tasks, which improves the efficiency of model training and inference services.

Asynchronous tasks

Function Compute provides full-stack capabilities to distribute, execute, and monitor asynchronous tasks. This allows you to focus on writing task processing logic: you only need to create and submit the task processing functions. Function Compute provides monitoring features such as asynchronous task logs, metrics, and per-phase duration statistics, as well as features such as auto scaling of instances, task deduplication, termination of specified tasks, and batch task suspension, resumption, and deletion. For more information, see Overview.

Scenarios

In non-real-time and offline AI inference scenarios, AI training scenarios, and audio and video production scenarios, GPU functions can be invoked based on asynchronous tasks. This allows developers to focus on business logic and quickly achieve business goals. The implementation provides the following capabilities:

  • GPU resources can be used in 1/8, 1/4, 1/2, or exclusive mode based on GPU virtualization. This way, GPU-accelerated instances can be configured in a fine-grained manner.

  • Various mature asynchronous task processing capabilities, such as asynchronous mode management, task deduplication, task monitoring, task retry, event triggering, result callback, and task orchestration, are provided.

  • Developers can focus on code development and business objectives without the need to perform O&M on GPU clusters, such as managing drivers and CUDA versions, machine operations, and faulty GPU cards.

How it works

This topic describes how to deploy a GPU function and implement result callbacks. In this example, the tgpu_basic_func GPU function is deployed, the async-callback-succ-func function is configured as the callback function for successful invocations, and the async-callback-fail-func function is configured as the callback function for failed invocations. The following table describes these functions.

| Function | Description | Runtime environment | Instance type | Trigger type |
| --- | --- | --- | --- | --- |
| tgpu_basic_func | A function that runs AI quasi-real-time tasks and AI offline tasks based on GPU-accelerated instances of Function Compute | Custom Container | GPU-accelerated instance | HTTP function |
| async-callback-succ-func | The destination callback function for successful task executions | Python 3 | Elastic instance | Event function |
| async-callback-fail-func | The destination callback function for failed task executions | Python 3 | Elastic instance | Event function |

The following figure describes the workflow.

(Figure: workflow for invoking the GPU function based on asynchronous tasks)

Before you begin

Make sure that the following requirements are met:

  • Serverless Devs is installed and configured with your Alibaba Cloud credentials.
  • Docker is installed so that the custom container image in Step 3 can be built and pushed.
  • A project and a Logstore are created in Simple Log Service to store function logs.
  • A Container Registry instance with a namespace and an image repository is created to host the image that is used in Step 3.

Step 1: Deploy the callback function for successful invocations

  1. Initialize a project.

    s init devsapp/start-fc-event-python3 -d async-succ-callback

    The following sample code shows the directory of the created project:

    ├── async-succ-callback
    │   ├── code
    │   │   └── index.py
    │   └── s.yaml
  2. Go to the directory where the project resides.

    cd async-succ-callback
  3. Modify the parameter configurations in the files in the project directory based on your business requirements.

    • Edit the s.yaml file. Example:

      edition: 1.0.0
      name: hello-world-app
      # access specifies the key information required by the current application.
      # For information about how to configure keys, visit https://www.serverless-devs.com/serverless-devs/command/config.
      # For more information about how to use keys, visit https://www.serverless-devs.com/serverless-devs/tool.
      access: "default"
      
      vars: # The global variable.
        region: "cn-shenzhen"
      
      services:
        helloworld: # The name of the service or module.
          component: fc
          props:
            region: ${vars.region}
            service:
              name: "async-callback-service"
              description: 'async callback service'
              # Obtain the logConfig configuration document from https://gitee.com/devsapp/fc/blob/main/docs/zh/yaml/service.md#logconfig.
              logConfig:
                project: tgpu-prj-sh             # The project that stores the request logs. You must create the project in Simple Log Service in advance. We recommend that you configure this item.
                logstore: tgpu-logstore-sh       # The Logstore that stores the request logs. You must create the Logstore in Simple Log Service in advance. We recommend that you configure this item.
                enableRequestMetrics: true
                enableInstanceMetrics: true
                logBeginRule: DefaultRegex
            function:
              name: "async-callback-succ-func"
              description: 'async callback succ func'
              runtime: python3
              codeUri: ./code
              handler: index.handler
              memorySize: 128
              timeout: 60
    • Edit the index.py file. Example:

      # -*- coding: utf-8 -*-
      import logging
      
      # To enable the initializer feature
      # please implement the initializer function as below:
      # def initializer(context):
      #   logger = logging.getLogger()
      #   logger.info('initializing')
      
      def handler(event, context):
        logger = logging.getLogger()
        logger.info('hello async callback succ')
        return 'hello async callback succ'
  4. Deploy the code to Function Compute.

    s deploy

    You can view the deployed function in the Function Compute console.

  5. Invoke and debug the function by using an on-premises machine.

    s invoke

    After the invocation is complete, hello async callback succ is returned.
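
    To see exactly what the destination receives, you can extend the handler in index.py to log the raw event. The following is a minimal sketch. It assumes that the destination event is delivered as a JSON document, which you can confirm in the Simple Log Service Logstore after a task completes:

      # -*- coding: utf-8 -*-
      import json
      import logging

      def handler(event, context):
          logger = logging.getLogger()
          # event arrives as bytes that contain the notification about the
          # original asynchronous invocation.
          body = json.loads(event)
          logger.info('async destination event: %s', json.dumps(body))
          return 'hello async callback succ'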

Step 2: Deploy the callback function for failed invocations

  1. Initialize a project.

    s init devsapp/start-fc-event-python3 -d async-fail-callback

    The following sample code shows the directory of the created project:

    ├── async-fail-callback
    │   ├── code
    │   │   └── index.py
    │   └── s.yaml
  2. Go to the directory where the project resides.

    cd async-fail-callback
  3. Modify the parameter configurations in the files in the project directory based on your business requirements.

    • Edit the s.yaml file. Example:

      edition: 1.0.0
      name: hello-world-app
      # access specifies the key information required by the current application.
      # For information about how to configure keys, visit https://www.serverless-devs.com/serverless-devs/command/config.
      # For more information about how to use keys, visit https://www.serverless-devs.com/serverless-devs/tool.
      access: "default"
      
      vars: # The global variable.
        region: "cn-shenzhen"
      
      services:
        helloworld: # The name of the service or module.
          component: fc
          props:
            region: ${vars.region}
            service:
              name: "async-callback-service"
              description: 'async callback service'
              # Obtain the logConfig configuration document from https://gitee.com/devsapp/fc/blob/main/docs/zh/yaml/service.md#logconfig.
              logConfig:
                project: tgpu-prj-sh             # The project that stores the request logs. You must create the project in Simple Log Service in advance. We recommend that you configure this item.
                logstore: tgpu-logstore-sh       # The Logstore that stores the request logs. You must create the Logstore in Simple Log Service in advance. We recommend that you configure this item.
                enableRequestMetrics: true
                enableInstanceMetrics: true
                logBeginRule: DefaultRegex
            function:
              name: "async-callback-fail-func"
              description: 'async callback fail func'
              runtime: python3
              codeUri: ./code
              handler: index.handler
              memorySize: 128
              timeout: 60
    • Edit the index.py file. Example:

      # -*- coding: utf-8 -*-
      import logging
      
      # To enable the initializer feature
      # please implement the initializer function as below:
      # def initializer(context):
      #   logger = logging.getLogger()
      #   logger.info('initializing')
      
      def handler(event, context):
        logger = logging.getLogger()
        logger.info('hello async callback fail')
        return 'hello async callback fail'
  4. Deploy the code to Function Compute.

    s deploy

    You can view the deployed function in the Function Compute console.

  5. Invoke and debug the function by using an on-premises machine.

    s invoke

    After the invocation is complete, hello async callback fail is returned.
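
    For the failure callback, it can be useful to log the destination event at the ERROR level so that failed tasks are easy to filter and alert on in the Logstore. A minimal sketch that mirrors the success callback:

      # -*- coding: utf-8 -*-
      import json
      import logging

      def handler(event, context):
          logger = logging.getLogger()
          # Log the failure notification at ERROR level so that it stands out
          # in Simple Log Service.
          logger.error('async task failed, destination event: %s', json.loads(event))
          return 'hello async callback fail'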

Step 3: Deploy a GPU function

  1. Create a project directory.

    mkdir fc-gpu-async-job && cd fc-gpu-async-job
  2. Create the files based on the following directory structure. Replace the parameter values with your actual configurations when you create the files.

    Directory structure:

    fc-gpu-async-job
    ├── code
    │   ├── app.py
    │   └── Dockerfile
    └── s.yaml
    • Edit the s.yaml file. Example:

      edition: 1.0.0
      name: gpu-container-demo
      # access specifies the key information required by the current application.
      # For information about how to configure keys, visit https://www.serverless-devs.com/serverless-devs/command/config.
      # For information about the order in which keys are used, visit https://www.serverless-devs.com/serverless-devs/tool.
      access: default
      vars:
        region: cn-shenzhen
      services:
        customContainer-demo:
          component: devsapp/fc
          props:
            region: ${vars.region}
            service:
              name: tgpu_basic_service
              internetAccess: true
              # Obtain the logConfig configuration document from https://gitee.com/devsapp/fc/blob/main/docs/zh/yaml/service.md#logconfig.
              logConfig:
                project: aliyun****          # The project that stores the request logs. You must create the project in Simple Log Service in advance. We recommend that you configure this item.
                logstore: func****     # The Logstore that stores the request logs. You must create the Logstore in Simple Log Service in advance. We recommend that you configure this item.
                enableRequestMetrics: true
                enableInstanceMetrics: true
                logBeginRule: DefaultRegex
            function:
              name: tgpu_basic_func
              description: test gpu basic
              handler: not-used
              timeout: 600
              caPort: 9000
              # You can select an appropriate GPU-accelerated instance type based on the actual GPU memory usage. The following example shows the 1/8 virtualized GPU specification:
              instanceType: fc.gpu.tesla.1
              gpuMemorySize: 2048
              cpu: 1
              memorySize: 4096
              diskSize: 512
              instanceConcurrency: 1
              runtime: custom-container
              customContainerConfig:
                # Specify the information about your image. You must create a Container Registry Personal Edition or Enterprise Edition instance in advance. You must also create a namespace and an image repository.
                image: registry.cn-shenzhen.aliyuncs.com/my****/my****
                # Enable image acceleration. This feature can optimize the cold start of gigabyte-level images.
                accelerationType: Default
              codeUri: ./code
              # Asynchronous mode configurations
              # For more information, see https://gitee.com/devsapp/fc/blob/main/docs/zh/yaml/function.md#asyncconfiguration.
              asyncConfiguration:
                destination:           
                  # Specify the Alibaba Cloud Resource Name (ARN) of the callback function for failed invocations.
                  onFailure: "acs:fc:cn-shenzhen:164901546557****:services/async-callback-service.LATEST/functions/async-callback-fail-func"
                  # Specify the ARN of the callback function for successful invocations.
                  onSuccess: "acs:fc:cn-shenzhen:164901546557****:services/async-callback-service.LATEST/functions/async-callback-succ-func"
                statefulInvocation: true
            triggers:
              - name: httpTrigger
                type: http
                config:
                  authType: anonymous
                  methods:
                    - GET
    • Edit the Dockerfile file. Example:

      # nvidia/cuda:11.0-base is Ubuntu-based and provides the CUDA runtime.
      FROM nvidia/cuda:11.0-base
      WORKDIR /usr/src/app
      # Install Python 3 to run the HTTP server.
      RUN apt-get update
      RUN apt-get install -y python3
      COPY . .
      # Listen on the port that matches caPort in s.yaml.
      EXPOSE 9000
      CMD [ "python3", "-u", "/usr/src/app/app.py" ]
    • Edit the app.py file. Example:

      # -*- coding: utf-8 -*-
      # python3
      from http.server import HTTPServer, BaseHTTPRequestHandler
      import json
      import os
      import time

      host = ('0.0.0.0', 9000)

      class Request(BaseHTTPRequestHandler):
          def do_GET(self):
              # Simulate a long-running task.
              print("simulate long execution scenario, sleep 10 seconds")
              time.sleep(10)

              # Query the GPU devices that are visible to the instance.
              print("show me GPU info")
              msg = os.popen("nvidia-smi -L").read()
              data = {'result': msg}
              self.send_response(200)
              self.send_header('Content-type', 'application/json')
              self.end_headers()
              self.wfile.write(json.dumps(data).encode())

      if __name__ == '__main__':
          server = HTTPServer(host, Request)
          print("Starting server, listen at: %s:%s" % host)
          server.serve_forever()
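
      Optionally, before you build the image, you can test app.py locally: run python3 code/app.py and send a request with a small client such as the following sketch. On a machine without a GPU, nvidia-smi is unavailable and the result field is empty.

        # A minimal local test client for app.py. It assumes that the server
        # listens on 127.0.0.1:9000.
        import json
        import urllib.request

        # The timeout must exceed the 10-second sleep in the handler.
        with urllib.request.urlopen('http://127.0.0.1:9000', timeout=30) as resp:
            print(json.loads(resp.read()))  # {'result': '<output of nvidia-smi -L>'}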
  3. Deploy the code to Function Compute.

    s deploy

    You can view the deployed GPU function and its asynchronous invocation configuration in the Function Compute console.

  4. Invoke and debug the function by using an on-premises machine.

    s invoke

    After the invocation is complete, a JSON response that contains the output of nvidia-smi -L is returned, for example, {"result": "GPU 0: Tesla T4 (UUID: GPU-****)"}.

  5. Submit the asynchronous task.

    1. View the preparation status of image acceleration for the GPU function.

      We recommend that you initiate an asynchronous task only after the status of image acceleration changes to Available. Otherwise, exceptions such as connection timeouts may occur.

    2. Log on to the Function Compute console. Find the GPU function tgpu_basic_func. On the Asynchronous Tasks tab, click Submit Task.

    After the execution is complete, the task status changes to Successful.

    Go to the async-callback-succ-func function that is configured as the callback for successful invocations. Choose Logs > Call Request List, and find the log entry of the asynchronous request to check whether the invocation was successful.
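
    You can also submit the task programmatically instead of clicking Submit Task in the console. The following is a minimal sketch that uses the Function Compute Python SDK (fc2). It assumes that the function can be submitted through the InvokeFunction API in the same way as the console; the endpoint and credentials are placeholders, and the custom task ID header is an assumption that you should verify against the asynchronous task documentation:

      import fc2  # pip install aliyun-fc2

      # Placeholders: replace the endpoint and credentials with your own values.
      client = fc2.Client(
          endpoint='https://164901546557****.cn-shenzhen.fc.aliyuncs.com',
          accessKeyID='<yourAccessKeyID>',
          accessKeySecret='<yourAccessKeySecret>')

      resp = client.invoke_function(
          'tgpu_basic_service', 'tgpu_basic_func',
          headers={
              # Submit the request as an asynchronous invocation.
              'x-fc-invocation-type': 'Async',
              # Assumption: a custom task ID that enables deduplication of
              # repeated submissions.
              'x-fc-stateful-async-invocation-id': 'gpu-task-0001',
          })
      print(resp.headers)  # The headers include the request ID of the queued task.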

Additional information

For more information about the best practices of GPU functions, see Use cases for serverless GPU applications.