You can combine Function Compute, Object Storage Service (OSS), and FFmpeg to build an elastic, highly available audio and video processing system. This topic describes how to use Serverless Devs to deploy functions that process audio and video files and query information about them, such as the metadata and the duration of an audio or video stream. The examples in this topic use Python. You can modify the sample code to suit your business requirements.

Background information

FFmpeg is an open source suite of programs for recording, converting, and streaming audio and video. It is licensed under the GNU Lesser General Public License (LGPL) or the GNU General Public License (GPL). FFmpeg uses the libavcodec codec library to ensure high portability and encoding and decoding quality. For more information, see FFmpeg.
The following table describes the functions for FFmpeg-based audio and video processing.

Function          Description
GetMediaMeta      Queries the metadata of an audio or video file.
GetDuration       Queries the duration of an audio or video file.
GetSprites        Creates an image sprite for a video.
VideoWatermark    Adds watermarks or animated GIFs to a video.
AudioConvert      Converts the format of an audio or video file.
VideoGif          Converts a video file to a GIF file.

Prerequisites

Deploy an application by using Serverless Devs

  1. Run the following command to initialize a project:
    s init devsapp/ffmpeg-app -d ffmpeg-app

    -d: specifies the name of the generated directory.

  2. Run the following command to go to the project directory:
    cd ffmpeg-app
  3. Optional: Modify the sample code in the project directory based on your business requirements.
  4. Run the following command to deploy the project:
    s deploy -y
    Note: To deploy only one function in the project, such as the GetMediaMeta function that queries the metadata of an audio or video file, run the following command:
    s GetMediaMeta deploy

    To deploy a different function, replace GetMediaMeta with the name of that function.

    Sample output:
    [2021-11-25T17:35:56.524] [INFO ] [S-CLI] - Start ...
    [2021-11-25T17:35:56.529] [INFO ] [S-CLI] - It is detected that your project has the following projects < AudioConvert,GetMediaMeta,GetDuration,VideoGif,GetSprites,VideoWatermark > to be execute
    [2021-11-25T17:35:56.530] [INFO ] [S-CLI] - Start executing project AudioConvert
    [2021-11-25T17:35:57.725] [INFO ] [FC-DEPLOY] - Using region: cn-qingdao
    [2021-11-25T17:35:57.725] [INFO ] [FC-DEPLOY] - Using access alias: default
    [2021-11-25T17:35:57.726] [INFO ] [FC-DEPLOY] - Using accessKeyID: LTAI4G4cwJkK4Rza6xd9****
    [2021-11-25T17:35:57.726] [INFO ] [FC-DEPLOY] - Using accessKeySecret: eCc0GxSpzfq1DVspnqqd6nmYNN****
     Using fc deploy type: sdk, If you want to deploy with pulumi, you can [s cli fc-default set deploy-type pulumi] to switch.
     ......
    
    There is auto config in the service: FcOssFFmpeg
    
    ......
    AudioConvert:
      region:   cn-qingdao
      service:
        name: FcOssFFmpeg
      function:
        name:       AudioConvert
        runtime:    python3
        handler:    index.handler
        memorySize: 256
        timeout:    600
    GetMediaMeta:
      region:   cn-qingdao
      service:
        name: FcOssFFmpeg
      function:
        name:       GetMediaMeta
        runtime:    python3
        handler:    index.handler
        memorySize: 1024
        timeout:    600
    GetDuration:
      region:   cn-qingdao
      service:
        name: FcOssFFmpeg
      function:
        name:       GetDuration
        runtime:    python3
        handler:    index.handler
        memorySize: 256
        timeout:    600
    VideoGif:
      region:   cn-qingdao
      service:
        name: FcOssFFmpeg
      function:
        name:       VideoGif
        runtime:    python3
        handler:    index.handler
        memorySize: 512
        timeout:    600
    GetSprites:
      region:   cn-qingdao
      service:
        name: FcOssFFmpeg
      function:
        name:       GetSprites
        runtime:    python3
        handler:    index.handler
        memorySize: 512
        timeout:    600
    VideoWatermark:
      region:   cn-qingdao
      service:
        name: FcOssFFmpeg
      function:
        name:       VideoWatermark
        runtime:    python3
        handler:    index.handler
        memorySize: 256
        timeout:    600
  5. Debug the sample functions.

    Run the following command to debug the GetMediaMeta function:

    s GetMediaMeta invoke -e '{"bucket_name": "test-bucket","object_key": "a.mp4"}'

    Parameters:

    • bucket_name: the name of the OSS bucket that stores the audio or video file.
    • object_key: the name of the audio or video file whose metadata you want to query.

    Sample output:

    [2021-11-26T14:19:02.045] [INFO ] [S-CLI] - Start ...
    ========= FC invoke Logs begin =========
    FunctionCompute python3 runtime inited.
    FC Invoke Start RequestId: dda964e0-82b6-452a-b849-6b0b835f****
    2021-11-26T06:19:04.688Z dda964e0-82b6-452a-b849-6b0b835f**** [INFO] current Function [handler] excute time is 0.23 seconds
    FC Invoke End RequestId: dda964e0-82b6-452a-b849-6b0b835f****
    
    Duration: 1238.78 ms, Billed Duration: 1239 ms, Memory Size: 1024 MB, Max Memory Used: 118.39 MB
    ========= FC invoke Logs end =========
    
    FC Invoke Result:
    {
        "format": {
            "bit_rate": "17024829",
            "duration": "110.037333",
            "filename": "http://test-bucket.oss-cn-qingdao-internal.aliyuncs.com/a.mp4......",
            "format_long_name": "QuickTime / MOV",
            "format_name": "mov,mp4,m4a,3gp,3g2,mj2",
            "nb_programs": 0,
            "nb_streams": 2,
            "probe_score": 100,
            "size": "234170850",
            "start_time": "0.000000",
            "tags": {
                "compatible_brands": "mp42mp41",
                "creation_time": "2020-09-05T06:03:49.000000Z",
                "major_brand": "mp42",
                "minor_version": "0"
            }
        },
        "streams": [
            {
                "avg_frame_rate": "25/1",
                "bit_rate": "16708594",
                "bits_per_raw_sample": "8",
                "chroma_location": "left",
                "codec_long_name": "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10",
                "codec_name": "h264",
                "codec_tag": "0x31637661",
                "codec_tag_string": "avc1",
                "codec_time_base": "1/50",
                "codec_type": "video",
                "coded_height": 1088,
                "coded_width": 1920,
                "color_primaries": "bt709",
                "color_range": "tv",
                "color_space": "bt709",
                "color_transfer": "bt709",
                "disposition": {
                    "attached_pic": 0,
                    "clean_effects": 0,
                    "comment": 0,
                    "default": 1,
                    "dub": 0,
                    "forced": 0,
                    "hearing_impaired": 0,
                    "karaoke": 0,
                    "lyrics": 0,
                    "original": 0,
                    "timed_thumbnails": 0,
                    "visual_impaired": 0
                },
                "duration": "110.000000",
                "duration_ts": 2750000,
                "has_b_frames": 1,
                "height": 1080,
                "index": 0,
                "is_avc": "true",
                "level": 41,
                "nal_length_size": "4",
                "nb_frames": "2750",
                "pix_fmt": "yuv420p",
                "profile": "Main",
                "r_frame_rate": "25/1",
                "refs": 1,
                "start_pts": 0,
                "start_time": "0.000000",
                "tags": {
                    "creation_time": "2020-09-05T06:03:49.000000Z",
                    "encoder": "AVC Coding",
                    "handler_name": "\u001fMainconcept Video Media Handler",
                    "language": "eng"
                },
                "time_base": "1/25000",
                "width": 1920
            },
            {
                "avg_frame_rate": "0/0",
                "bit_rate": "317375",
                "bits_per_sample": 0,
                "channel_layout": "stereo",
                "channels": 2,
                "codec_long_name": "AAC (Advanced Audio Coding)",
                "codec_name": "aac",
                "codec_tag": "0x6134706d",
                "codec_tag_string": "mp4a",
                "codec_time_base": "1/48000",
                "codec_type": "audio",
                "disposition": {
                    "attached_pic": 0,
                    "clean_effects": 0,
                    "comment": 0,
                    "default": 1,
                    "dub": 0,
                    "forced": 0,
                    "hearing_impaired": 0,
                    "karaoke": 0,
                    "lyrics": 0,
                    "original": 0,
                    "timed_thumbnails": 0,
                    "visual_impaired": 0
                },
                "duration": "110.000000",
                "duration_ts": 5280000,
                "index": 1,
                "max_bit_rate": "417750",
                "nb_frames": "5158",
                "profile": "LC",
                "r_frame_rate": "0/0",
                "sample_fmt": "fltp",
                "sample_rate": "48000",
                "start_pts": 0,
                "start_time": "0.000000",
                "tags": {
                    "creation_time": "2020-09-05T06:03:49.000000Z",
                    "handler_name": "#Mainconcept MP4 Sound Media Handler",
                    "language": "eng"
                },
                "time_base": "1/48000"
            }
        ]
    }
    End of method: invoke
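
The result above is a JSON document in which many numeric values, such as duration and size, are returned as strings. The following is a minimal sketch of how a caller might pull a few fields out of such a result. The helper names parse_rate and summarize_media_meta and the shape of the summary are hypothetical and chosen only for illustration; the embedded sample is a trimmed copy of the result shown above.

```python
import json

# Trimmed copy of the sample result above; only the fields used below are kept.
SAMPLE_RESULT = """
{
  "format": {"duration": "110.037333", "size": "234170850"},
  "streams": [
    {"codec_type": "video", "codec_name": "h264", "width": 1920, "height": 1080,
     "avg_frame_rate": "25/1"},
    {"codec_type": "audio", "codec_name": "aac", "channels": 2,
     "sample_rate": "48000", "avg_frame_rate": "0/0"}
  ]
}
"""


def parse_rate(rate):
    """Convert a "num/den" string to a float, or None when the denominator is 0."""
    num, den = rate.split("/")
    return float(num) / float(den) if float(den) != 0 else None


def summarize_media_meta(meta):
    """Hypothetical helper: keep the first video and first audio stream only."""
    summary = {
        # These values arrive as strings, so convert them explicitly.
        "duration_seconds": float(meta["format"]["duration"]),
        "size_bytes": int(meta["format"]["size"]),
        "video": None,
        "audio": None,
    }
    for stream in meta.get("streams", []):
        kind = stream.get("codec_type")
        if kind == "video" and summary["video"] is None:
            summary["video"] = {
                "codec": stream["codec_name"],
                "width": stream["width"],
                "height": stream["height"],
                "fps": parse_rate(stream["avg_frame_rate"]),
            }
        elif kind == "audio" and summary["audio"] is None:
            summary["audio"] = {
                "codec": stream["codec_name"],
                "channels": stream["channels"],
                "sample_rate_hz": int(stream["sample_rate"]),
            }
    return summary


summary = summarize_media_meta(json.loads(SAMPLE_RESULT))
print(summary)
```

Audio streams report a frame rate of "0/0", which is why parse_rate guards against a zero denominator instead of dividing directly.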
    

    Run the following command to debug the GetDuration function:

    s GetDuration invoke -e '{"bucket_name": "bucket-name","object_key": "a.mp4"}'

    Parameters:

    • bucket_name: the name of the OSS bucket that stores the audio or video file.
    • object_key: the name of the audio or video file whose duration you want to query.
    Sample output:
    [2021-11-26T14:21:48.877] [INFO ] [S-CLI] - Start ...
    ========= FC invoke Logs begin =========
    FunctionCompute python3 runtime inited.
    FC Invoke Start RequestId: 6bb9ecae-7f53-4efb-afea-7614ef87****
    2021-11-26T06:21:50.273Z 6bb9ecae-7f53-4efb-afea-7614ef87**** [INFO] current Function [handler] excute time is 0.17 seconds
    FC Invoke End RequestId: 6bb9ecae-7f53-4efb-afea-7614ef87****
    
    Duration: 754.63 ms, Billed Duration: 755 ms, Memory Size: 256 MB, Max Memory Used: 61.21 MB
    ========= FC invoke Logs end =========
    
    FC Invoke Result:
    110.037333
    
    End of method: invoke
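
The source code of the GetDuration function is not shown in this topic. As a rough sketch of one way to obtain the same value, the following code asks ffprobe for the container duration only. It assumes that ffprobe is on the PATH and that the media URL is readable by ffprobe; the function names and the example URL are hypothetical.

```python
import subprocess


def build_duration_cmd(media_url):
    """Build an ffprobe command that prints only the container duration."""
    return [
        "ffprobe", "-v", "quiet",
        "-show_entries", "format=duration",
        # Print the bare value without the section wrapper or the key name.
        "-of", "default=noprint_wrappers=1:nokey=1",
        "-i", media_url,
    ]


def parse_duration(output):
    """Convert ffprobe's text output, such as "110.037333", to seconds."""
    return float(output.strip())


def get_duration(media_url):
    """Run ffprobe and return the duration in seconds (requires ffprobe)."""
    raw = subprocess.check_output(build_duration_cmd(media_url))
    return parse_duration(raw.decode("utf-8"))


# The command is only built here, not executed, so this runs without ffprobe.
print(build_duration_cmd("http://example.invalid/a.mp4"))
print(parse_duration("110.037333\n"))
```

Keeping command construction separate from execution makes the logic easy to check locally before it is packaged into a function.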

    Run the following command to debug the GetSprites function:

    s GetSprites invoke -e '{"bucket_name": "test-bucket","object_key": "aclear.mp4", "output_dir" : "output/", "tile": "3*4"}'

    Parameters:

    • bucket_name: the name of the OSS bucket that stores the video file.
    • object_key: the name of the video file for which you want to create an image sprite.
    • output_dir: the directory in the OSS bucket in which the image sprite is stored.
    • tile: the rows and columns of the image sprite.
    • start: optional. The specified start point of the video from which the image sprite is to be created. Default value: 0.
    • duration: optional. The duration of the video clip for creating the image sprite after the start point specified by the start parameter. For example, if you set the start parameter to 10 and the duration parameter to 20, the snapshots are captured from the 10th to 30th seconds of the video.
    • itsoffset: optional. The latency of the video stream that is displayed. Default value: 0. This parameter must be used together with the start and interval parameters. Examples:
      • If you set the start parameter to 0, the interval parameter to 10, and the itsoffset parameter to 0, the snapshots are captured at the 5th, 15th, and 25th seconds of the video.
      • If you set the start parameter to 0, the interval parameter to 10, and the itsoffset parameter to 1, the snapshots are captured at the 4th, 14th, and 24th seconds of the video.
      • If you set the start parameter to 0, the interval parameter to 10, and the itsoffset parameter to -1, the snapshots are captured at the 6th, 16th, and 26th seconds of the video.
      • If you set the start parameter to 0, the interval parameter to 10, and the itsoffset parameter to 4.999, the snapshots are captured at the 0th, 10th, and 20th seconds of the video.
       
      Note: If you set the itsoffset parameter to 5, the snapshot captured at the 0th second is lost. We recommend that you set this parameter to 4.999.
       
    • scale: optional. The size of the captured snapshot. Default value: -1:-1.
    • interval: optional. The interval at which snapshots are captured from the video. Unit: seconds. Default value: 1.
    • padding: optional. The distance between the snapshots. Default value: 0.
    • color: optional. The background color of the image sprite. By default, the background color is black.
    • dst_type: optional. The format of the image sprite. Default value: JPG. Valid values: JPG and PNG.
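
The itsoffset examples above are all consistent with one rule: the k-th snapshot is taken at start + interval / 2 - itsoffset + k × interval seconds. This rule is inferred from the listed examples and is not a statement about how the function is implemented. The sketch below reproduces the examples under that assumption; snapshot_times and tiles_per_sprite are hypothetical helper names.

```python
def snapshot_times(start, interval, itsoffset, count):
    """Return the first `count` capture times in seconds under the inferred rule."""
    first = start + interval / 2 - itsoffset
    # Round to milliseconds to hide floating-point noise such as 0.000999...
    return [round(first + k * interval, 3) for k in range(count)]


def tiles_per_sprite(tile):
    """Return how many snapshots fit in one sprite for a tile value such as "3*4"."""
    a, b = tile.split("*")
    return int(a) * int(b)


for offset in (0, 1, -1, 4.999):
    print(offset, snapshot_times(start=0, interval=10, itsoffset=offset, count=3))

print(tiles_per_sprite("3*4"))
```

With itsoffset set to 4.999, the computed times are one millisecond past the 0th, 10th, and 20th seconds, which matches the recommendation to use 4.999 rather than 5.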
    Sample output:
    [2021-11-26T16:07:42.585] [INFO ] [S-CLI] - Start ...
    ========= FC invoke Logs begin =========
    FunctionCompute python3 runtime inited.
    FC Invoke Start RequestId: 1b427831-e10f-4c2b-b780-9b504c29aa67
    2021-11-26T08:07:44.684Z 1b427831-e10f-4c2b-b780-9b504c29aa67 [INFO] b'{"bucket_name": "test-bucket","object_key": "a.mp4", "output_dir" : "output/", "dst_type":".wav"}'
    2021-11-26T08:07:51.642Z 1b427831-e10f-4c2b-b780-9b504c29aa67 [INFO] Uploaded /tmp/transcoded_a.wav to output/transcoded_a.wav
    2021-11-26T08:07:51.642Z 1b427831-e10f-4c2b-b780-9b504c29aa67 [INFO] current Function [handler] excute time is 6.96 seconds
    FC Invoke End RequestId: 1b427831-e10f-4c2b-b780-9b504c29aa67
    
    Duration: 7876.26 ms, Billed Duration: 7877 ms, Memory Size: 3072 MB, Max Memory Used: 119.06 MB
    ========= FC invoke Logs end =========
    
    FC Invoke Result:
    ok
    
    
    End of method: invoke

    Run the following command to debug the VideoWatermark function:

    s VideoWatermark invoke -e '{"bucket_name": "test-bucket","object_key": "a.mp4", "output_dir" : "output/", "vf_args" : "drawtext=fontfile=/usr/share/fonts/truetype/wqy/wqy-zenhei.ttc:text='\''hello Function Compute'\'':x=100:y=50:fontsize=24:fontcolor=red"}'

    Parameters:

    • bucket_name: the name of the OSS bucket that stores the video file.
    • object_key: the name of the video file to which you want to add watermarks or animated GIFs.
    • output_dir: the directory in the OSS bucket in which the processed video file is stored.
    • vf_args: the text or image watermarks to be added to the video.
    • filter_complex_args: the animated GIFs to be added to the video. If you set both this parameter and the vf_args parameter, the vf_args parameter is ignored.
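
The vf_args value contains single quotes and spaces, which are awkward to embed in a single-quoted shell argument. One way to avoid manual escaping is to build the event in Python and pass it as a separate argument. The following is a minimal sketch; build_invoke_command is a hypothetical helper, and only the s <function> invoke -e form used in this topic is assumed.

```python
import json
import shlex


def build_invoke_command(function_name, event):
    """Return an argument list for `s <function> invoke -e <json>`."""
    payload = json.dumps(event)
    return ["s", function_name, "invoke", "-e", payload]


event = {
    "bucket_name": "test-bucket",
    "object_key": "a.mp4",
    "output_dir": "output/",
    "vf_args": (
        "drawtext=fontfile=/usr/share/fonts/truetype/wqy/wqy-zenhei.ttc:"
        "text='hello Function Compute':x=100:y=50:fontsize=24:fontcolor=red"
    ),
}

cmd = build_invoke_command("VideoWatermark", event)

# Passing the list to subprocess.run(cmd) would bypass the shell entirely.
# shlex.join shows an equivalent, correctly quoted line for pasting into a shell.
print(shlex.join(cmd))
```

Because the payload is a single list element, the quotes inside vf_args reach the function unchanged.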
    Sample output:
    [2021-11-26T15:20:24.396] [INFO ] [S-CLI] - Start ...
    ========= FC invoke Logs begin =========
    static-master/target/lib ......
    ......
    FC Invoke End RequestId: 31ecddfa-4e41-44bb-9489-00708b07****
    
    Duration: 1302.08 ms, Billed Duration: 1303 ms, Memory Size: 256 MB, Max Memory Used: 256.00 MB
    ========= FC invoke Logs end =========
    
    FC Invoke Result:
    ok
    
    
    End of method: invoke

    Run the following command to debug the AudioConvert function:

    s AudioConvert invoke -e '{"bucket_name": "test-bucket","object_key": "a.mp4", "output_dir" : "output/", "dst_type":".wav", "ac":"1", "ar":"4000"}'

    Parameters:

    • bucket_name: the name of the OSS bucket that stores the audio file.
    • object_key: the name of the audio file whose format you want to convert.
    • output_dir: the directory in the OSS bucket in which the converted audio file is stored.
    • dst_type: the target format of the converted audio file, such as .wav.
    • ac: optional. The number of sound channels.
    • ar: optional. The audio sampling rate.
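
The ac and ar parameters share their names with FFmpeg's -ac (audio channels) and -ar (audio sampling rate) options. The sketch below shows how such an event could be turned into an FFmpeg command. It is an assumption about the general approach, not the code of the deployed function; build_convert_cmd, the example URL, and the /tmp/transcoded_ output naming are illustrative.

```python
import os


def build_convert_cmd(input_url, object_key, dst_type, ac=None, ar=None):
    """Return (command, local output path) for an audio format conversion."""
    stem = os.path.splitext(os.path.basename(object_key))[0]
    # Assumed naming scheme: write the converted file under /tmp.
    output_path = "/tmp/transcoded_" + stem + dst_type
    cmd = ["ffmpeg", "-y", "-i", input_url]
    if ac is not None:
        cmd += ["-ac", str(ac)]  # number of audio channels
    if ar is not None:
        cmd += ["-ar", str(ar)]  # audio sampling rate
    cmd.append(output_path)
    return cmd, output_path


cmd, output_path = build_convert_cmd(
    "http://example.invalid/a.mp4", "a.mp4", ".wav", ac="1", ar="4000"
)
print(" ".join(cmd))
```

Optional parameters are added only when they are set, so omitting ac and ar leaves the channel count and sampling rate to FFmpeg's defaults.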
    Sample output:
    [2021-11-26T16:04:16.293] [INFO ] [S-CLI] - Start ...
    ========= FC invoke Logs begin =========
    ......
    2021-11-26T08:04:18.520Z 2fc578cd-8787-4681-ab21-a3f6b4ab1e2a [ERROR] returncode:1
    ......
    Duration: 1156.09 ms, Billed Duration: 1157 ms, Memory Size: 256 MB, Max Memory Used: 88.23 MB
    ========= FC invoke Logs end =========
    
    FC Invoke Result:
    ok
    
    
    End of method: invoke

    Run the following command to debug the VideoGif function:

    s VideoGif invoke -e '{"bucket_name": "test-bucket","object_key": "a.mp4", "output_dir" : "output/", "vframes": "5", "start": "0",  "duration": "2"}'

    Parameters:

    • bucket_name: the name of the OSS bucket that stores the video file.
    • object_key: the name of the video file that you want to convert to a GIF file.
    • output_dir: the directory in the OSS bucket in which the generated GIF file is stored.
    • vframes: optional. The number of video frames to be converted to a GIF file, counted from the start point specified by the start parameter.
    • start: optional. The specified start point of the video from which the video is converted. Default value: 0.
    • duration: optional. The duration of the video to be converted to a GIF file after the start point specified by the start parameter.
     
    Note: If you set the duration and vframes parameters at the same time, the value of the duration parameter prevails. By default, if neither of the two parameters is set, the entire video is converted to a GIF file.
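
The precedence rule in the preceding note can be sketched as follows. This is an illustration under stated assumptions rather than the function's actual code: build_gif_cmd and the example URL are hypothetical, and the option order only mirrors the ffmpeg -y -ss ... -t ... -accurate_seek -i prefix that appears in the sample log for this function.

```python
def build_gif_cmd(input_url, output_path, start="0", duration=None, vframes=None):
    """Build an FFmpeg command where duration takes precedence over vframes."""
    cmd = ["ffmpeg", "-y", "-ss", str(start)]
    if duration is not None:
        # duration wins: limit the input to `duration` seconds after `start`.
        cmd += ["-t", str(duration)]
    cmd += ["-accurate_seek", "-i", input_url]
    if duration is None and vframes is not None:
        # vframes is honored only when duration is not set.
        cmd += ["-vframes", str(vframes)]
    # With neither parameter set, the whole video from `start` is converted.
    cmd.append(output_path)
    return cmd


both = build_gif_cmd("http://example.invalid/a.mp4", "/tmp/a.gif",
                     start="0", duration="2", vframes="5")
print(" ".join(both))
```

In this sketch, setting both parameters produces a command that contains -t but not -vframes, which is the behavior the note describes.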
     
    Sample output:
    [2021-11-26T15:27:26.647] [INFO ] [S-CLI] - Start ...
    ========= FC invoke Logs begin =========
    FunctionCompute python3 runtime inited.
    FC Invoke Start RequestId: a49fc8b4-ee8f-4e8b-9923-b8b41ced47cb
    2021-11-26T07:27:28.279Z a49fc8b4-ee8f-4e8b-9923-b8b41ced47cb [INFO] b'{"bucket_name": "test-bucket","object_key": "a.mp4", "output_dir" : "output/", "vframes": "5", "start": "0",  "duration": "2"}'
    2021-11-26T07:27:28.280Z a49fc8b4-ee8f-4e8b-9923-b8b41ced47cb [INFO] cmd = ffmpeg -y -ss 0 -t 2 -accurate_seek -i ......
    2021-11-26T07:27:30.150Z a49fc8b4-ee8f-4e8b-9923-b8b41ced47cb [INFO] Uploaded /tmp/a.gif to output/a.gif
    2021-11-26T07:27:30.151Z a49fc8b4-ee8f-4e8b-9923-b8b41ced47cb [INFO] current Function [handler] excute time is 1.87 seconds
    FC Invoke End RequestId: a49fc8b4-ee8f-4e8b-9923-b8b41ced47cb
    
    Duration: 2495.95 ms, Billed Duration: 2496 ms, Memory Size: 512 MB, Max Memory Used: 85.32 MB
    ========= FC invoke Logs end =========
    
    FC Invoke Result:
    ok
    
    End of method: invoke