You can use the video classification component to train a model on raw video data and use the resulting video classification model for inference. This topic describes how to configure the video classification component and provides a usage example.

Prerequisites

OSS is activated, and Machine Learning Studio is authorized to access OSS. For more information, see Activate OSS and Grant PAI the permissions to access OSS.

Limits

This component is available only in Machine Learning Designer.

Introduction

The video classification component provides mainstream three-dimensional convolutional neural networks (3D CNNs) for training video classification models. The supported networks are X3D-XS, X3D-M, and X3D-L.

You can find the video classification component in the Offline Training subfolder under the Video Algorithm folder of the component library.

Configure the component in the Machine Learning Platform for AI (PAI) console

  • Input ports (left-to-right)
    1. Training data
       Data type: Object Storage Service (OSS)
       Recommended upstream component: Read File Data
       Required: No. If no input port is used to pass the training data to the video classification component, you must set the oss path to train file parameter on the Fields Setting tab of the component. For more information, see the Component parameters table of this topic.
    2. Evaluation data
       Data type: OSS
       Recommended upstream component: Read File Data
       Required: No. If no input port is used to pass the evaluation data to the video classification component, you must set the oss path to evaluaton file parameter on the Fields Setting tab of the component. For more information, see the Component parameters table of this topic.
  • Component parameters
    Fields Setting tab:
    - oss path to save checkpoint (required): The OSS path in which the trained model is stored. Example: oss://pai-online-shanghai.oss-cn-shanghai-internal.aliyuncs.com/test/test_video_cls. No default value.
    - oss path to train file (optional): The OSS path in which the training data is stored. This parameter is required if no input port is used to configure the training data for the component. Example: oss://pai-vision-data-hz/EasyMM/DataSet/kinetics400/train_pai.txt. If you configure the training data by using both an input port and this parameter, the data from the input port takes precedence. No default value.
    - oss path to evaluaton file (optional): The OSS path in which the evaluation data is stored. This parameter is required if no input port is used to configure the evaluation data for the component. Example: oss://pai-vision-data-hz/EasyMM/DataSet/kinetics400/train_pai.txt. If you configure the evaluation data by using both an input port and this parameter, the data from the input port takes precedence. No default value.
    - oss path to pretrained model (optional): The OSS path in which a pre-trained model is stored. We recommend that you use a pre-trained model to improve training accuracy. No default value.
    Parameters Setting tab:
    - video classification network (required): The network that is used by the model. Valid values: x3d_xs, x3d_m, and x3d_l. Default value: x3d_xs.
    - numclasses (required): The number of categories. No default value.
    - learning rate (required): The initial learning rate. Default value: 0.1.
    - warmup start learning rate (required): The initial learning rate for warmup. Default value: 0.01.
    - number of train epochs (required): The total number of training epochs. Default value: 10.
    - warmup epoch (required): The number of warmup epochs. We recommend that you set the warmup start learning rate to a small value so that the learning rate gradually increases to the value of the learning rate parameter over the specified number of warmup epochs. This helps prevent the model gradient from exploding. For example, if you set the warmup epoch parameter to 35, the learning rate is gradually increased to the value of the learning rate parameter over the first 35 epochs. Default value: 35.
    - batch size (required): The size of a training batch, which is the number of data samples used in a single training iteration. Default value: 32.
    - model save interval (optional): The epoch interval at which a checkpoint is saved. A value of 1 indicates that a checkpoint is saved each time an epoch is complete. Default value: 1.
    Tuning tab:
    - single worker or distributed on dlc (optional): The mode in which the component is run. Valid values: single_dlc (a single worker on Deep Learning Containers (DLC)) and distribute_dlc (distributed workers on DLC). Default value: single_dlc.
    - gpu machine type (optional): The GPU specifications to be used. Default value: 8vCPU+60GB Mem+1xp100-ecs.gn5-c8g1.2xlarge.
    - number of worker (optional): The number of concurrent workers. Default value: 1.
  • Output port
    - Output model: The OSS path in which the output model is stored. The path is the same as the value that you specified for the oss path to save checkpoint parameter on the Fields Setting tab. The output model is stored in this path in the .pth format. No recommended downstream component.
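The relationship among the learning rate, warmup start learning rate, and warmup epoch parameters described above can be sketched as a warmup schedule. The following is a minimal illustration only; the linear ramp is an assumption for clarity, and the component's actual warmup curve may differ:

```python
def warmup_lr(epoch, warmup_epochs=35, warmup_start_lr=0.01, base_lr=0.1):
    """Ramp the learning rate from warmup_start_lr to base_lr over the
    first warmup_epochs epochs, then hold it at base_lr.

    A linear ramp is assumed here for illustration; the component may
    use a different warmup curve internally.
    """
    if epoch >= warmup_epochs:
        return base_lr
    # Fraction of the warmup phase completed at this epoch.
    progress = epoch / warmup_epochs
    return warmup_start_lr + progress * (base_lr - warmup_start_lr)

print(warmup_lr(0))   # starts at the warmup start learning rate (0.01)
print(warmup_lr(35))  # reaches the base learning rate (0.1)
```

With the default values, the learning rate starts at 0.01 and reaches 0.1 after 35 epochs, which keeps early gradient updates small.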

Compute engine

The video classification component supports only the DLC engine.

Example

The following figure shows a sample pipeline in which the video classification component is used. In this example, configure the components shown in the figure by performing the following steps:
  1. Use two Read File Data components as the upstream components of the video classification component to read video data files as the input training data and evaluation data for the video classification component. Specifically, set the OSS Data Path parameters of the two Read File Data components to the OSS paths of the video data files.
    The following figure shows the format of a video data file. Each line in the file specifies a video storage path and a category label, separated by a comma (,).
  2. Configure the training data and evaluation data as the input of the video classification component and set other parameters. For more information, see Configure the component in the Machine Learning Platform for AI (PAI) console.
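The labeled file format from step 1 (one "video path,category label" pair per line) can be generated and sanity-checked with a short script. This is a minimal sketch; the OSS paths, file name, and label values below are hypothetical placeholders:

```python
# Write a training file in the format the component expects:
# one "video_path,category_label" pair per line.
# The paths and labels here are hypothetical examples.
samples = [
    ("oss://my-bucket/videos/clip_0001.mp4", 0),
    ("oss://my-bucket/videos/clip_0002.mp4", 3),
]

with open("train_pai.txt", "w") as f:
    for path, label in samples:
        f.write(f"{path},{label}\n")

# Sanity-check: every line must split into a non-empty path
# and an integer label, separated by the final comma.
with open("train_pai.txt") as f:
    for line_no, line in enumerate(f, 1):
        path, label = line.rstrip("\n").rsplit(",", 1)
        assert path and label.isdigit(), f"bad line {line_no}: {line!r}"
```

Splitting on the final comma (rsplit) keeps the check robust even if a video path itself contains a comma.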