Platform for AI: LVM-Text-Frame-Similarity Filter (DLC)

Last Updated: Feb 07, 2025

The LVM-Text-Frame-Similarity Filter (DLC) component of Platform for AI (PAI) filters out video data whose sampled frames have low similarity to the accompanying description text. Only MP4 videos can be processed.

Supported computing resources

Deep Learning Containers (DLC)

Algorithm

The LVM-Text-Frame-Similarity Filter (DLC) component calculates the similarity between frames sampled from a video and the description text in the training data. The description text is the content that follows the <__dj__video> marker in the training data. The component filters out video data whose similarity is low, which helps ensure the quality of the training data. In most cases, the component is used to prepare data for the subsequent training of video generation models.
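The filtering decision described above can be sketched as follows. This is a minimal illustration, not the component's implementation: it assumes that each sampled frame and the description text have already been encoded as embedding vectors, that similarity is measured as cosine similarity, and that per-frame scores are averaged. The function names `cosine_similarity` and `keep_sample` are hypothetical; the default thresholds match the defaults of the Minimum and Maximum Text-Frame Similarity Score parameters.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def keep_sample(frame_embeddings, text_embedding, min_score=0.1, max_score=1.0):
    """Return True if the mean frame-text similarity lies within [min_score, max_score].

    Averaging the per-frame scores is an assumption made for this sketch.
    """
    if not frame_embeddings:
        return False  # no frames could be sampled, so nothing to compare
    scores = [cosine_similarity(f, text_embedding) for f in frame_embeddings]
    mean_score = sum(scores) / len(scores)
    return min_score <= mean_score <= max_score


# Toy embeddings for three sampled frames and one description text.
frames = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
text = [1.0, 0.0]
print(keep_sample(frames, text))  # True for these toy values
```

Raising `min_score` makes the filter stricter: with the toy values above, a threshold of 0.9 would drop the sample because only one of the three frames matches the text closely.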

The input is a JSONL file. In each record, the <__dj__video> marker indicates the start of the description text and the <|__dj__eoc|> marker indicates the end of the description text.

  • The videos field is the OSS path of the video.

  • The text field is the description text.
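As an illustration of this record layout, the following sketch parses one JSONL line and extracts the description text between the two markers. The sample record, the bucket path, the `videos` key, and the helper `extract_description` are assumptions made for the example; the start marker used is the <__dj__video> marker named in the algorithm description.

```python
import json

START_TOKEN = "<__dj__video>"  # start marker of the description text
END_TOKEN = "<|__dj__eoc|>"    # end marker of the description text


def extract_description(text):
    """Return the text between the start and end markers, or None if no start marker exists."""
    start = text.find(START_TOKEN)
    if start == -1:
        return None
    start += len(START_TOKEN)
    end = text.find(END_TOKEN, start)
    if end == -1:
        end = len(text)  # tolerate a missing end marker
    return text[start:end].strip()


# Hypothetical JSONL line; the path and key names are placeholders.
line = (
    '{"videos": ["oss://example-bucket/videos/clip1.mp4"], '
    '"text": "<__dj__video> a dog runs on a beach <|__dj__eoc|>"}'
)
record = json.loads(line)
description = extract_description(record["text"])
print(description)
```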

Inputs and outputs

Input ports

  • The Read File Data component is used to read the OSS directory in which the training data is stored.

  • You can configure the Video Data OSS Path parameter to select the training data file.

For more information about the training data file, see Algorithm.

Output port

The filtering results. For more information, see the description of the Output File OSS Path parameter in the following section.

Configure the component

You can configure the parameters of the LVM-Text-Frame-Similarity Filter (DLC) component in Machine Learning Designer. The following table describes the parameters.

| Tab | Parameter | Required | Description | Default value |
| --- | --- | --- | --- | --- |
| Field Settings | Video Data OSS Path | No | The training data file. For more information, see Algorithm. | No default value |
| Field Settings | Output File OSS Path | Yes | The OSS directory in which the filtering results are stored. The results include the following files: {name}.jsonl, the output file (you can configure the Output Filename parameter to specify the output file); {name}_stats.jsonl, the statistics file; dj_run_yaml.yaml, the parameter configuration file used when the algorithm runs. | No default value |
| Field Settings | Output Filename | Yes | The file name of the filtering results. | result.jsonl |
| Parameter Settings | Minimum Text-Frame Similarity Score | Yes | The minimum text-frame similarity score. Data whose score is lower than this value is filtered out. | 0.1 |
| Parameter Settings | Maximum Text-Frame Similarity Score | Yes | The maximum text-frame similarity score. Data whose score is higher than this value is filtered out. | 1 |
| Parameter Settings | Number of Sampled Frames | Yes | The number of video frames that are sampled. The system samples frames at even intervals across the video duration for analysis. | 3 |
| Execution Tuning | Select Resource Group: Public Resource Group | No | The instance type (CPU or GPU) and virtual private cloud (VPC) that you want to use. You must select the GPU instance type for the algorithm. | No default value |
| Execution Tuning | Select Resource Group: Dedicated resource group | No | The number of vCPUs, memory, shared memory, and number of GPUs that you want to use. | No default value |
| Execution Tuning | Maximum Running Duration (seconds) | No | The maximum period of time for which the component can run. If the specified period of time is exceeded, the job is terminated. | No default value |
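
The Number of Sampled Frames parameter spreads the sampled frames evenly over the video duration. The following sketch shows one way such timestamps could be chosen, assuming that the midpoint of each equal-length segment is used. The component's exact sampling rule is not documented here, and `sample_timestamps` is a hypothetical helper.

```python
def sample_timestamps(duration_seconds, num_frames=3):
    """Return num_frames timestamps (in seconds) spread evenly over the video.

    Assumption for this sketch: the video is split into num_frames equal
    segments and the midpoint of each segment is sampled.
    """
    if duration_seconds <= 0 or num_frames <= 0:
        return []
    return [(i + 0.5) * duration_seconds / num_frames for i in range(num_frames)]


# A 12-second video with the default of 3 sampled frames.
print(sample_timestamps(12.0, 3))  # [2.0, 6.0, 10.0]
```

Sampling more frames gives a more representative similarity score for long videos, at the cost of more computation per video.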