The LVM-Image-Text-Matching Filter (DLC) component filters out image data whose text-image matching score is too low.
This component requires a GPU instance type. When you configure the resource group, make sure you select a GPU instance.
Supported computing resources
How the algorithm works
The LVM-Image-Text-Matching Filter (DLC) component compares each image with its description text in the training data and calculates a text-image matching score for the image based on blip-itm-base-coco. The component then filters out the data of images whose score is too low, which helps ensure the quality of the image data. The description text is the content that follows the <__dj__image> field in the training data file. In most cases, the component is used to prepare data for the subsequent training of image generation models.
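The scoring step can be sketched as follows. This is a minimal illustration, not the component's actual implementation: it assumes the publicly available Salesforce/blip-itm-base-coco checkpoint loaded through the Hugging Face transformers library, a local image file instead of an OSS path, and that the matching score is the softmax probability of the "match" class. The function names `match_probability` and `matching_score` are hypothetical.

```python
import math


def match_probability(no_match_logit: float, match_logit: float) -> float:
    """Softmax over the two image-text matching logits.

    Returns the probability of the 'match' class, a value between 0 and 1.
    """
    # Subtract the larger logit first for numerical stability.
    m = max(no_match_logit, match_logit)
    e_no = math.exp(no_match_logit - m)
    e_yes = math.exp(match_logit - m)
    return e_yes / (e_no + e_yes)


def matching_score(image_path: str, text: str) -> float:
    """Score one local image against its description text (assumed usage)."""
    # Heavy dependencies are imported lazily so that match_probability
    # stays usable without them.
    import torch
    from PIL import Image
    from transformers import BlipForImageTextRetrieval, BlipProcessor

    name = "Salesforce/blip-itm-base-coco"  # assumed checkpoint name
    processor = BlipProcessor.from_pretrained(name)
    model = BlipForImageTextRetrieval.from_pretrained(name)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, text=text, return_tensors="pt")
    with torch.no_grad():
        # itm_score has shape [batch, 2]: index 0 is no-match, index 1 is match.
        logits = model(**inputs).itm_score[0]
    return match_probability(float(logits[0]), float(logits[1]))
```

In practice the model and processor would be loaded once and reused for every sample. They are loaded inside the function here only to keep the sketch short.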
Input data format
The input is a JSONL file. Each line contains a JSON object with the following fields:
images: The OSS path of the image.
text: The description text. The <__dj__image> field marks the start of the description text, and the <|__dj__eoc|> field marks the end.
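To make the format concrete, the following sketch builds one hypothetical input line and extracts the description text between the two markers. The bucket name, image path, and caption are invented for illustration, and `extract_description` is a hypothetical helper, not part of the component. The sketch stores `images` as a single string because the field is described as a path. Check your own data if it uses a list of paths instead.

```python
import json

IMAGE_TOKEN = "<__dj__image>"  # marks the start of the description text
EOC_TOKEN = "<|__dj__eoc|>"    # marks the end of the description text


def extract_description(text: str) -> str:
    """Return the text between the start and end markers, stripped."""
    start = text.find(IMAGE_TOKEN)
    if start == -1:
        raise ValueError("missing <__dj__image> marker")
    start += len(IMAGE_TOKEN)
    end = text.find(EOC_TOKEN, start)
    if end == -1:
        raise ValueError("missing <|__dj__eoc|> marker")
    return text[start:end].strip()


# One hypothetical line of the input JSONL file.
sample_line = json.dumps({
    "images": "oss://example-bucket/data/cat.jpg",
    "text": "<__dj__image> a cat sleeping on a sofa <|__dj__eoc|>",
})

record = json.loads(sample_line)
description = extract_description(record["text"])
print(record["images"], "->", description)
```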

Inputs and outputs
Input ports
The Read File Data component reads the OSS path where the training data is stored.
You can configure the Image Data OSS Path parameter to select the training data file.
For more information about the training data file, see How the algorithm works.
Output port
The component produces the filtering results. For details on the output files, see the Output File OSS Path parameter in Field Settings.
Configure the component
You can configure the parameters of the LVM-Image-Text-Matching Filter (DLC) component in Machine Learning Designer. The parameters are grouped into the following tabs.
Field Settings
| Parameter | Required | Description | Default value |
|---|---|---|---|
| Image Data OSS Path | No | The training data file. For more information, see How the algorithm works. | No default value |
| Output File OSS Path | Yes | The OSS directory where the filtering results are stored. The directory contains the following output files: | No default value |
| Output Filename | Yes | The file name for the filtering results. | result.jsonl |
Parameter Settings
| Parameter | Required | Description | Default value |
|---|---|---|---|
| Minimum Text-Frame Matching Score | Yes | The minimum text-image matching score. Images with scores below this threshold are filtered out. | 0.1 |
| Maximum Text-Frame Matching Score | Yes | The maximum text-image matching score. In most cases, set this parameter to 1. | 1 |
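The effect of the two thresholds can be sketched as a simple range check. This is an illustration under assumptions: the defaults 0.1 and 1 come from the table above, but treating both bounds as inclusive is an assumption, and `keep_sample` and the sample scores are hypothetical.

```python
def keep_sample(score: float, min_score: float = 0.1, max_score: float = 1.0) -> bool:
    """Keep a sample only if its matching score lies within the range.

    Inclusive bounds are an assumption made for this sketch.
    """
    return min_score <= score <= max_score


# Hypothetical samples with precomputed text-image matching scores.
scored = [
    {"images": "oss://example-bucket/data/a.jpg", "score": 0.03},
    {"images": "oss://example-bucket/data/b.jpg", "score": 0.42},
    {"images": "oss://example-bucket/data/c.jpg", "score": 0.97},
]

kept = [sample for sample in scored if keep_sample(sample["score"])]
for sample in kept:
    print(sample["images"], sample["score"])
```

With the default thresholds, only the first sample is dropped because its score is below 0.1. Raising the minimum score makes the filter stricter and reduces the amount of data that is retained.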
Execution Tuning
| Parameter | Required | Description | Default value |
|---|---|---|---|
| Select Resource Group - Public Resource Group | No | The instance type (CPU or GPU) and virtual private cloud (VPC) to use. You must select a GPU instance type for this algorithm. | No default value |
| Select Resource Group - Dedicated resource group | No | The number of CPU cores, memory, shared memory, and GPUs to use. | No default value |
| Maximum Running Duration (seconds) | No | The maximum time the component can run. If the component exceeds this duration, the job is terminated. | No default value |