The LVM-Image-Face-Ratio Filter (DLC) component filters images whose face-to-image area ratio falls outside a specified range. Use it to remove images dominated by faces or images with no meaningful face content before training image generation models.
This component requires a GPU instance type. Select a GPU instance when configuring the resource group.
Supported computing resources
How it works
For each image, the component calculates the proportion of faces in the image. Images whose face ratio falls outside the configured minimum and maximum are filtered out. The remaining images are written to the output path you specify.
Adjust the minimum and maximum face ratio range based on your dataset and training objective.
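The range check described above can be sketched in a few lines of Python. This is only an illustration, not the component's implementation: the function names `face_ratio` and `keep_image`, the `(x, y, width, height)` box format, and the choice to use the largest detected face as the face ratio are all assumptions, since this page does not state how the ratio is aggregated when an image contains several faces. The default bounds mirror the documented defaults (0.0 and 0.4).

```python
from typing import Sequence, Tuple

# Hypothetical box format: (x, y, width, height) in pixels.
Box = Tuple[float, float, float, float]


def face_ratio(image_width: int, image_height: int, boxes: Sequence[Box]) -> float:
    """Area of the largest face box divided by the image area.

    Assumption: the largest face defines the ratio. Returns 0.0 when no
    face boxes are supplied.
    """
    image_area = image_width * image_height
    if image_area <= 0:
        raise ValueError("image dimensions must be positive")
    largest_face_area = max((w * h for _, _, w, h in boxes), default=0.0)
    return largest_face_area / image_area


def keep_image(ratio: float, min_ratio: float = 0.0, max_ratio: float = 0.4) -> bool:
    """Keep the image only if its face ratio lies within [min_ratio, max_ratio]."""
    return min_ratio <= ratio <= max_ratio


# A 1000x1000 image with one 500x500 face has a ratio of 0.25 and is kept.
small_face = face_ratio(1000, 1000, [(100, 100, 500, 500)])
# The same image with an 800x800 face exceeds the 0.4 maximum and is dropped.
large_face = face_ratio(1000, 1000, [(50, 50, 800, 800)])
print(keep_image(small_face), keep_image(large_face))
```

Under these assumptions, an image with no detected face has a ratio of 0.0 and passes the default minimum of 0.0. Raising the minimum above zero would be one way to drop such images.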
Inputs and outputs
Input ports
The component accepts the following inputs:
Read File Data component: reads the Object Storage Service (OSS) path where training data is stored.
Image Data OSS Path parameter: select either an OSS directory containing image files or an existing meta.jsonl metadata file. See the parameter description below.
Any image data preprocessing component: connect it as an upstream input.
Output port
The filtering results are written to the OSS directory specified by Output File OSS Path. See the parameter description below for output file details.
Configure the component
Configure the LVM-Image-Face-Ratio Filter (DLC) component in Machine Learning Designer. The following table describes all parameters.
| Tab | Parameter | Type | Required | Default | Description |
|---|---|---|---|---|---|
| Field Settings | Image Data OSS Path | String | No | — | OSS directory containing image data, or an existing meta.jsonl file. If no upstream component is connected on the first run, select the OSS directory manually. The component generates meta.jsonl in the parent directory of the specified path. On subsequent runs, select meta.jsonl directly. |
| Field Settings | Output File OSS Path | String | Yes | — | OSS directory where filtering results are stored. The output includes: {name}.jsonl (filtered output, named by Output Filename), {name}_stats.jsonl (statistics), and dj_run_yaml.yaml (algorithm run configuration). |
| Field Settings | Output Filename | String | Yes | result.jsonl | File name for the filtering output. |
| Parameter Settings | Minimum face Ratio | Float | Yes | 0.0 | Minimum face ratio. Images with a face ratio below this value are filtered out. |
| Parameter Settings | Maximum face Ratio | Float | Yes | 0.4 | Maximum face ratio. Images with a face ratio above this value are filtered out. |
| Execution Tuning | Number of Processes | Integer | Yes | 4 | Number of parallel processes. |
| Select Resource Group | Public Resource Group | — | No | — | GPU instance type and virtual private cloud (VPC) to use. Select a GPU instance type. |
| Select Resource Group | Dedicated resource group | — | No | — | Number of vCPUs, memory, shared memory, and GPUs to allocate. |
| Select Resource Group | Maximum Running Duration (seconds) | Integer | No | — | Maximum run time in seconds. The job is terminated if this limit is exceeded. |
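
After a run, one quick sanity check is to count how many records the filtered output contains. The sketch below assumes the output file has been downloaded locally and that each non-empty line holds one JSON record. It makes no assumption about the fields inside each record, which this page does not document. The helper name `count_jsonl_records` and the local file path are hypothetical.

```python
import json
from pathlib import Path
from typing import Union


def count_jsonl_records(path: Union[str, Path]) -> int:
    """Count the non-empty lines in a JSONL file, validating each as JSON."""
    count = 0
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            json.loads(line)  # raises ValueError if the line is not valid JSON
            count += 1
    return count


# Example (hypothetical local copy of the filtered output):
# kept = count_jsonl_records("downloads/result.jsonl")
# print(f"{kept} records kept")
```

Comparing this count with the number of input images gives a rough retention rate, which can help when tuning the minimum and maximum face ratio.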
Usage notes
First run without upstream component: If no upstream component provides the OSS path, set Image Data OSS Path to the OSS directory containing your images. The component creates meta.jsonl in the parent directory on the first run. Use this file as the input for subsequent runs instead of rescanning the directory.
Face ratio range: Set the minimum and maximum face ratio to control which images are retained. Images with a face ratio below the minimum or above the maximum are filtered out. Adjust the range based on your dataset and training objective.
GPU requirement: This component uses GPU-accelerated face detection. Always select a GPU instance type in the resource group configuration.
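To find the meta.jsonl file for subsequent runs, you can derive its expected location from the image directory used on the first run. The following is a minimal sketch based only on the statement that the file is created in the parent directory of the specified path. The helper name `expected_meta_path` and the bucket name in the example are hypothetical, and the code only manipulates path strings without contacting OSS.

```python
import posixpath


def expected_meta_path(image_dir: str) -> str:
    """Return the path where meta.jsonl is expected for a given image directory.

    Assumption: meta.jsonl is written to the parent directory of image_dir.
    """
    scheme, separator, rest = image_dir.partition("://")
    if not separator:
        # No URL scheme present, so treat the whole string as a plain path.
        rest = image_dir
    parent = posixpath.dirname(rest.rstrip("/"))
    meta = posixpath.join(parent, "meta.jsonl")
    return f"{scheme}://{meta}" if separator else meta


# Hypothetical bucket and directory names, for illustration only.
print(expected_meta_path("oss://my-bucket/data/images/"))
```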