The image captioning algorithm is a model that integrates computer vision and natural language processing to generate natural language descriptions for input images. It has wide-ranging applications in assisting visually impaired individuals, social media content creation, image search, e-commerce displays, and news releases, significantly enhancing information accessibility and user experience.
Supported computing resources
This component must run on GPU resources. Select a GPU instance type when you configure the resource group.
Algorithm
The LVM-Image-Caption Mapper (DLC) component uses the Bootstrapping Language-Image Pre-training (BLIP) model to generate text descriptions for images.
Inputs and outputs
Input ports
The Read File Data component is used to read the Object Storage Service (OSS) path in which the training data is stored.
You can configure the Image Data OSS Path parameter to select the OSS directory where the image data is stored or select the image metadata file. For more information, see the parameter description in the following section.
Alternatively, you can use the output of any image data preprocessing component as the input.
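When you select an image metadata file instead of an OSS directory, the file is in the JSON Lines (JSONL) format: one JSON object per line. The exact schema the component expects is not documented in this section, so the `image` field name and the OSS paths below are assumptions for illustration; check the meta.jsonl file that the component generates for the actual schema. A minimal sketch:

```python
import json

def write_meta_jsonl(image_paths, out_path):
    """Write one JSON object per line (JSONL). The "image" field name
    is an assumption; verify it against a generated meta.jsonl file."""
    with open(out_path, "w", encoding="utf-8") as f:
        for p in image_paths:
            f.write(json.dumps({"image": p}, ensure_ascii=False) + "\n")

# Hypothetical OSS paths, for demonstration only.
paths = [
    "oss://my-bucket/images/cat.jpg",
    "oss://my-bucket/images/dog.jpg",
]
write_meta_jsonl(paths, "meta.jsonl")
```

Each line can then be parsed independently, which is what makes JSONL convenient for large image sets.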
Output port
The generated caption results. For more information, see the parameter description in the following section.
Configure the component
You can configure the parameters of the LVM-Image-Caption Mapper (DLC) component in Machine Learning Designer. The following table describes the parameters.
| Tab | Parameter | Required | Description | Default value |
| --- | --- | --- | --- | --- |
| Field Settings | Image Data OSS Path | No | If no upstream component exists the first time you run this component, you must manually select the OSS directory in which the image data is stored. When the component runs, the image metadata file meta.jsonl is generated in the parent directory of the directory specified by this parameter. When you use the component to process the image data later, you can directly select the meta.jsonl file. | No default value |
| Field Settings | Output File OSS Path | Yes | The OSS directory in which the results are stored. | No default value |
| Field Settings | Output Filename | Yes | The file name of the results. | result.jsonl |
| Parameter Settings | Number of Candidate Captions | Yes | The number of caption candidates to generate for each image. | 1 |
| Execution Tuning | Select Resource Group: Public Resource Group | No | The instance type (CPU or GPU) and virtual private cloud (VPC) that you want to use. You must select a GPU instance type for this algorithm. | No default value |
| Execution Tuning | Select Resource Group: Dedicated Resource Group | No | The number of vCPUs, the memory, the shared memory, and the number of GPUs that you want to use. | No default value |
| Execution Tuning | Maximum Running Duration (seconds) | No | The maximum period of time for which the component can run. If this period is exceeded, the job is terminated. | No default value |
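After the job succeeds, the output file (result.jsonl by default) can be post-processed like any JSONL file. The `image` and `text` field names below are assumptions for illustration, not the documented output schema; inspect one line of your actual output before relying on them. A minimal sketch, which simulates an output file with the assumed schema:

```python
import json

def load_captions(jsonl_path):
    """Map each image path to its caption value. The "image" and "text"
    field names are assumptions; adjust them to the actual output schema."""
    captions = {}
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            captions[record["image"]] = record["text"]
    return captions

# Simulated result.jsonl content with the assumed schema, for demonstration only.
sample = [
    {"image": "oss://my-bucket/images/cat.jpg", "text": "a cat on a sofa"},
    {"image": "oss://my-bucket/images/dog.jpg", "text": "a dog in the grass"},
]
with open("result.jsonl", "w", encoding="utf-8") as f:
    for rec in sample:
        f.write(json.dumps(rec) + "\n")

print(load_captions("result.jsonl")["oss://my-bucket/images/cat.jpg"])
# → a cat on a sofa
```

In practice you would first download result.jsonl from the configured Output File OSS Path, then run a loader like this locally.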