ApsaraVideo VOD Workflow 2.0 integrates with the caption removal feature of Intelligent Media Services (IMS) to detect and remove captions from videos automatically. The feature uses intelligent inpainting to fill the removed area and restore clean frames, which makes it suitable for automated post-production. This topic describes how to use the caption removal feature in a workflow.
This feature is available in the following regions:
China (Shanghai), China (Beijing), Asia Pacific SE 1 (Singapore), and US (US West).
Prerequisites
To use the caption removal node in a workflow, you must activate ApsaraVideo VOD and Intelligent Media Services.
Benefits
A Premium Edition of the caption removal feature is available:
Premium Edition (Recommended): This edition provides seamless removal. It resolves issues such as mosaic and shadow artifacts to produce a more natural-looking video. You can enable this edition using the ModelId parameter. For more information, see The ModelId parameter.
The Premium Edition uses an advanced algorithm. However, it processes videos more slowly and costs more than the Basic Edition. You can select an edition based on your requirements.
Instructions
Step 1: Configure a caption removal workflow in the console
You can create workflows only in the ApsaraVideo VOD console.
Log on to the ApsaraVideo VOD console.
In the navigation pane on the left, choose Configuration Management > Media Processing Settings > Workflow Management.
Click Add Workflow Template and enter a workflow name.
In the workflow editor, click the + icon to the right of the Start node to add a caption removal node.

In the panel on the right, configure the following parameters:

Node Name: The custom name for the caption removal node.
Sample Material: A sample video used to configure caption parameters. This video is not processed by the workflow. Supported formats include MP4, WebM, MOV, and M3U8.
Detection Area: The default area is the bottom quarter of the video. You can also manually select a custom area.
Algorithm Edition: Supports Basic Edition and Premium Edition.
Premium Edition (Recommended): This edition supports seamless removal for multiple time ranges. It produces a more natural-looking video and effectively eliminates residual mosaics and shadows. You can enable the premium features by setting the ModelId parameter. For more information, see The ModelId parameter.
Basic Edition: This edition may leave residual artifacts, such as mosaics and shadows, after removal. The restoration quality is average.
Time Range: Set a custom start and end time.
Note: The start time and end time must be specified in pairs.
The start time cannot be later than the end time.
If either the start time or end time is left empty, the entry is considered invalid. An invalid entry defaults to the entire time range of the video.
The Premium Edition supports up to five time ranges. The Basic Edition supports only one.
After you complete the configuration, click OK. Then, submit the workflow template to generate a workflow ID. Record the ID so you can specify the workflow for later uploads.
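The time range rules in the note above can be checked before you enter values in the console. The following is a minimal sketch, not part of the product: the function name, the edition labels "premium" and "basic", and the use of seconds as the time unit are assumptions made for illustration. It also assumes that a single incomplete entry causes the whole video to be processed, which is one possible reading of the fallback rule.

```python
from typing import List, Optional, Tuple

# Limits taken from the note above. The edition labels are hypothetical.
MAX_RANGES = {"premium": 5, "basic": 1}

TimeRange = Tuple[Optional[float], Optional[float]]


def normalize_time_ranges(ranges: List[TimeRange], edition: str) -> List[Tuple[float, float]]:
    """Return the valid (start, end) pairs, assumed to be in seconds.

    An empty list means "process the entire video", mirroring the
    documented fallback for an invalid (incomplete) entry.
    """
    if edition not in MAX_RANGES:
        raise ValueError(f"unknown edition: {edition!r}")
    if len(ranges) > MAX_RANGES[edition]:
        raise ValueError(
            f"{edition} edition supports at most {MAX_RANGES[edition]} time range(s)"
        )

    valid: List[Tuple[float, float]] = []
    for start, end in ranges:
        if start is None or end is None:
            # Start and end must be given in pairs; an incomplete entry is
            # invalid, so fall back to the entire video (assumption).
            return []
        if start > end:
            raise ValueError("start time cannot be later than end time")
        valid.append((start, end))
    return valid
```

For example, `normalize_time_ranges([(0, 5), (10, 20)], "premium")` returns both ranges unchanged, while passing two ranges with `"basic"` raises `ValueError`.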

Step 2: Trigger the workflow
You can use the created workflow to process videos. You can trigger the workflow during or after a video upload.
Trigger a workflow in the console
Initiate during video upload
Log on to the ApsaraVideo VOD console.
In the navigation pane on the left, choose Media Library > Audio/Video and then click Upload Audio/Video.
On the Upload Audio/Video page, click Add Audio/Video. Select an upload method and storage address. Then, select Process with Workflow and specify the previously created workflow.

Initiate after video upload
Log on to the ApsaraVideo VOD console.
In the navigation pane on the left, choose Media Library > Audio/Video.
In the Actions column of the target audio or video, click Media Processing. Select Process with Workflow, and then select the workflow that you created in the previous step.

Trigger a workflow using OpenAPI
Initiate during video upload
The CreateUploadVideo operation is used only to obtain an upload URL and credential, and to create basic media asset information. It does not upload the file. You must implement the upload logic. For a complete example of how to upload a file using an API operation, see Upload media files using ApsaraVideo VOD API.
When you call the CreateUploadVideo or UploadMediaByURL operation to upload an audio or video file, set the WorkflowId parameter to the ID of the workflow that you created. After the upload is complete, ApsaraVideo VOD automatically processes the file according to the specified workflow.
Initiate after video upload
You can call the SubmitWorkflowJob operation. Set the WorkflowId parameter to the ID of the workflow that you created. This action immediately starts workflow processing for the audio or video file.
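The two OpenAPI trigger paths differ mainly in which operation carries the WorkflowId. The sketch below only assembles the business parameters for each call; it does not sign or send the request, which you would normally leave to an official SDK. The helper names are hypothetical, and the Title, FileName, and MediaId parameters are not described in this topic, so verify them against the API reference for each operation.

```python
from typing import Dict


def create_upload_video_params(title: str, file_name: str, workflow_id: str) -> Dict[str, str]:
    """Parameters for triggering the workflow during upload (hypothetical helper)."""
    return {
        "Action": "CreateUploadVideo",
        "Title": title,            # assumed parameter, check the API reference
        "FileName": file_name,     # assumed parameter, check the API reference
        "WorkflowId": workflow_id, # ID generated when the workflow template was submitted
    }


def submit_workflow_job_params(media_id: str, workflow_id: str) -> Dict[str, str]:
    """Parameters for triggering the workflow after upload (hypothetical helper)."""
    return {
        "Action": "SubmitWorkflowJob",
        "MediaId": media_id,       # assumed parameter: the already uploaded media asset
        "WorkflowId": workflow_id,
    }
```

Keeping parameter assembly separate from the transport makes it easy to log or unit-test the request before it is sent.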
Step 3: Query results
Query through the ApsaraVideo VOD console
Log on to the ApsaraVideo VOD console.
In the navigation pane on the left, choose Media Library > Audio/Video.
On the Audio/Video page, find the inpainted video that the workflow generated. You can filter the videos by source video name or creation time.
Query through the Intelligent Media Services console
Log on to the Intelligent Media Services console.
In the navigation pane on the left, choose VOD Media Processing > Task Management. On the Task Management page, click the Intelligent Inpainting tab.
On the Intelligent Inpainting tab, find the inpainting task that the workflow generated. You can filter the tasks by name or creation time.
When the task's status is Successful, click View to see the details:
Basic parameters and configuration of the inpainting task.
Input information for the inpainted video.
Output information and the output file for the inpainted video.
Query through workflow task callbacks
Configure HTTP or Message Service (MNS) callbacks.
When the workflow task is complete, the system triggers a WorkflowTaskFinished event and pushes the complete results through the configured HTTP or MNS callback. The key fields are described as follows:
Status: The overall status of the task, which can be Succeed or Failed.
ActivityResults: A JSON string that contains the execution details of each node. For the caption removal node, the Result field contains key information about the inpainting output, such as MediaId and JobId.
TaskInput: The original input media information, such as the media ID and filename.
By parsing ActivityResults from the callback message body, you can extract the MediaId of the inpainted video and use it for playback or distribution.
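The parsing step described above could be sketched as follows. The exact layout of ActivityResults is not shown in this topic, so the code assumes it decodes to either a list of node objects or a single object, each with a Result field. The function name and the sample payload in the usage note are hypothetical.

```python
import json
from typing import Any, Dict, List


def extract_node_results(callback_body: str) -> List[Dict[str, Any]]:
    """Return the MediaId and JobId reported by each node, or [] if the task failed."""
    message = json.loads(callback_body)
    if message.get("Status") != "Succeed":
        return []

    raw = message.get("ActivityResults", "[]")
    # ActivityResults is documented as a JSON string, so decode it a second time.
    activities = json.loads(raw) if isinstance(raw, str) else raw
    if isinstance(activities, dict):
        activities = [activities]  # assumption: tolerate a single-object layout

    results = []
    for activity in activities:
        result = activity.get("Result") or {}
        if isinstance(result, str):
            result = json.loads(result)  # assumption: Result may itself be encoded
        results.append({"MediaId": result.get("MediaId"), "JobId": result.get("JobId")})
    return results
```

With a hypothetical body whose ActivityResults decodes to `[{"Result": {"MediaId": "m-1", "JobId": "j-1"}}]`, the function returns one entry carrying both IDs. The JobId can then be passed to QueryIProductionJob.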
Query through OpenAPI
You can call the QueryIProductionJob operation to query the task result. Pass the JobId of the inpainting task. You can obtain the job ID from ActivityResults.Result.JobId in the workflow task. The operation returns the detailed status and output of the inpainting job.
JobId: The ID of the job that is returned when the workflow calls the SubmitIProductionJob operation to submit the inpainting task.
The following code provides a sample response:
{
"RequestId": "****20b48fb04483915d4f2cd8ac****",
"JobId": "****20b48fb04483915d4f2cd8ac****",
"FunctionName": "VideoDetext",
"Input": {
"Type": "OSS",
"Media": "oss://example-bucket/input.mp4"
},
"Output": {
"Type": "OSS",
"Media": "oss://example-bucket/output.mp4",
"OutputUrl": "http://example-bucket.oss-cn-shanghai.aliyuncs.com/output.mp4"
},
"Status": "Success",
"CreateTime": "2024-09-24T06:17:09Z",
"FinishTime": "2024-09-24T06:17:31Z",
"OutputFiles": ["output.mp4"],
"OutputUrls": ["http://example-bucket.oss-cn-shanghai.aliyuncs.com/output.mp4"],
"Result": {}
}
If the value of Status is Success, the caption inpainting is complete. You can access the processed video using the URL in OutputUrls.
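As an illustration, the check on Status and OutputUrls can be expressed in a few lines. This sketch assumes that the response has already been decoded into a dictionary with the fields shown in the sample; the function name is hypothetical, and no status value other than Success is interpreted.

```python
from typing import Any, Dict, List


def output_urls_if_done(response: Dict[str, Any]) -> List[str]:
    """Return the output URLs when the inpainting job has succeeded, else an empty list."""
    if response.get("Status") != "Success":
        return []
    return list(response.get("OutputUrls", []))


# Trimmed copy of the sample response above.
sample = {
    "FunctionName": "VideoDetext",
    "Status": "Success",
    "OutputUrls": ["http://example-bucket.oss-cn-shanghai.aliyuncs.com/output.mp4"],
}
urls = output_urls_if_done(sample)
```

An empty list means that the job has not succeeded yet, so the caller can query again later or inspect the job details.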