
Intelligent Media Services:SubmitIProductionJob

Last Updated: Apr 01, 2026

Use the SubmitIProductionJob API to submit an intelligent production job.


RAM authorization

The table below describes the authorization required to call this API. You can define it in a Resource Access Management (RAM) policy. The table's columns are detailed below:

  • Action: The value to use in the Action element of a RAM policy statement to grant permission to perform this operation.

  • API: The API that you can call to perform the action.

  • Access level: The predefined level of access granted for each API. Valid values: create, list, get, update, and delete.

  • Resource type: The type of the resource that supports authorization to perform the action. It indicates if the action supports resource-level permission. The specified resource must be compatible with the action. Otherwise, the policy will be ineffective.

    • For APIs with resource-level permissions, required resource types are marked with an asterisk (*). Specify the corresponding Alibaba Cloud Resource Name (ARN) in the Resource element of the policy.

    • For APIs without resource-level permissions, it is shown as All Resources. Use an asterisk (*) in the Resource element of the policy.

  • Condition key: The condition keys defined by the service. The key allows for granular control, applying to either actions alone or actions associated with specific resources. In addition to service-specific condition keys, Alibaba Cloud provides a set of common condition keys applicable across all RAM-supported services.

  • Dependent action: The dependent actions required to run the action. To complete the action, the RAM user or the RAM role must have the permissions to perform all dependent actions.

Action: ice:SubmitIProductionJob
Access level: create
Resource type: All Resources (specify * in the Resource element)
Condition key: None
Dependent action: None

Request parameters

Name (string, optional)

The job name, up to 100 characters long. Example: Test task

FunctionName (string, required)

The algorithm to use. Valid values:

  • Cover: Smart cover generation.
  • VideoClip: Video summarization.
  • VideoDelogo: Video logo removal.
  • VideoDetext: Video subtitle removal.
  • CaptionExtraction: Subtitle extraction.
  • HybridCaptionExtraction: Multimodal subtitle extraction.
  • VideoGreenScreenMatting: Green screen matting for videos.
  • FaceBeauty: Video face beautification.
  • VideoH2V: Horizontal-to-vertical video conversion.
  • MusicSegmentDetect: Music chorus detection.
  • AudioBeatDetection: Beat detection.
  • AudioQualityAssessment: Audio quality assessment.
  • SpeechDenoise: Speech denoising.
  • AudioMixing: Audio mixing.
  • MusicDemix: Music source separation.

Example: Cover

Input (object, required)

The input media asset, specified by an OSS URL or a media asset ID. Input file requirements vary by algorithm. For details, see the subsequent sections of this topic.

Input.Type (string, required)

The input type. Valid values:

  • OSS: The input is an OSS URL.
  • Media: The input is a media asset ID.

Example: OSS

Input.Media (string, required)

The OSS URL or media asset ID of the input media asset. Use one of the following OSS URL formats:

  1. oss://bucket/object
  2. http(s)://bucket.oss-[regionId].aliyuncs.com/object

In these formats, bucket is the name of an OSS bucket in the same region as your project, and object is the file path.

Example: oss://bucket/object
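The two URL forms are mechanically related. The sketch below illustrates rewriting the oss:// form into the HTTPS form; the bucket name, object path, and region ID (cn-shanghai) are assumed placeholders, not values from this API:

```python
# Sketch: rewrite the oss://bucket/object form into the HTTPS form.
# The region ID ("cn-shanghai") and names below are illustrative placeholders.
def oss_to_https(oss_url: str, region_id: str) -> str:
    bucket, _, obj = oss_url.removeprefix("oss://").partition("/")
    return f"https://{bucket}.oss-{region_id}.aliyuncs.com/{obj}"

print(oss_to_https("oss://example-bucket/videos/input.mp4", "cn-shanghai"))
# https://example-bucket.oss-cn-shanghai.aliyuncs.com/videos/input.mp4
```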

Output (object, required)

The output media asset, specified by an OSS URL or a media asset ID. The output files vary by algorithm. For details, see the subsequent sections of this topic.

Output.Type (string, required)

The output type. Valid values:

  • OSS: The output is an OSS URL.
  • Media: The output is a media asset ID.

Example: OSS

Output.Biz (string, optional)

The service that owns the output media asset. Valid values: IMS and VOD. Example: IMS

Output.Media (string, required)

The output media asset. This value is an OSS URL if Type is OSS, or a media asset ID if Type is Media.

Use one of the following OSS URL formats:

  1. oss://bucket/object
  2. http(s)://bucket.oss-[RegionId].aliyuncs.com/object

In these formats, bucket is the name of an OSS bucket in the same region as your project, and object is the file path.

Media asset ID:

  • You can specify an existing media asset ID. In this case, you do not need to specify Biz because the service is inherited from the source asset.
  • You can create a new media asset by leaving this parameter empty. The Biz parameter then determines whether the asset is written to IMS or VOD. If Biz is not specified, the service is inherited from the source asset or defaults to IMS.

Note: The OSS path supports placeholders. Example: oss://example-****/iproduction/{source}-{timestamp}-{sequenceId}.png. The following placeholders are supported:

  • {source}: The input file name.
  • {timestamp}: The Unix timestamp.
  • {sequenceId}: The sequence number for the output.
  • {resultType}: The output file type, determined by the server.

Placeholders are optional. However, for algorithms that produce multiple output files, such as Cover, we recommend including the {sequenceId} placeholder so that each output file has a unique path.

Example: oss://bucket/object
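The IMS server performs the placeholder substitution itself; the sketch below only mimics the naming scheme to show how {sequenceId} gives each of several output files a unique path (the bucket and file names are assumed placeholders):

```python
import time

# Illustration only: the IMS server expands these placeholders itself.
# This sketch mimics the naming scheme to show how {sequenceId} keeps the
# outputs of multi-file algorithms (e.g. Cover's three images) from colliding.
def expand_output_path(template: str, source: str, sequence_id: int) -> str:
    return (template
            .replace("{source}", source)
            .replace("{timestamp}", str(int(time.time())))
            .replace("{sequenceId}", str(sequence_id)))

template = "oss://example-bucket/iproduction/{source}-{sequenceId}.png"
for i in range(3):
    print(expand_output_path(template, "input", i))
# oss://example-bucket/iproduction/input-0.png
# oss://example-bucket/iproduction/input-1.png
# oss://example-bucket/iproduction/input-2.png
```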

Output.OutputUrl (string, optional)

The OSS URL of the output file when Type is Media. The bucket must be registered with IMS or VOD. Example: http(s)://bucket.oss-[RegionId].aliyuncs.com/object

TemplateId (string, optional)

The template ID. Example: ****20b48fb04483915d4f2cd8ac****

JobParams (string, optional)

The algorithm job parameters, provided as a JSON string. Required parameters vary by algorithm. For details, see the subsequent sections of this topic. Example: {"Model":"gif"}
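Note that JobParams is a JSON string, not a nested object. A minimal sketch of building it for the Cover algorithm:

```python
import json

# JobParams is a JSON string, not a nested object. For the Cover algorithm,
# setting Model to "gif" requests animated covers instead of static PNGs.
job_params = json.dumps({"Model": "gif"})
print(job_params)  # {"Model": "gif"}
```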

ScheduleConfig (object, optional)

The job scheduling configuration.

ScheduleConfig.PipelineId (string, optional)

The pipeline ID. Example: 5246b8d12a62433ab77845074039c3dc

ScheduleConfig.Priority (integer, optional)

The job priority. Valid values: 1 to 10. A smaller value indicates a higher priority. Example: 6

UserData (string, optional)

Custom user data. This data is returned in the response without modification. The value can be up to 256 characters in length. Example: {"test":1}

ModelId (string, optional)

The algorithm model ID. If this parameter is empty, the system uses the default model for the algorithm. You can usually leave this parameter empty. Non-default models are available for the following algorithms:

  • VideoDetext
    • ModelId = algo-video-detext-new: A subtitle removal model that provides better results but is slower and more expensive than the default model.
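Putting the parameters together, a smart-cover request might carry the parameter map below. This is a sketch only: the bucket, object paths, and job name are assumed placeholders, and the actual call through an IMS SDK or a signed HTTP request is omitted:

```python
import json

# Sketch of a full parameter map for a smart-cover job. The bucket, object
# paths, and job name are illustrative; the actual call through an IMS SDK
# or a signed HTTP request is omitted.
request = {
    "Name": "cover-demo",
    "FunctionName": "Cover",
    "Input": {"Type": "OSS", "Media": "oss://example-bucket/videos/input.mp4"},
    # {sequenceId} keeps the three default cover images on distinct paths.
    "Output": {"Type": "OSS",
               "Media": "oss://example-bucket/covers/{source}-{sequenceId}.gif"},
    "JobParams": json.dumps({"Model": "gif"}),  # animated (GIF) covers
    "ScheduleConfig": {"Priority": 6},
    "UserData": json.dumps({"test": 1}),
}
print(json.dumps(request, indent=2))
```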

Input and output fields

Cover

Input: a video file. Output: multiple images (three by default). Use placeholders in the output path to give each image a unique name. The output format is PNG for static images or GIF for animated images, depending on the settings in JobParams.

VideoDelogo

Input: a video file. Output: a video in MP4 format with the logo removed.

VideoDetext

Input: a video file. Output: a video in MP4 format with captions removed.

CaptionExtraction

Input: a video file. Output: a caption file in SRT format.

HybridCaptionExtraction

Input: a video file. Output: a caption file in SRT format.

VideoGreenScreenMatting

Input: a video file. Output: a video with the green screen background removed. The format is MP4 or WebM, depending on the settings in JobParams.

FaceBeauty

Input: a video file. Output: a beautified video in MP4 format.

VideoH2V

Input: a video file. Output: a video in MP4 format converted from a horizontal to a vertical aspect ratio.

MusicSegmentDetect

Input: an audio file. Output: a JSON file containing the chorus detection results.

AudioBeatDetection

Input: an audio file. Output: a JSON file containing the beat detection results.

AudioQualityAssessment

Input: an audio file. No output file is generated. The audio quality assessment results are returned directly in the response of the QueryIProductionJob operation.

SpeechDenoise

Input: an audio file. Output: a noise-reduced audio file in WAV format.

AudioMixing

Input: an audio file. Output: a mixed audio file in WAV format. For details on how to specify additional audio files for mixing, see the JobParams parameters below.

MusicDemix

Input: an audio file (typically a song). Output: two audio files resulting from source separation. You must include the {resultType} placeholder in the output path to distinguish between the vocals and the accompaniment.

JobParams JSON fields

Cover

  • Model: string. The model for the smart cover. If this parameter is left empty, a static image is generated. If set to gif, an animated image is generated.

VideoDelogo

  • LogoModel: string. The type of station logo to remove. Valid values are tv (for television station logos) and internet (for online media logos). You can specify multiple values, separated by commas.

  • Boxes: string. The bounding boxes for the target logos. The coordinates are normalized values relative to the top-left corner of the video, in the format [xmin, ymin, width, height]. Supports up to two bounding boxes. Example: "[[0, 0, 0.3, 0.3], [0.7, 0, 0.3, 0.3]]".
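Because the bounding boxes are normalized, pixel-space rectangles must be divided by the video dimensions. A sketch, assuming a 1920x1080 video and illustrative logo positions:

```python
import json

# Sketch: convert pixel-space logo rectangles into the normalized
# [xmin, ymin, width, height] form expected by Boxes. The video size and
# rectangles below are illustrative values.
def normalize_box(x, y, w, h, video_w, video_h):
    return [round(x / video_w, 3), round(y / video_h, 3),
            round(w / video_w, 3), round(h / video_h, 3)]

video_w, video_h = 1920, 1080
boxes = [normalize_box(0, 0, 576, 324, video_w, video_h),     # top-left logo
         normalize_box(1344, 0, 576, 324, video_w, video_h)]  # top-right logo
print(json.dumps(boxes))  # [[0.0, 0.0, 0.3, 0.3], [0.7, 0.0, 0.3, 0.3]]
```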

VideoDetext

  • LimitRegion: list. Specifies the region(s) for caption detection. The coordinates are normalized values relative to the top-left corner, specified as [xmin, ymin, width, height]. You can specify multiple detection regions. Example: [[0, 0, 0.3, 0.3], [0.7, 0, 0.3, 0.3]]. Note: If this parameter is not set, the default detection region is the bottom 30% of the video.

  • Time: list. The time range for caption removal, specified in seconds as [start_time, end_time]. For example, [5, 20] removes captions between the 5-second and 20-second marks of the video.
    • The Time parameter can be a one-dimensional array, such as [5, 20], to specify a single time range.

    • The Time parameter can also be a two-dimensional array, such as [[5, 20], [25, 43], [51, 80]], to specify multiple time ranges (supported only when ModelId is set to algo-video-detext-new).
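A small sketch of accepting either Time form on the client side and normalizing it to a list of ranges (remember that multiple ranges require ModelId to be algo-video-detext-new):

```python
# Sketch: accept either Time form and normalize it to a list of ranges.
def normalize_time(time_param):
    # [5, 20] -> [[5, 20]]; [[5, 20], [25, 43]] passes through unchanged.
    if time_param and not isinstance(time_param[0], list):
        return [time_param]
    return time_param

print(normalize_time([5, 20]))              # [[5, 20]]
print(normalize_time([[5, 20], [25, 43]]))  # [[5, 20], [25, 43]]
```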

CaptionExtraction

  • fps: integer (Optional). The sampling frame rate. Range: [2, 10]. Default: 5.

  • roi: list. The region of interest (ROI) for caption extraction. Only captions within this region are extracted. The format is [[top, bottom], [left, right]], using normalized values. For example, [[0.5, 1], [0, 1]] specifies the bottom half of the video. If this parameter is not provided, the default region is the bottom 1/4 of the video.

  • lang: string. The recognition language. Valid values: ch (Chinese), en (English), and ch_ml (Chinese-English mixed). Default: ch.

  • track: string. If set to main, only the main caption track is extracted. If this parameter is not set, the system extracts all captions that appear in the specified region by default.
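The roi convention here is row range first, then column range. A sketch that builds the region for the bottom fraction of the frame:

```python
# Sketch: build the CaptionExtraction roi, [[top, bottom], [left, right]],
# for the bottom fraction of the frame.
def bottom_region(fraction: float):
    return [[round(1 - fraction, 3), 1], [0, 1]]

print(bottom_region(0.5))   # [[0.5, 1], [0, 1]]  (bottom half, as in the example)
print(bottom_region(0.25))  # [[0.75, 1], [0, 1]] (the default region)
```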

HybridCaptionExtraction

  • fps: integer (Optional). The sampling frame rate. Range: [2, 10]. Default: 5.

  • roi: list. The bounding box of the target caption, in the format [bx, by, bw, bh]. If this parameter is not provided, the default region is the bottom 1/4 of the video.
    • bx: The normalized x-coordinate of the top-left corner of the bounding box, relative to the video width. Example: 0.1.

    • by: The normalized y-coordinate of the top-left corner of the bounding box, relative to the video height. Example: 0.0.

    • bw: The normalized width of the bounding box, relative to the video width. Example: 0.3.

    • bh: The normalized height of the bounding box, relative to the video height. Example: 0.2.

  • lang: string. The recognition language. Valid values: zh (Chinese) and en (English). Default: zh.

  • track: string. If set to main, only the main caption track is extracted. If this parameter is not set, the system extracts all captions that appear in the specified region by default.
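Note that HybridCaptionExtraction uses a different ROI convention from CaptionExtraction. A sketch converting a [[top, bottom], [left, right]] region into the [bx, by, bw, bh] bounding-box form:

```python
# Sketch: convert a CaptionExtraction-style [[top, bottom], [left, right]]
# region into the HybridCaptionExtraction [bx, by, bw, bh] bounding box.
def region_to_bbox(region):
    (top, bottom), (left, right) = region
    return [left, top, round(right - left, 3), round(bottom - top, 3)]

print(region_to_bbox([[0.5, 1], [0, 1]]))  # [0, 0.5, 1, 0.5] (bottom half)
```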

VideoGreenScreenMatting

  • bgimage: string. The background image to replace the green screen. Example: http://example-image-****.example-location.aliyuncs.com/example/example.jpg. If you do not set this parameter, the service outputs a WebM video with an alpha channel.

FaceBeauty

  • beauty_params: string. The beautification parameters. Example: "whiten=20,smooth=50,face_thin=50". For more information, see Parameter field descriptions.

VideoH2V

None

MusicSegmentDetect

None

AudioBeatDetection

None

AudioQualityAssessment

None

SpeechDenoise

Input audio requirements: The audio file must be in WAV format with a sample rate of 16 kHz or 48 kHz.

AudioMixing

  • inputs: list. The audio tracks to mix; each element is an object that specifies a file URL. Currently, only one audio track is supported. Example element: {"file":"http://example-bucket-****.oss-cn-shanghai.aliyuncs.com/2.mp4"}

MusicDemix

None

Response elements

The response is an object with the following fields:

RequestId (string)

The ID of the request. Example: C1849434-FC47-5DC1-92B6-F7EAAFE3851E

JobId (string)

The job ID. Example: ****20b48fb04483915d4f2cd8ac****

Examples

Success response

JSON format

{
  "RequestId": "C1849434-FC47-5DC1-92B6-F7EAAFE3851E",
  "JobId": "****20b48fb04483915d4f2cd8ac****"
}
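A minimal sketch of extracting JobId from the response body, for example to poll QueryIProductionJob later; the body string is copied from the sample above:

```python
import json

# Extract JobId from the response body, e.g. to poll QueryIProductionJob later.
body = '{"RequestId": "C1849434-FC47-5DC1-92B6-F7EAAFE3851E", "JobId": "****20b48fb04483915d4f2cd8ac****"}'
job_id = json.loads(body)["JobId"]
print(job_id)  # ****20b48fb04483915d4f2cd8ac****
```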

Error codes

See Error Codes for a complete list.

Release notes

See Release Notes for a complete list.