The Emoji video generation model creates expressive videos from a portrait or half-body profile picture and a preset dynamic template.
This document applies only to the China (Beijing) region. An API key for the China (Beijing) region is required to use the model.
Performance showcase
| Input: Portrait image | Output: Dynamic portrait video |
| Portrait image | driven_id: jingdian_xianqi |
| Portrait image | driven_id: mengwa_kaixin |
Video generation flow
Generating an emoji video requires two API calls. The first call checks the portrait image for compliance. The second call generates the dynamic emoji video.
Step 1: Image detection
Call the Emoji image detection API to ensure that the portrait image is compliant and to get the coordinates of the face area and the dynamic area.
Step 2: Video generation
Call the Emoji video generation API. Pass the same portrait image, the face area and dynamic area coordinates returned in Step 1, and a template ID (driven_id) from the list of template IDs.
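The two-step flow above can be sketched as follows. This is a minimal illustration, not the official SDK: the `Detection` class, the `detect` and `generate` callables, and the request field names (`image_url`, `face_bbox`, `dynamic_bbox`) are all hypothetical placeholders. Only `driven_id` and the example template ID come from this document. In practice, `detect` and `generate` would wrap the actual HTTP calls to the two APIs, using the request and response formats given in their API references.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Detection:
    """Hypothetical shape of the Step 1 result; real field names may differ."""
    passed: bool                   # whether the portrait image is compliant
    face_bbox: Sequence[int]       # coordinates of the face area
    dynamic_bbox: Sequence[int]    # coordinates of the dynamic area


def generate_emoji_video(
    image_url: str,
    driven_id: str,
    detect: Callable[[str], Detection],
    generate: Callable[[dict], str],
) -> str:
    """Run image detection, then video generation, and return generate()'s result."""
    # Step 1: check the portrait image and obtain the two coordinate sets.
    detection = detect(image_url)
    if not detection.passed:
        raise ValueError("Portrait image did not pass image detection")

    # Step 2: pass the image, the Step 1 coordinates, and the template ID.
    # The field names below are assumptions made for illustration.
    request = {
        "image_url": image_url,
        "driven_id": driven_id,
        "face_bbox": list(detection.face_bbox),
        "dynamic_bbox": list(detection.dynamic_bbox),
    }
    return generate(request)
```

Passing the two API calls in as functions keeps the sketch independent of any particular HTTP client, and it makes the ordering explicit: video generation is never requested for an image that failed detection.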
Billing and rate limits
For the model's free quota and unit price, see Models and pricing.
For rate limits, see Rate limits.

