Emoji (Image to emoji video) - Alibaba Cloud Model Studio

Emoji creates expressive videos from a portrait or half-body image and a preset dynamic template.

Important

This document applies only to the China (Beijing) region. An API key for the China (Beijing) region is required to use the model.

Performance showcase

Input: portrait image. Output: dynamic portrait video.
Example outputs use the following templates:
driven_id: jingdian_xianqi
driven_id: mengwa_kaixin

Generating an Emoji video takes two API calls: the first checks that the portrait image is compliant, and the second generates the video.

Step 1: Image detection

Call the Emoji image detection API to verify that the portrait is compliant and to retrieve the coordinates of the face area and the dynamic area.

Step 2: Video generation

Call the Emoji video generation API with the portrait image, the coordinates returned in step 1, and a template ID (driven_id) chosen from the list of template IDs.
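
The two steps above can be sketched as a small Python helper. This is only an illustration of the call order, not the official SDK. The function name make_emoji_video, the detect and generate callables, and the field names passed, reason, face_bbox, and ext_bbox are all assumptions made for this sketch; take the real endpoints, request fields, and response fields from the image detection and video generation API references. The stand-in callables let the sketch run offline, and in real use they would wrap HTTP requests authenticated with a China (Beijing) API key.

```python
from typing import Any, Callable, Dict

# Hypothetical shape of a detection result; the real response fields may differ.
Detection = Dict[str, Any]


def make_emoji_video(
    image_url: str,
    driven_id: str,
    detect: Callable[[str], Detection],
    generate: Callable[[Dict[str, Any]], str],
) -> str:
    """Run the two-step workflow: detect the portrait, then generate the video."""
    # Step 1: check compliance and obtain the face and dynamic area coordinates.
    detection = detect(image_url)
    if not detection.get("passed", False):
        reason = detection.get("reason", "unknown")
        raise ValueError(f"image failed the compliance check: {reason}")

    # Step 2: pass the image, the step 1 coordinates, and the template ID.
    request = {
        "image_url": image_url,
        "driven_id": driven_id,
        "face_bbox": detection["face_bbox"],  # assumed name for the face area
        "ext_bbox": detection["ext_bbox"],    # assumed name for the dynamic area
    }
    return generate(request)


# Stand-ins for the two API calls so the sketch runs without network access.
def fake_detect(image_url: str) -> Detection:
    return {"passed": True, "face_bbox": [10, 20, 110, 120], "ext_bbox": [0, 0, 200, 200]}


def fake_generate(request: Dict[str, Any]) -> str:
    return f"video for {request['driven_id']}"


if __name__ == "__main__":
    print(make_emoji_video(
        "https://example.com/portrait.png",  # placeholder image URL
        "mengwa_kaixin",                     # template ID shown in the showcase
        fake_detect,
        fake_generate,
    ))
```

Passing the two API calls in as callables keeps the ordering logic separate from the transport details, so a failed compliance check stops the workflow before any generation request is sent.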