LivePortrait quickly generates lightweight, dynamic portrait videos from portrait images and human voice audio files. It includes two independent models, "LivePortrait-detect" and "LivePortrait", which provide portrait image compliance detection and portrait video generation, respectively.
This document applies only to the China (Beijing) region. To use the models, you must use an API key from the China (Beijing) region.
Model overview
Model introduction
"LivePortrait-detect" is an image detection model that verifies whether an input image meets the specifications required by the "LivePortrait" model.
"LivePortrait" is a video generation model that quickly generates lightweight, dynamic portrait videos from portrait images and human voice audio files.
Performance showcase
Inputs: Human portrait image + Human voice audio file | Outputs: Dynamic human portrait video |
Example 1: Human portrait: [image] Human voice audio: See the video on the right | Human video: [video] |
Example 2: Human portrait: [image] Human voice audio: See the video on the right | Human video: [video] |
Example 3: Human portrait: [image] Human voice audio: See the video on the right | Human video: [video] |
The materials in the preceding examples are AI-generated.
Billing and rate limiting
Model | Unit price (model call, pay-as-you-go) | QPS limit for task submission API | Number of concurrent tasks |
liveportrait-detect | $0.000574/image | 5 | No limit for sync APIs |
liveportrait | $0.002868/second | 5 | 1 (At any given time, only one job is running. Other jobs wait in the queue.) |
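As a rough illustration of how the two unit prices combine, the following sketch estimates the charge for a batch of image checks plus a generated video. It assumes that the per-second price applies to the duration of the output video and that durations are whole seconds; the function name `estimate_cost` and its parameters are hypothetical, and the actual bill may round differently.

```python
from decimal import Decimal

# Unit prices copied from the billing table above (USD).
DETECT_PRICE_PER_IMAGE = Decimal("0.000574")
VIDEO_PRICE_PER_SECOND = Decimal("0.002868")


def estimate_cost(images_checked: int, video_seconds: int) -> Decimal:
    """Estimate the charge in USD for detection plus video generation.

    Assumption: video_seconds is the billed duration of the output video.
    """
    if images_checked < 0 or video_seconds < 0:
        raise ValueError("counts must be non-negative")
    detect_cost = DETECT_PRICE_PER_IMAGE * images_checked
    video_cost = VIDEO_PRICE_PER_SECOND * video_seconds
    return detect_cost + video_cost
```

Under these assumptions, one image check followed by a 10-second video comes to 0.000574 + 0.02868 = 0.029254 USD. `Decimal` is used instead of `float` so that the small per-unit prices add up without binary rounding noise.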
Prerequisites
You have activated Alibaba Cloud Model Studio and obtained an API key. For more information, see Preparations: Get and configure an API key.
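One common way to make the API key available to client code is to read it from an environment variable rather than hard-coding it. The sketch below is an assumption-laden example: the variable name `DASHSCOPE_API_KEY`, the `Bearer` authorization scheme, and the helper `build_auth_headers` are not stated in this document and should be checked against the API reference.

```python
import os


def build_auth_headers(env_var: str = "DASHSCOPE_API_KEY") -> dict:
    """Build HTTP headers from an API key stored in an environment variable.

    Assumptions: the variable name and the Bearer scheme are illustrative.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"environment variable {env_var} is not set")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Remember that the key must come from the China (Beijing) region for these models.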
Model call
The LivePortrait models can be called on a pay-as-you-go basis.
To call the models, follow these steps:
1. Call the "LivePortrait-detect" model to confirm that the input image meets the specifications. For more information, see LivePortrait image detection.
2. Call the "LivePortrait" model with the image that passed detection and an audio file that contains a clear human voice to generate a dynamic portrait video. For more information, see LivePortrait video generation.
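The two steps above can be sketched as a small gate: run detection first and submit the video job only if the image passes. This is a minimal sketch, not the documented API. The model names come from this document, but the request field names (`image_url`, `audio_url`), the response field `pass`, and the injected `send` callable (which stands in for the real HTTP call) are all hypothetical.

```python
from typing import Callable

DETECT_MODEL = "liveportrait-detect"
VIDEO_MODEL = "liveportrait"


def build_detect_request(image_url: str) -> dict:
    # Field names are illustrative assumptions, not the documented schema.
    return {"model": DETECT_MODEL, "input": {"image_url": image_url}}


def build_video_request(image_url: str, audio_url: str) -> dict:
    # Field names are illustrative assumptions, not the documented schema.
    return {
        "model": VIDEO_MODEL,
        "input": {"image_url": image_url, "audio_url": audio_url},
    }


def generate_portrait_video(
    image_url: str,
    audio_url: str,
    send: Callable[[dict], dict],
) -> dict:
    """Run detection, then submit video generation only if the image passes.

    `send` is a caller-supplied function that delivers a request body to the
    service and returns the parsed response (hypothetical shape).
    """
    detect_result = send(build_detect_request(image_url))
    if not detect_result.get("pass", False):
        reason = detect_result.get("message", "no reason given")
        raise ValueError(f"image failed detection: {reason}")
    return send(build_video_request(image_url, audio_url))
```

Passing the transport in as `send` keeps the ordering logic separate from endpoint details, so the same function works with a real HTTP client or with a stub in tests.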


