Alibaba Cloud Model Studio: Emoji image detection API reference

Last Updated: Mar 15, 2026

Detects whether an input image meets Emoji model requirements. If detection passes, the model returns face area coordinates (bbox_face) and extended dynamic area coordinates (ext_bbox_face) for video generation.

Important

This document applies only to the China (Beijing) region. To use the model, you must use an API key for the China (Beijing) region.

Model overview

Model: emoji-detect-v1

Description: Detects whether an input image meets the specifications required for Emoji video generation. The model returns the face area (bbox_face) and extended expression area (ext_bbox_face) coordinates for video generation.

Input image requirements

Example of a compliant image (detection passed)

Image requirements:

  • Single front-facing portrait

  • Face is not occluded (by objects such as hands, hair, or accessories)

  • Natural expression, no exaggerated expressions

  • Head is upright, without significant tilting

[Image: compliant example]

Examples of non-compliant images (detection failed)

  • Hand visible near the face

  • Face is occluded

  • Exaggerated expression

  • Excessive head tilt

[Images: one non-compliant example for each case listed above]

Prerequisites

Get an API key and export it as an environment variable. The request example in this document reads the key from DASHSCOPE_API_KEY.

HTTP

POST https://dashscope.aliyuncs.com/api/v1/services/aigc/image2video/face-detect

Request parameters

Portrait compliance detection

curl --location 'https://dashscope.aliyuncs.com/api/v1/services/aigc/image2video/face-detect' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "emoji-detect-v1",
    "input": {
        "image_url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250912/uopnly/emoji-image-detection.png"
    },
    "parameters": {
        "ratio": "1:1"
    }
}'
Request headers

Content-Type string (Required)

The content type of the request. Must be application/json.

Authorization string (Required)

Authentication credentials: the word Bearer followed by a Model Studio API key.

Example: Bearer sk-xxxx

Request body

model string (Required)

Set to emoji-detect-v1.

input object (Required)

The input image to detect.

Properties

image_url string (Required)

Public URL of the image (HTTP/HTTPS supported).

Limits:

  • Image format: JPEG, JPG, PNG, BMP, or WEBP.

  • Image resolution: The width and the height of the image must each be between 400 and 7,000 pixels.

  • File size: No larger than 10 MB.

Example: https://help-static-aliyun-doc.aliyuncs.com/xxx.png.
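The limits above can be checked on the client before calling the API, which avoids sending requests that are bound to be rejected. The following is a minimal sketch under stated assumptions: check_image_limits is a hypothetical helper, the format is guessed from the URL path extension, the 400 to 7,000 pixel range is treated as inclusive, and 10 MB is taken as 10 × 1,048,576 bytes. None of these details are confirmed by this document.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = {".jpeg", ".jpg", ".png", ".bmp", ".webp"}
MIN_SIDE, MAX_SIDE = 400, 7000          # pixels; assumed inclusive
MAX_BYTES = 10 * 1024 * 1024            # assumed meaning of "10 MB"


def check_image_limits(image_url, width, height, size_bytes):
    """Return a list of problems; an empty list means every documented limit is met.

    Hypothetical client-side pre-check, not part of the API.
    """
    problems = []
    parsed = urlparse(image_url)
    if parsed.scheme not in ("http", "https"):
        problems.append("URL must use HTTP or HTTPS")
    # Assumption: the image format is inferred from the URL path extension.
    ext = PurePosixPath(parsed.path).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'none'}")
    for name, value in (("width", width), ("height", height)):
        if not MIN_SIDE <= value <= MAX_SIDE:
            problems.append(f"{name} {value}px is outside {MIN_SIDE}-{MAX_SIDE}px")
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 10 MB")
    return problems
```

The width, height, and file size are passed in by the caller, so the sketch stays independent of any particular image library.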

parameters object (Required)

Detection parameters.

Properties

ratio string (Required)

Aspect ratio of the detection area. For Emoji video generation, set this to 1:1.

Example: 1:1.
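For readers who call the API from code rather than curl, the request described above could be assembled as follows. This is a sketch using only the Python standard library. The endpoint, headers, and body fields come from this document; build_detect_request and send are hypothetical helper names, not part of any SDK.

```python
import json
import urllib.request

# Endpoint taken from the HTTP section of this document.
DETECT_URL = "https://dashscope.aliyuncs.com/api/v1/services/aigc/image2video/face-detect"


def build_detect_request(image_url, api_key, ratio="1:1"):
    """Build (but do not send) the POST request for emoji-detect-v1."""
    body = {
        "model": "emoji-detect-v1",
        "input": {"image_url": image_url},
        "parameters": {"ratio": ratio},
    }
    return urllib.request.Request(
        DETECT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req, timeout=30.0):
    """Send the request and return the parsed JSON body.

    Needs network access and a valid China (Beijing) API key.
    Note that urlopen raises urllib.error.HTTPError on non-2xx status codes.
    """
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

A typical call would read the key from the environment, for example send(build_detect_request(url, os.environ["DASHSCOPE_API_KEY"])) after importing os.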

Response parameters

Detection passed

When detection passes, save bbox_face and ext_bbox_face and pass them as the input.face_bbox and input.ext_bbox parameters of the subsequent Emoji video generation request. Charges apply (see usage.image_count).

{
    "output": {
        "bbox_face": [212,194,460,441],
        "ext_bbox_face": [63,30,609,575]
    },
    "usage": {
        "image_count": 1
    },
    "request_id": "78becbc4-f7f7-41ea-9e38-xxxxxx"
}

Detection failed

When detection fails, charges still apply (see usage.image_count). For troubleshooting, see Error messages.

{
    "output": {
        "code": "InvalidFile.FacePose",
        "message": "The pose of the detected face is invalid, please upload other image with the expected oriention."
    },
    "usage": {
        "image_count": 1
    },
    "request_id": "ed0d0d8f-e55a-4144-b855-xxxxxx"
}

Request failed

Request failures do not incur charges, and no usage.image_count is returned. For troubleshooting, see Error messages.

{
    "request_id": "5e1fefbd-fa7a-4e59-82a0-xxxxxx",
    "code": "InvalidParameter",
    "message": "Required body invalid, please check the request body format."
}
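The three response shapes above differ in where the fields appear, so a client can tell them apart from the parsed JSON alone. The sketch below is one way to do that. classify_response and its three return labels are hypothetical names chosen for illustration, and the logic relies only on the field layout shown in the examples above.

```python
def classify_response(resp):
    """Return 'passed', 'detection_failed', or 'request_failed' for a parsed response body."""
    output = resp.get("output")
    if output is None:
        # Request failed: code and message sit at the top level and no usage is returned.
        return "request_failed"
    if "bbox_face" in output and "ext_bbox_face" in output:
        # Detection passed: both coordinate arrays are present.
        return "passed"
    # Detection ran but the image was rejected: output carries code and message.
    return "detection_failed"


# The three example responses from this document, abbreviated.
passed = {"output": {"bbox_face": [212, 194, 460, 441], "ext_bbox_face": [63, 30, 609, 575]},
          "usage": {"image_count": 1}}
rejected = {"output": {"code": "InvalidFile.FacePose", "message": "..."},
            "usage": {"image_count": 1}}
failed = {"code": "InvalidParameter", "message": "..."}

for name, resp in (("passed", passed), ("rejected", rejected), ("failed", failed)):
    print(name, "->", classify_response(resp))
```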

output object

Task output information.

Properties

bbox_face array of integer

Face area coordinates in pixels: [x1, y1, x2, y2] (upper-left and lower-right points). Returned only when detection passes.

Pass this value as input.face_bbox in the Emoji video generation API.

Example: [212,194,460,441].

ext_bbox_face array of integer

Extended expression area coordinates in pixels: [x1, y1, x2, y2] (upper-left and lower-right points). Returned only when detection passes.

Pass this value as input.ext_bbox in the Emoji video generation API.

Example: [63,30,609,575].
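Because both arrays use the [x1, y1, x2, y2] layout, the size of each area follows from simple subtraction, and the field renaming for the video generation request is a direct mapping. The sketch below illustrates both steps. box_size and to_video_input are hypothetical helper names, and the returned dictionary covers only the two fields named in this document, not the full video generation input.

```python
def box_size(bbox):
    """Return (width, height) in pixels for an [x1, y1, x2, y2] box."""
    x1, y1, x2, y2 = bbox
    return x2 - x1, y2 - y1


def to_video_input(output):
    """Map detection output fields to the input field names used for video generation.

    Only face_bbox and ext_bbox are covered; other input fields are out of scope here.
    """
    return {
        "face_bbox": output["bbox_face"],
        "ext_bbox": output["ext_bbox_face"],
    }


output = {"bbox_face": [212, 194, 460, 441], "ext_bbox_face": [63, 30, 609, 575]}
print(box_size(output["bbox_face"]))      # (248, 247)
print(box_size(output["ext_bbox_face"]))  # (546, 545)
print(to_video_input(output))
```

With the example values, the extended area is close to square, which is consistent with the 1:1 ratio requested in parameters.ratio.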

code string

Error code (returned when detection fails). See Error messages for details.

message string

Error message (returned when detection fails). See Error messages for details.

usage object

Usage statistics for the request, used for billing.

Properties

image_count integer

Number of images detected (always 1, used for billing). Successful requests incur charges regardless of detection result; failed requests do not. See Model pricing for billing details.

Note

Non-compliant images still incur charges because detection was completed.
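The billing rule above can be mirrored on the client to estimate how many images a batch of calls was charged for. This is a minimal sketch: billed_images is a hypothetical helper, and it assumes that a missing usage object means the request failed and was not charged, as described in this document. Actual charges are determined by the service, not by this count.

```python
def billed_images(responses):
    """Sum usage.image_count over parsed responses; failed requests contribute 0."""
    total = 0
    for resp in responses:
        # Request failures return no usage object, so default to 0 images.
        total += resp.get("usage", {}).get("image_count", 0)
    return total


batch = [
    {"output": {"bbox_face": [212, 194, 460, 441], "ext_bbox_face": [63, 30, 609, 575]},
     "usage": {"image_count": 1}},                       # detection passed: charged
    {"output": {"code": "InvalidFile.FacePose", "message": "..."},
     "usage": {"image_count": 1}},                       # detection failed: still charged
    {"code": "InvalidParameter", "message": "..."},      # request failed: not charged
]
print(billed_images(batch))  # 2
```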

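code string

Error code. Returned at the top level only when the request fails. See Error messages for details.

message string

Detailed error message. Returned at the top level only when the request fails. See Error messages for details.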

request_id string

Unique identifier for the request. Use for tracing and troubleshooting issues.

Billing and rate limiting

  • For the free quota and unit price of the model, see Emoji.

  • For rate limits, see Rate limits.

Error codes

For error troubleshooting, see Error messages.