
Alibaba Cloud Model Studio: Recommended models

Last Updated: Jun 22, 2026

Alibaba Cloud Model Studio offers Qwen and third-party models for text, image, audio, and video.

Text generation

Qwen models

Ranging from most capable to most cost-effective; pick the one that fits your use case


Image & video

Understanding

Extract text descriptions or structured data from images and videos


Generation

Generate images and videos from text or images, with support for editing, reference, and high-resolution output


Audio & speech

Text-to-speech

For audiobook reading, voice broadcasting, virtual avatars, and more


Music generation

Generate music from prompts or lyrics


Speech recognition

Dedicated ASR models and LLM-based approaches; choose based on your accuracy and flexibility needs


Speech-to-speech

End-to-end voice conversation without separate ASR and TTS calls


Omni

Integrates understanding and generation capabilities across text, image, audio, and video modalities


Embeddings & reranking

Convert text or multimodal content into vectors, combined with reranking to improve retrieval accuracy
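
The embed-then-rerank pattern described above can be illustrated with a minimal, library-free sketch. Everything here is an assumption made for illustration: the vectors are hand-written stand-ins for embedding model output, `overlap_score` is a toy stand-in for a reranking model, and the function names `retrieve` and `rerank` are hypothetical. It does not call any Model Studio API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query_vec, doc_vecs, k):
    """Stage 1: return the indices of the k documents closest to the query vector."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]

def rerank(query, docs, candidates, score_fn):
    """Stage 2: reorder the candidate indices by a finer-grained relevance score."""
    return sorted(candidates, key=lambda i: score_fn(query, docs[i]), reverse=True)

def overlap_score(query, doc):
    """Toy stand-in for a reranking model: fraction of query words found in the document."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0

docs = [
    "reset your account password",
    "update billing address",
    "password policy for administrators",
]
# Hand-written toy vectors (assumption): in practice these come from an embedding model.
doc_vecs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.95, 0.0, 0.1]]

query = "reset password"
query_vec = [1.0, 0.0, 0.1]  # toy query embedding (assumption)

candidates = retrieve(query_vec, doc_vecs, k=2)          # vector search narrows the pool
final = rerank(query, docs, candidates, overlap_score)   # reranker reorders the shortlist
print(candidates, final)
```

In this toy data, vector search alone places the administrator policy document first, and the reranking step promotes the document that actually answers the query. A production pipeline would replace the toy vectors and scorer with calls to real embedding and reranking models.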


View all models

Go to Model Plaza to browse all Qwen, third-party, domain-specific, and legacy models.