Hangzhou, China, December 16, 2025 - Alibaba today unveiled the latest evolution of its visual generation models, the Wan2.6 series. It enables creators to appear in AI-generated videos as themselves and in their own voices with flexible multi-shot storytelling – new features designed to unlock creative possibilities for professional-grade content production with enhanced multi-person dialogue and extended duration for richer narratives.
The Wan2.6 series features a new reference-to-video generation model as well as comprehensive upgrades to its four existing models. Wan2.6-R2V enables users to upload a character reference video with both appearance and voice, utilizing text prompts to generate vivid new scenes starring that same character. Users can create videos featuring a person, animal or object, or even multiple subjects together, while preserving the distinctive look and sound of the original reference.
Powered by multimodal reference generation capabilities, Wan2.6-R2V is China’s first reference-to-video generation model that makes it possible for users to insert themselves or other subjects into AI-generated scenes with consistent visuals and audio. This changes the way short-form drama creators tell stories and streamlines their production process.
The Wan2.6 series also includes enhancements to its text-to-video model (Wan2.6-T2V), its image-to-video model (Wan2.6-I2V), and its two image generation models (Wan2.6-image and Wan2.6-T2I).
The new models introduce intelligent multi-shot storytelling capabilities that allow for richer, more expressive narratives with visual consistency throughout. Their improved capabilities in audio-visual synchronization and audio-to-video generation deliver more realistic scenes with richer sound effects.
Supporting video outputs of up to 15 seconds, the models give creators more room to develop their stories. Combined with enhanced instruction-following precision and improved visual quality, they enable creators to produce cinematic-style content with professional-grade results.
For image generation, the Wan2.6 series enables users to create interleaved text-image output with advanced logical reasoning capabilities, to further support coherent visual storytelling. It also demonstrates outstanding capabilities in precise artistic style control, high-fidelity realistic portrait generation, and image editing. Advanced understanding of lengthy Chinese and English text prompts enables creators to produce high-quality, expressive visual content that captures nuance and artistic intent.
Users can access and deploy the models through Model Studio—Alibaba Cloud's AI development platform—and Wan’s official website. The models will also be integrated into Qwen App, Alibaba's flagship AI application.
First unveiled earlier this year, the Wan series has undergone continuous upgrades, reflecting Alibaba’s leadership and innovation in AI-driven multimedia technologies.