Alibaba Rolls Out HappyHorse 1.0 in Limited Beta

Alibaba launched limited beta access to its HappyHorse 1.0, a video generation model designed to help creators produce high-quality, cinematic-style video content.

Alibaba today launched limited beta access to its HappyHorse 1.0, a video generation model designed to help creators produce high-quality, cinematic-style video content. HappyHorse 1.0 is now accessible to creators and enterprise customers globally via the official HappyHorse website and through an API service on Alibaba Cloud Model Studio, while individual users can try the model through Alibaba’s consumer-facing AI application, the Qwen App.

Developed by the Alibaba Token Hub (ATH) Business Unit, HappyHorse 1.0 features strong video generation and editing capabilities. It supports multimodal input and flexible creative workflows, delivering physically convincing simulations with strong semantic understanding and instruction-following, audio-visual synchronization and multi-shot sequencing, and exceptional aesthetic expression. The model produces cinematic video output, making it well suited to professional uses such as advertising, e-commerce, short-form video, and social media marketing.

Advanced Video Generation and Editing with Exceptional Aesthetic Expression

HappyHorse 1.0 supports Text-to-Video (T2V), Image-to-Video (I2V), and Subject-to-Video (S2V) generation, enabling users to create video from a text prompt, animate a still image into a video clip, or insert a specific subject from a reference image into a generated video while preserving its appearance and identity. The model supports generation of up to 15 seconds of 1080p video with multiple shots, and delivers synchronized audio-visual output, including lip-synced dialogue, ambient soundscapes, and emotionally expressive vocal performances, for an immersive viewing experience. Excelling in cinematic framing with wide apertures, the model can readily convey a strong atmospheric mood, delivering refined texture and detail as well as rich spatial depth and visual layering.


T2V Prompt: A cinematic script scene set in a sun-drenched Parisian café, golden afternoon light spilling through arched windows. A sharp-dressed man in a tailored navy suit sits across from an elegant woman in a flowing crimson dress, half-empty coffee cups between them. The air is thick with unspoken tension. He leans forward, voice low and steady: “You knew from the beginning, didn’t you? That none of this was real.” She holds his gaze without flinching, a ghost of a smile on her lips, slowly stirring her coffee: “Everything was real. That’s exactly what makes it so dangerous.” Cinematic wide-angle composition, warm golden hour lighting, shallow depth of field, film grain texture, muted vintage color palette with deep crimson accents, highly detailed wardrobe and facial expressions, noir romantic aesthetic, emotionally charged atmosphere, European street photography style, dramatic storytelling, 35mm film look.


Beyond generation, HappyHorse 1.0 offers powerful video editing functions. The Video-to-Video (V2V) function allows users to modify an existing video while preserving its original structure and motion. The Subject-and-Video-to-Video (SV2V) function enables users to seamlessly replace or insert a specific subject from a reference image, while preserving the original video’s motion, composition, and unaffected regions.

Top to bottom: original video, reference image, generated video. Text Prompt: Transform the entire video into the Minecraft voxel style based on the visual aesthetic of Image 1. Convert all subjects, characters, and the environment into 3D blocks with low-resolution pixelated textures. Ensure the lighting and colors match the blocky world shown in Image 1. Throughout this transformation, the original movements, character actions, and camera tracking path must remain 100% unchanged. The final result should look like the original scene has been completely rebuilt inside the Minecraft game world.
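
The five modes described in this article (T2V, I2V, S2V, V2V, and SV2V) differ mainly in which inputs they take. The short Python sketch below summarizes that as a lookup table with a small validity check. It is an illustration only and not the Alibaba Cloud Model Studio API: the field names, the `VideoJob` class, the `validate_job` helper, and the assumption that every mode takes a text prompt are all hypothetical. Only the mode names and the 15-second limit for generated video come from the article.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical input requirements per mode, inferred from the article's
# descriptions. These names are illustrative, not real API parameters.
REQUIRED_INPUTS = {
    "T2V": {"prompt"},
    "I2V": {"prompt", "image"},
    "S2V": {"prompt", "reference_image"},
    "V2V": {"prompt", "video"},
    "SV2V": {"prompt", "video", "reference_image"},
}

# The article cites up to 15 seconds of 1080p video for generation;
# applying the limit only to these three modes is an assumption.
GENERATION_MODES = {"T2V", "I2V", "S2V"}
MAX_DURATION_SECONDS = 15


@dataclass
class VideoJob:
    mode: str
    prompt: Optional[str] = None
    image: Optional[str] = None
    reference_image: Optional[str] = None
    video: Optional[str] = None
    duration_seconds: int = 5


def validate_job(job: VideoJob) -> list[str]:
    """Return a list of problems; an empty list means the job looks well-formed."""
    required = REQUIRED_INPUTS.get(job.mode)
    if required is None:
        return [f"unknown mode: {job.mode}"]

    # Report each missing input in a stable (alphabetical) order.
    problems = [
        f"{job.mode} needs '{name}'"
        for name in sorted(required)
        if not getattr(job, name)
    ]

    if job.mode in GENERATION_MODES:
        if not 1 <= job.duration_seconds <= MAX_DURATION_SECONDS:
            problems.append(
                f"duration must be between 1 and {MAX_DURATION_SECONDS} seconds"
            )
    return problems
```

For example, `validate_job(VideoJob(mode="SV2V", prompt="swap the subject", video="clip.mp4"))` reports that a reference image is missing, which mirrors the article's description of SV2V as combining an existing video with a subject from a reference image.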


HappyHorse 1.0 is engineered for strong cinematic output, excelling in wide-aperture, shallow depth-of-field cinematography with atmospheric visual language and fine-grained image texture. The model excels in multi-shot consistency, maintaining stable character positioning across frequent cut transitions — making it well-suited for short dramas featuring camera movement and emotional atmosphere, such as suspenseful confrontation scenes and romance narratives.

Top to bottom: reference image, generated video. I2V Prompt: A boy and the rusty robot stand under the cool glow of the full moon, gently holding hands with a deep bond; a tight close-up captures the boy looking sincere and kind, his lips moving softly to whisper, “we are friends”; the robot’s luminous eyes flicker and pulse as it processes the message, responding in a stuttering, mechanical electronic voice, “we… are, we… are friends”; hearing this, the boy’s expression lights up with pure joy, and he reaches out his hand to kindly stroke and pat the robot’s weathered metal head; the camera pulls back to a wide shot.


This article was written by Claire Mo and originally published on Alizila.
