Seedance 2.0

image to video

A high-quality audio and video generation model with stable visuals and synchronized audio and video.

Text Friendly
Audio Support
Commercial Use

Input

First frame (required)

Provide an image file or a URL. Accepted formats: .jpg, .jpeg, .png, .webp

Tail frame image (optional)

Provide an image file or a URL. Accepted formats: .jpg, .jpeg, .png, .webp


Seedance 2.0 Model Introduction

Model Overview

Seedance 2.0 is a cinematic multimodal model for joint audio and video generation, developed by ByteDance's Seed team. It uses a dual-branch diffusion transformer (DB-DiT) architecture and accepts mixed input across four modalities: text, images, audio, and video. A request can load up to 12 reference files in total, with at most 9 images, 3 video clips, and 3 audio clips. The model outputs 2K-resolution video and native stereo sound in a single forward pass, which addresses common industry pain points such as audio-visual timing misalignment and out-of-sync lip movement.

The model has strong 3D spatial awareness and dynamic memory, producing stable motion, physically realistic behavior, and strong subject consistency. It can automatically complete multi-shot narratives, storyboard design, and smooth camera movements, and it accurately reproduces complex scripts and director-level creative intent.

Seedance 2.0 leads the industry in instruction following, visual aesthetics, and audio reproduction, and it is well suited to professional scenarios such as film, advertising, and social media marketing. It can efficiently produce high-quality audiovisual content that meets industrial delivery standards, significantly reducing content creation cost and turnaround time.

Pricing

Resolution    Credits Consumed (credits/s)
480p          6
720p          12
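
Because pricing is quoted per second of output, the cost of a clip can be estimated by multiplying the rate by the duration. The sketch below assumes billing is strictly linear, with no rounding, minimum charge, or extra fee for audio; the page does not confirm this. The function name `estimate_credits` is hypothetical and not part of any official SDK.

```python
# Credits charged per second of generated video, taken from the pricing table.
CREDITS_PER_SECOND = {"480p": 6, "720p": 12}


def estimate_credits(resolution: str, duration_s: int) -> int:
    """Estimate the credit cost of one clip.

    Assumption: cost = per-second rate x duration in seconds, with no
    rounding or minimum charge. Actual billing may differ.
    """
    if resolution not in CREDITS_PER_SECOND:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if duration_s <= 0:
        raise ValueError("duration must be a positive number of seconds")
    return CREDITS_PER_SECOND[resolution] * duration_s


if __name__ == "__main__":
    print(estimate_credits("480p", 5))   # 6 credits/s * 5 s
    print(estimate_credits("720p", 15))  # 12 credits/s * 15 s
```

Under this assumption, a 5-second clip at 480p costs 30 credits and a 15-second clip at 720p costs 180 credits.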

Technical Specifications

Parameter             Specification
Core Capability       image_to_video
Resolution            480p, 720p
Aspect Ratio          16:9, 4:3, 1:1, 3:4, 9:16, 21:9
Duration (seconds)    4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
License
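
The allowed values in the technical specifications can be checked on the client side before a job is submitted. The following is a minimal sketch only: the `GenerationSettings` class, its field names, and the `validate` function are hypothetical illustrations and do not reflect the service's actual request schema. It also assumes that durations are whole seconds.

```python
from dataclasses import dataclass

# Allowed values, copied from the technical specifications.
RESOLUTIONS = {"480p", "720p"}
ASPECT_RATIOS = {"16:9", "4:3", "1:1", "3:4", "9:16", "21:9"}
DURATIONS = set(range(4, 16))  # 4 through 15 inclusive, assumed to be seconds


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical container for settings; field names are assumptions."""
    resolution: str
    aspect_ratio: str
    duration: int


def validate(settings: GenerationSettings) -> list[str]:
    """Return human-readable problems; an empty list means valid."""
    errors = []
    if settings.resolution not in RESOLUTIONS:
        errors.append(f"resolution must be one of {sorted(RESOLUTIONS)}")
    if settings.aspect_ratio not in ASPECT_RATIOS:
        errors.append(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    if settings.duration not in DURATIONS:
        errors.append("duration must be an integer from 4 to 15")
    return errors


if __name__ == "__main__":
    print(validate(GenerationSettings("720p", "16:9", 10)))
    print(validate(GenerationSettings("1080p", "2:1", 3)))
```

Returning a list of problems instead of raising on the first one lets a caller report every invalid field at once.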