Wan 2.2 S2V

Animate a still photo with a speech track using Wan 2.2 Speech-to-Video. The output follows your audio with natural, speech-driven motion at 480p, 580p, or 720p.

3 credits per generation

Features

  • Photo + Audio Input
  • Speech-Driven Motion
  • 480p / 580p / 720p
  • Audio-Length Output

Specifications

Resolution: 480p / 580p / 720p
Input: Photo + Audio + Prompt
Audio Limit: 7.5s
Output: MP4 Video

Input Requirements

  • Source Photo* (image upload): Front-facing photo to animate
  • Audio File* (audio upload): Speech audio to drive the motion (max 7.5s)
  • Scene Description* (textarea)
  • Resolution (optional, select)
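
The requirements above can be checked locally before uploading anything. The following is a minimal sketch, not an official client: this page does not document an API, so the `S2VRequest` class, its field names, and `validate_request` are all hypothetical. Only the 7.5s audio limit, the three resolutions, and the required/optional status of each field come from the page.

```python
from dataclasses import dataclass
from typing import List, Optional

# Values taken from the Specifications section of this page.
ALLOWED_RESOLUTIONS = ("480p", "580p", "720p")
MAX_AUDIO_SECONDS = 7.5


@dataclass
class S2VRequest:
    """Hypothetical container for the four form fields (not an official schema)."""
    photo_path: str
    audio_path: str
    audio_seconds: float              # duration measured by the caller
    scene_description: str
    resolution: Optional[str] = None  # the form marks this field as optional


def validate_request(req: S2VRequest) -> List[str]:
    """Return a list of problems; an empty list means the inputs look valid."""
    problems: List[str] = []
    if not req.photo_path:
        problems.append("source photo is required")
    if not req.audio_path:
        problems.append("audio file is required")
    if req.audio_seconds <= 0:
        problems.append("audio duration must be positive")
    elif req.audio_seconds > MAX_AUDIO_SECONDS:
        problems.append(
            f"audio is {req.audio_seconds:.1f}s; the limit is {MAX_AUDIO_SECONDS}s"
        )
    if not req.scene_description.strip():
        problems.append("scene description is required")
    if req.resolution is not None and req.resolution not in ALLOWED_RESOLUTIONS:
        problems.append(
            f"resolution must be one of {', '.join(ALLOWED_RESOLUTIONS)}"
        )
    return problems


if __name__ == "__main__":
    request = S2VRequest(
        photo_path="face.jpg",
        audio_path="speech.wav",
        audio_seconds=9.0,  # deliberately over the limit
        scene_description="A woman speaking in a sunlit office",
        resolution="720p",
    )
    for problem in validate_request(request):
        print(problem)
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once, which suits a form with several required fields.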

Pricing

from 3 credits
~$0.50-$3.00 per generation

Frequently Asked Questions

How much does Wan 2.2 S2V cost?

Wan 2.2 S2V costs 3 credits per generation (~$0.50-$3.00). You get 10 free credits every day to try it.

Can I use Wan 2.2 S2V outputs commercially?

Yes, all content generated with Wan 2.2 S2V on Arteza comes with a commercial license.

What file format does Wan 2.2 S2V output?

MP4 video files with lip-synced audio.