avatar

OmniHuman v1.5

Create hyper-realistic talking avatars from a single portrait photo and an audio file. Features perfect lip sync, natural facial expressions, and gestures synchronized to the rhythm and emotion of the speech.

960 credits per generation · ~60-90s

Features

  • Single Photo Input
  • Perfect Lip Sync
  • Natural Expressions
  • Gesture Generation
  • Turbo Mode
  • 720p/1080p Output

Specifications

Resolution: 720p or 1080p
Input: Portrait photo + Audio file
Audio Limit: 30s at 1080p, 60s at 720p
Output: MP4 Video

Input Requirements

Portrait Photo* (image upload): Clear, front-facing portrait photo
Audio File* (audio upload): Speech audio to sync (max 30s at 1080p, 60s at 720p)
Scene Description (optional, textarea)
Turbo Mode (optional, checkbox)
Resolution (optional, select)
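
The input rules above can be checked before submitting a job. The following is a minimal sketch, not a documented API: the `AvatarRequest` class, its field names, and the default resolution are all hypothetical, chosen only to mirror the form fields. The audio limits come from the Specifications section.

```python
from dataclasses import dataclass
from typing import Optional

# Maximum audio length in seconds per output resolution (from Specifications).
MAX_AUDIO_SECONDS = {"720p": 60, "1080p": 30}


@dataclass
class AvatarRequest:
    # Hypothetical field names that mirror the form; not an official schema.
    portrait_path: str
    audio_path: str
    audio_seconds: float
    resolution: str = "720p"  # assumed default; the page does not state one
    turbo: bool = False
    scene_description: Optional[str] = None


def validate(req: AvatarRequest) -> list[str]:
    """Return a list of problems; an empty list means the request fits the stated limits."""
    problems = []
    if not req.portrait_path:
        problems.append("portrait photo is required")
    if not req.audio_path:
        problems.append("audio file is required")
    limit = MAX_AUDIO_SECONDS.get(req.resolution)
    if limit is None:
        problems.append(f"unsupported resolution: {req.resolution}")
    elif req.audio_seconds > limit:
        problems.append(
            f"audio is {req.audio_seconds:g}s; limit at {req.resolution} is {limit}s"
        )
    return problems
```

For example, a 45-second clip passes at 720p but is rejected at 1080p, so a caller could either trim the audio or fall back to the lower resolution.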

Pricing

960 credits
~$9.60 per generation
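
The listed price implies roughly $0.01 per credit (9.60 / 960). A small sketch for budgeting several generations follows, assuming that rate is flat and that every generation costs the same 960 credits regardless of resolution or Turbo Mode; the page does not confirm either assumption, and `estimate_cost` is a hypothetical helper.

```python
CREDITS_PER_GENERATION = 960
# Implied by "960 credits, ~$9.60 per generation"; an approximation, not a quoted rate.
USD_PER_CREDIT = 9.60 / 960


def estimate_cost(generations: int) -> tuple[int, float]:
    """Return (total credits, approximate USD) for a number of generations."""
    credits = generations * CREDITS_PER_GENERATION
    return credits, round(credits * USD_PER_CREDIT, 2)
```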