avatar
OmniHuman v1.5
Create hyper-realistic talking avatars from a single portrait photo and an audio file. Features include perfect lip sync, natural facial expressions, and gestures synchronized to the rhythm and emotion of the speech.
Features
- Single Photo Input
- Perfect Lip Sync
- Natural Expressions
- Gesture Generation
- Turbo Mode
- 720p/1080p Output
Specifications
- Resolution: 720p or 1080p
- Input: Portrait photo + audio file
- Audio Limit: 30s at 1080p, 60s at 720p
- Output: MP4 video
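The audio limit depends on the chosen resolution, so it can help to check a clip's length before uploading. The following is a minimal sketch under stated assumptions: the function and constant names are hypothetical and not part of any OmniHuman API, and the limits are treated as inclusive (a clip of exactly 30s passes at 1080p), which the page does not confirm.

```python
from typing import Optional

# Limits copied from the Specifications list above.
# The dictionary and function names are illustrative, not an official API.
MAX_AUDIO_SECONDS = {"720p": 60.0, "1080p": 30.0}


def audio_fits(duration_s: float, resolution: str) -> bool:
    """Return True if a clip of duration_s seconds is within the limit.

    Assumes the limit is inclusive; the source does not say either way.
    """
    if resolution not in MAX_AUDIO_SECONDS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return duration_s <= MAX_AUDIO_SECONDS[resolution]


def best_resolution(duration_s: float) -> Optional[str]:
    """Pick the highest resolution whose limit the clip satisfies."""
    for resolution in ("1080p", "720p"):
        if audio_fits(duration_s, resolution):
            return resolution
    return None  # clip is longer than every listed limit
```

For example, a 45-second clip is too long for 1080p but fits at 720p, so `best_resolution(45)` falls back to the lower resolution; anything over 60 seconds would need trimming first.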
Input Requirements
- Portrait Photo (required, image upload): Clear, front-facing portrait photo
- Audio File (required, audio upload): Speech audio to sync (max 30s at 1080p, 60s at 720p)
- Scene Description (optional, textarea)
- Turbo Mode (optional, checkbox)
- Resolution (optional, select): 720p or 1080p