🗣️ Zonos 2

Zyphra's ZONOS2 is an expressive multilingual text-to-speech model with high-fidelity voice cloning, trained on 6M+ hours of speech. Upload or record a few seconds of a voice and it will speak your text. Every generated clip has a small glitch at the end that you will need to trim manually afterwards (tested on Japanese).
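
Trimming the glitch at the end can be scripted rather than done by hand. Below is a minimal sketch, assuming the generated clip was saved as an uncompressed PCM WAV file. The helper name `trim_tail` and the 0.2-second default are hypothetical: the length of the glitch is not stated here, so adjust the value by ear.

```python
import wave


def trim_tail(src_path: str, dst_path: str, trim_seconds: float = 0.2) -> int:
    """Copy a PCM WAV file, dropping the last `trim_seconds` seconds.

    Returns the number of audio frames written to `dst_path`.
    The 0.2 s default is an assumption, not a measured glitch length.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        total_frames = src.getnframes()
        # Convert the trim duration to a frame count at this sample rate.
        drop = int(trim_seconds * src.getframerate())
        keep = max(total_frames - drop, 0)
        frames = src.readframes(keep)

    with wave.open(dst_path, "wb") as dst:
        # Reuse channels, sample width and rate; the frame count in the
        # header is corrected automatically when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)

    return keep
```

For example, `trim_tail("output.wav", "output_trimmed.wav", 0.3)` would write a copy of a hypothetical `output.wav` with its last 0.3 seconds removed.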

Language
Default voices
Example texts