We just released Maya-1-Voice, an open-source voice AI model with voice design and emotions.
Describe voices in natural language. Add 20+ emotions like <laugh>, <cry>, <whisper> inline. 3B parameters, production-ready, runs on a single GPU with vLLM.
Apache 2.0. Built on a Llama backbone, predicts SNAC codec tokens for real-time streaming.
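To make the inline emotion tags concrete, here is a minimal sketch of checking a script for tags before sending it to the model. It is plain Python and does not call the model. The tag set holds only the three tags named above, and `KNOWN_TAGS` and `find_emotion_tags` are hypothetical names made up for illustration, not part of any released API.

```python
import re

# Assumption: only the three tags named in the post; the full list of 20+ is not given here.
KNOWN_TAGS = {"laugh", "cry", "whisper"}

# Matches inline tags written as <name>, e.g. <laugh>.
TAG_PATTERN = re.compile(r"<([a-z_]+)>")


def find_emotion_tags(text):
    """Return (known, unknown) lists of inline tag names, in order of appearance."""
    known, unknown = [], []
    for name in TAG_PATTERN.findall(text):
        if name in KNOWN_TAGS:
            known.append(name)
        else:
            unknown.append(name)
    return known, unknown


script = "It actually worked <laugh> okay, keep it quiet <whisper> nobody knows yet."
known, unknown = find_emotion_tags(script)
print("known tags:", known)
print("unknown tags:", unknown)
```

A check like this catches typos in tags before synthesis, so a misspelled tag does not end up being read aloud as text.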
fine-tuning a 14B model with TRL + SFT on a free Colab (T4 GPU)? thanks to the latest TRL optimizations, you actually can! sharing a new notebook showing how to do it 😎
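for a sense of why this is notable, here is a rough back-of-the-envelope sketch of weight memory alone for a 14B model at different precisions, next to the T4's nominal 16 GB. the post doesn't say which optimizations the notebook uses, so the precisions below are illustrative assumptions, and the estimate ignores activations, gradients and optimizer state.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Memory needed to hold the weights only, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


PARAMS = 14e9        # 14B parameters, from the post
T4_VRAM_GIB = 16     # nominal T4 memory; the usable amount on Colab is somewhat lower

# Illustrative precisions (assumption: not taken from the notebook).
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    gib = weight_memory_gib(PARAMS, bits)
    verdict = "fits" if gib < T4_VRAM_GIB else "does not fit"
    print(f"{label}: {gib:.1f} GiB of weights, {verdict} in {T4_VRAM_GIB} GiB")
```

at 16-bit the weights alone exceed the card, so fitting training on a T4 depends on memory-saving techniques of some kind.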