--- language: - ar - be - bg - bn - cs - cy - da - de - el - en - es - et - fa - fi - fr - gl - hi - hu - it - ja - ka - lt - lv - mk - mr - nl - pl - pt - ro - ru - sk - sl - sr - sv - sw - ta - th - tr - uk - ur - vi - zh library_name: transformers license: mit metrics: - bleu pipeline_tag: audio-text-to-text --- Test ultravox model. More coming soon... I hope so. ```python import transformers import numpy as np import librosa pipe = transformers.pipeline(model='AtAndDev/UVOX-96k-Llama-3.2-3B-Instruct', trust_remote_code=True, device="cuda") path = "voice_input.mp3" audio, sr = librosa.load(path, sr=16000) turns = [] pipe({'audio': audio, 'turns': turns, 'sampling_rate': sr}, max_new_tokens=100) ```