fixie-ai/ultravox-v0_5-llama-3_3-70b

patricklifixie committed on Sep 12

Commit adbc420 · verified · 1 parent: 09b0e5d

Update README.md

Files changed (1)

README.md +0 -6

@@ -125,12 +125,6 @@ Supervised speech instruction finetuning via knowledge-distillation. For more in
 - **Training regime:** BF16 mixed precision training
 - **Hardward used:** 8x H100 GPUs
-#### Speeds, Sizes, Times
-The current version of Ultravox, when invoked with audio content, has a time-to-first-token (TTFT) of approximately 150ms, and a tokens-per-second rate of ~50-100 when using an A100-40GB GPU, all using a Llama 3.3 70B backbone.
-Check out the audio tab on [TheFastest.ai](https://thefastest.ai/?m=audio) for daily benchmarks and a comparison with other existing models.
 ## Evaluation
 |  | Ultravox 0.4 70B | Ultravox 0.4.1 70B | **Ultravox 0.5 70B** |
