<html>
<head>
  <meta charset="utf-8" />
  <meta name="viewport" content="width=device-width" />
  <title>My static Space</title>
  <link rel="stylesheet" href="style.css" />
</head>
<body>
  <div class="card">
    <h1>Hugging Face for Audio: Resources ✨</h1>
    <br>
    <br>
    <b>Audio transformers course</b>: https://huggingface.co/learn/audio-course/chapter0/introduction#course-structure covers the standard tasks (ASR, TTS, audio classification), with notes on using pre-trained models and on fine-tuning. See also Unit 7 for a speaker diarization application.
    <br>
    <br>
    <h2>Using pre-trained models</h2>
    <ul>
      <li>With pipelines: https://www.reddit.com/r/MachineLearning/comments/16xshji/d_the_most_complete_audio_ml_toolkit/</li>
      <li>Transformers docs: https://huggingface.co/docs/transformers/model_doc/audio-spectrogram-transformer</li>
    </ul>
    <br>
    <br>
    <h2>Training</h2>
    <ul>
      <li>Datasets: https://huggingface.co/blog/audio-datasets</li>
      <li>Fine-tune Whisper for ASR: https://huggingface.co/blog/fine-tune-whisper</li>
      <li>Distil-Whisper for ASR (training code): https://github.com/huggingface/distil-whisper/tree/main/training</li>
      <li>Fine-tune VITS for TTS: https://twitter.com/yoachlacombe/status/1735348885369889264</li>
      <li>Fine-tune Wav2Vec2 for audio classification: https://github.com/huggingface/transformers/tree/main/examples/pytorch/audio-classification</li>
    </ul>
    <br>
    <br>
    <h2>Optimisation</h2>
    <ul>
      <li>Whisper JAX for ASR: https://github.com/sanchit-gandhi/whisper-jax</li>
      <li>Distil-Whisper for ASR: https://github.com/huggingface/distil-whisper/tree/main</li>
      <li>Insanely Fast Whisper for ASR: https://github.com/Vaibhavs10/insanely-fast-whisper</li>
      <li>Speculative decoding with Whisper for ASR: https://huggingface.co/blog/whisper-speculative-decoding</li>
      <li>Bark for TTS: https://huggingface.co/blog/optimizing-bark</li>
    </ul>
    <br>
    <br>
    <h2>Deployment</h2>
    <ul>
      <li>Inference Endpoints (e.g. for MusicGen): https://huggingface.co/blog/run-musicgen-as-an-api</li>
      <li>Gradio client: https://www.gradio.app/docs/client (e.g. for Whisper: https://huggingface.co/spaces/hf-audio/whisper-large-v3)</li>
    </ul>
  </div>
</body>
</html>