---
library_name: transformers
license: cc-by-4.0
datasets:
- badrex/ethiopian-speech-flat
---
<div align="center" style="line-height: 1;">
<h1>Automatic Speech Recognition for Wolaytta 🇪🇹</h1>
<a href="https://huggingface.co/datasets/badrex/ethiopian-speech-flat" target="_blank" style="margin: 2px;">
<img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Dataset-ffc107?color=ffca28&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
<a href="https://huggingface.co/spaces/badrex/Ethiopia-ASR" target="_blank" style="margin: 2px;">
<img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Space-ffc107?color=c62828&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
<a href="https://creativecommons.org/licenses/by/4.0/deed.en" style="margin: 2px;">
<img alt="License" src="https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg" style="display: inline-block; vertical-align: middle;"/>
</a>
</div>
## 🍇 Model Description
This is an Automatic Speech Recognition (ASR) model for Wolaytta, a language spoken in southern Ethiopia.
It was fine‑tuned from Wav2Vec2‑BERT 2.0 on the [Ethio speech corpus](https://huggingface.co/datasets/badrex/ethiopian-speech-flat).
- **Developed by:** Badr al-Absi
- **Model type:** Speech Recognition (ASR)
- **Languages:** Wolaytta
- **License:** CC-BY-4.0
- **Finetuned from:** facebook/w2v-bert-2.0
## 🎧 Direct Use
```python
from transformers import Wav2Vec2BertProcessor, Wav2Vec2BertForCTC
import torch
import torchaudio

# load the processor and the fine-tuned model
processor = Wav2Vec2BertProcessor.from_pretrained("badrex/w2v-bert-2.0-wolaytta-asr")
model = Wav2Vec2BertForCTC.from_pretrained("badrex/w2v-bert-2.0-wolaytta-asr")

# load the audio and resample to 16 kHz, the rate the model expects
audio, sr = torchaudio.load("audio.wav")
if sr != 16_000:
    audio = torchaudio.functional.resample(audio, sr, 16_000)

# extract input features
inputs = processor(audio.squeeze().numpy(), sampling_rate=16_000, return_tensors="pt")

# greedy CTC decoding
with torch.no_grad():
    logits = model(**inputs).logits
pred_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(pred_ids)[0]
print(transcription)
```
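For quick experiments, the model should also work with the high-level `pipeline` API, which takes care of feature extraction and CTC decoding internally. A minimal sketch, assuming the checkpoint loads as above:

```python
from transformers import pipeline

# the ASR pipeline wraps feature extraction, inference, and decoding
asr = pipeline(
    "automatic-speech-recognition",
    model="badrex/w2v-bert-2.0-wolaytta-asr",
)

# accepts a path to an audio file (or a raw 16 kHz numpy array)
result = asr("audio.wav")
print(result["text"])
```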
## 🔧 Downstream Use
- Voice assistants
- Accessibility tools
- Research baselines
## 🚫 Out‑of‑Scope Use
- Other languages besides Wolaytta
- High‑stakes deployments without human review
- Noisy audio without further tuning
## ⚠️ Risks & Limitations
Performance varies with accents, dialects, and recording quality.
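Before relying on the model for a specific domain, it is worth measuring word error rate (WER) on a small held-out set of your own recordings. A minimal sketch using the `jiwer` package; the reference and hypothesis strings below are placeholders:

```python
import jiwer

# placeholder examples: pair each reference transcript with the model's output
references = ["reference transcript one", "reference transcript two"]
hypotheses = ["model output one", "model output two"]

# corpus-level word error rate (lower is better)
print(f"WER: {jiwer.wer(references, hypotheses):.2%}")
```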
## 📌 Citation
```bibtex
@misc{w2v_bert_ethiopian_asr,
  author = {Badr M. Abdullah},
  title  = {Fine-tuning Wav2Vec2-BERT 2.0 for Ethiopian ASR},
  year   = {2025},
  url    = {https://huggingface.co/badrex/w2v-bert-2.0-wolaytta-asr}
}
```