---
library_name: transformers
license: cc-by-4.0
datasets:
- badrex/ethiopian-speech-flat
---

<div align="center" style="line-height: 1;">
  <h1>Automatic Speech Recognition for Wolaytta 🇪🇹</h1>
  <a href="https://huggingface.co/datasets/badrex/ethiopian-speech-flat" target="_blank" style="margin: 2px;">
    <img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Dataset-ffc107?color=ffca28&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
  </a>
  <a href="https://huggingface.co/spaces/badrex/Ethiopia-ASR" target="_blank" style="margin: 2px;">
    <img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Space-ffc107?color=c62828&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
  </a>
  <a href="https://creativecommons.org/licenses/by/4.0/deed.en" style="margin: 2px;">
    <img alt="License" src="https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg" style="display: inline-block; vertical-align: middle;"/>
  </a>
</div>



## 🍇 Model Description 

This is an Automatic Speech Recognition (ASR) model for Wolaytta, one of the official languages of Ethiopia.
It is fine‑tuned from Wav2Vec2‑BERT 2.0 using the [Ethio speech corpus](https://huggingface.co/datasets/badrex/ethiopian-speech-flat).

- **Developed by:** Badr al-Absi
- **Model type:** Speech Recognition (ASR)
- **Languages:** Wolaytta
- **License:** CC-BY-4.0
- **Finetuned from:** facebook/w2v-bert-2.0

## 🎧 Direct Use

``` python
import torch
import torchaudio
from transformers import Wav2Vec2BertProcessor, Wav2Vec2BertForCTC

model_id = "badrex/w2v-bert-2.0-wolaytta-asr"
processor = Wav2Vec2BertProcessor.from_pretrained(model_id)
model = Wav2Vec2BertForCTC.from_pretrained(model_id)

# Load the audio, downmix to mono, and resample to the 16 kHz rate the model expects
audio, sr = torchaudio.load("audio.wav")
audio = audio.mean(dim=0)
if sr != 16_000:
    audio = torchaudio.functional.resample(audio, sr, 16_000)

inputs = processor(audio.numpy(), sampling_rate=16_000, return_tensors="pt")

# Greedy CTC decoding: take the most likely token at every frame
with torch.no_grad():
    logits = model(**inputs).logits

pred_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(pred_ids)[0]

print(transcription)
```
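
The `argmax` and `batch_decode` steps above amount to greedy CTC decoding: take the best token per frame, merge consecutive repeats, then drop the blank symbol. The sketch below illustrates that idea on a toy example. The function name `ctc_greedy_collapse`, the vocabulary, and the blank ID are all hypothetical and chosen for illustration; the model's actual vocabulary ships with its processor, which also handles details such as word delimiters that this sketch ignores.

```python
from itertools import groupby


def ctc_greedy_collapse(frame_ids, id_to_token, blank_id=0):
    """Merge consecutive repeated IDs, then remove CTC blanks."""
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    return "".join(id_to_token[i] for i in collapsed if i != blank_id)


# Hypothetical toy vocabulary (NOT the model's real vocabulary).
toy_vocab = {0: "", 1: "s", 2: "a", 3: "r", 4: "o"}

# One predicted ID per audio frame; 0 is the assumed blank.
frames = [1, 1, 0, 2, 2, 2, 0, 3, 0, 4, 4]
print(ctc_greedy_collapse(frames, toy_vocab))
```

A blank between two identical IDs keeps both characters, which is how CTC can emit doubled letters.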

## 🔧 Downstream Use

-   Voice assistants
-   Accessibility tools
-   Research baselines

## 🚫 Out‑of‑Scope Use

-   Languages other than Wolaytta
-   High‑stakes deployments without human review
-   Noisy audio without further tuning

## ⚠️ Risks & Limitations

Performance varies with accents, dialects, and recording quality.
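
One way to gauge this variation on your own data is to compute the word error rate (WER) over a small set of recordings with reference transcripts. The following is a minimal pure-Python sketch, assuming whitespace tokenization and no text normalization; the function name `word_error_rate` and the placeholder strings are illustrative and not part of this model's tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Placeholder strings standing in for a reference and a model transcription.
print(word_error_rate("a b c d", "a x c"))
```

Averaging this score separately per speaker group or recording condition shows where the model is weakest.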



## 📌 Citation

``` bibtex
@misc{w2v_bert_ethiopian_asr,
  author = {Badr M. Abdullah},
  title = {Fine-tuning Wav2Vec2-BERT 2.0 for Ethiopian ASR},
  year = {2025},
  url = {https://huggingface.co/badrex/w2v-bert-2.0-wolaytta-asr}
}
```