	---
	library_name: transformers
	license: cc-by-4.0
	datasets:
	- badrex/ethiopian-speech-flat
	---

	<div align="center" style="line-height: 1;">
	<h1>Automatic Speech Recognition for Wolaytta 🇪🇹</h1>
	<a href="https://huggingface.co/datasets/badrex/ethiopian-speech-flat" target="_blank" style="margin: 2px;">
	<img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Dataset-ffc107?color=ffca28&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
	</a>
	<a href="https://huggingface.co/spaces/badrex/Ethiopia-ASR" target="_blank" style="margin: 2px;">
	<img alt="Hugging Face" src="https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Space-ffc107?color=c62828&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
	</a>
	<a href="https://creativecommons.org/licenses/by/4.0/deed.en" style="margin: 2px;">
	<img alt="License" src="https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg" style="display: inline-block; vertical-align: middle;"/>
	</a>
	</div>



	## 🍇 Model Description

	This is an Automatic Speech Recognition (ASR) model for Wolaytta, one of the official languages of Ethiopia.
	It is fine‑tuned from Wav2Vec2‑BERT 2.0 using the [Ethio speech corpus](https://huggingface.co/datasets/badrex/ethiopian-speech-flat).

	- Developed by: Badr al-Absi
	- Model type: Speech Recognition (ASR)
	- Languages: Wolaytta
	- License: CC-BY-4.0
	- Finetuned from: facebook/w2v-bert-2.0

	## 🎧 Direct Use

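	``` python
	from transformers import Wav2Vec2BertProcessor, Wav2Vec2BertForCTC
	import torchaudio, torch

	processor = Wav2Vec2BertProcessor.from_pretrained("badrex/w2v-bert-2.0-wolaytta-asr")
	model = Wav2Vec2BertForCTC.from_pretrained("badrex/w2v-bert-2.0-wolaytta-asr")

	# Load the recording and downmix to mono: (channels, samples) -> (samples,)
	audio, sr = torchaudio.load("audio.wav")
	audio = audio.mean(dim=0)

	# The processor expects 16 kHz input; resample if the file differs
	if sr != 16000:
	    audio = torchaudio.functional.resample(audio, orig_freq=sr, new_freq=16000)
	    sr = 16000

	inputs = processor(audio.numpy(), sampling_rate=sr, return_tensors="pt")

	# Inference only, so skip gradient tracking
	with torch.no_grad():
	    logits = model(**inputs).logits

	# Greedy CTC decoding: most likely token per frame, then collapse to text
	pred_ids = torch.argmax(logits, dim=-1)
	transcription = processor.batch_decode(pred_ids)[0]

	print(transcription)
	```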
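
The `argmax` plus `batch_decode` step in the snippet above amounts to greedy CTC decoding: take the most likely token per frame, merge consecutive repeats, then drop the blank token. The following is a minimal pure-Python sketch of that collapse rule, not the tokenizer's actual implementation. The function name `ctc_greedy_collapse`, the blank id `0`, and the toy vocabulary are all assumptions made for illustration; the real ids come from the model's tokenizer.

```python
from itertools import groupby


def ctc_greedy_collapse(frame_ids, blank_id=0):
    """Merge consecutive repeated ids, then remove the blank id."""
    # groupby yields one key per run of identical consecutive ids
    merged = [token_id for token_id, _ in groupby(frame_ids)]
    return [token_id for token_id in merged if token_id != blank_id]


# Toy vocabulary, purely illustrative (not the model's real vocabulary)
toy_vocab = {1: "a", 2: "b", 3: "c"}

# Per-frame argmax ids; the blank between the two runs of 1 keeps both "a"s
frames = [0, 1, 1, 0, 1, 2, 2, 0, 0, 3]
collapsed = ctc_greedy_collapse(frames)
text = "".join(toy_vocab[i] for i in collapsed)
```

The blank token is what lets CTC emit a doubled character: without the blank separating the two runs of `1`, they would merge into a single token.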

	## 🔧 Downstream Use

	- Voice assistants
	- Accessibility tools
	- Research baselines

	## 🚫 Out‑of‑Scope Use

	- Languages other than Wolaytta
	- High‑stakes deployments without human review
	- Noisy audio without further tuning

	## ⚠️ Risks & Limitations

	Performance varies with accents, dialects, and recording quality.
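
One way to gauge how much performance varies on your own recordings is to compare the model's transcriptions against reference transcripts using word error rate (WER). The sketch below is a plain-Python WER based on word-level edit distance. The function name `word_error_rate` is hypothetical, the strings are placeholders rather than real transcripts, and no text normalization (casing, punctuation) is applied, which a real evaluation would likely need.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference prefix seen so far
    # and the first j hypothesis words
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref_words)


# Placeholder strings: one substitution and one deletion over four words
wer = word_error_rate("w1 w2 w3 w4", "w1 wx w3")
```

Averaging this score separately per speaker group or recording condition can show where the model needs further tuning.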



	## 📌 Citation

	``` bibtex
	@misc{w2v_bert_ethiopian_asr,
	  author = {Badr M. Abdullah},
	  title = {Fine-tuning Wav2Vec2-BERT 2.0 for Ethiopian ASR},
	  year = {2025},
	  url = {https://huggingface.co/badrex/w2v-bert-2.0-wolaytta-asr}
	}
	```