---
library_name: transformers
license: cc-by-4.0
datasets:
- badrex/ethiopian-speech-flat
language:
- ti
metrics:
- wer
---

# Automatic Speech Recognition for Tigrinya

## 🍇 Model Description

This is an Automatic Speech Recognition (ASR) model for Tigrinya, an Afroasiatic language spoken primarily by the Tigrinya and Tigrayan peoples, who are native to Eritrea and the Tigray Region of Ethiopia. It is fine-tuned from Wav2Vec2-BERT 2.0 on the [Ethio speech corpus](https://huggingface.co/datasets/badrex/ethiopian-speech-flat).

- **Developed by:** Badr al-Absi
- **Model type:** Speech Recognition (ASR)
- **Languages:** Tigrinya
- **License:** CC-BY-4.0
- **Finetuned from:** facebook/w2v-bert-2.0

## 🎧 Direct Use

```python
import torch
import torchaudio
from transformers import Wav2Vec2BertProcessor, Wav2Vec2BertForCTC

processor = Wav2Vec2BertProcessor.from_pretrained("badrex/w2v-bert-2.0-tigrinya-asr")
model = Wav2Vec2BertForCTC.from_pretrained("badrex/w2v-bert-2.0-tigrinya-asr")

# Load the audio, mix it down to mono, and resample to the rate the processor expects
audio, sr = torchaudio.load("audio.wav")
audio = audio.mean(dim=0)
target_sr = processor.feature_extractor.sampling_rate
if sr != target_sr:
    audio = torchaudio.functional.resample(audio, sr, target_sr)

inputs = processor(audio.numpy(), sampling_rate=target_sr, return_tensors="pt")

# Greedy CTC decoding: pick the most likely token at each frame
with torch.no_grad():
    logits = model(**inputs).logits
pred_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(pred_ids)[0]
print(transcription)
```

## 🔧 Downstream Use

- Voice assistants
- Accessibility tools
- Research baselines

## 🚫 Out-of-Scope Use

- Languages other than Tigrinya
- High-stakes deployments without human review
- Noisy audio without further tuning

## ⚠️ Risks & Limitations

Performance varies with accents, dialects, and recording quality.

## 📌 Citation

```bibtex
@misc{w2v_bert_ethiopian_asr,
  author = {Badr M. Abdullah},
  title = {Fine-tuning Wav2Vec2-BERT 2.0 for Ethiopian ASR},
  year = {2025},
  url = {https://huggingface.co/badrex/w2v-bert-2.0-tigrinya-asr}
}
```
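
## 📏 Checking WER on Your Own Audio

Because performance varies with accents, dialects, and recording quality, it can help to measure word error rate (WER, the metric listed in this card's metadata) on a few of your own transcribed recordings before relying on the model. The sketch below is a minimal, dependency-free illustration and not the evaluation script used for this model: the function name `word_error_rate` is hypothetical, it tokenizes on whitespace only, and it applies no text normalization. The strings in the usage line are placeholders, not Tigrinya data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Placeholder strings: one substituted word out of four reference words
print(word_error_rate("a b c d", "a x c d"))  # 0.25
```

In practice, `hypothesis` would be the `transcription` string produced by the model and `reference` a human transcript of the same recording. WER can exceed 1.0 when the hypothesis contains many inserted words.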