ahmad walidurosyad

Upgrade Gradio to 4.44.1 to fix API schema generation error

a334e75 18 days ago

metadata

title: DiariZen Speaker Diarization
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.1
app_file: app.py
suggested_hardware: t4-small
pinned: false
license: mit

🎙️ DiariZen Speaker Diarization

High-performance speaker diarization using DiariZen from BUT-FIT.

Features

3 Models Available: WavLM Large (recommended), WavLM Base (faster), WavLM Large MLC (multilingual)
Simple Interface: Upload audio → Select model → Run → Download RTTM
High Performance: Lower DER than Pyannote v3.1 on AMI-SDM, VoxConverse, and AISHELL-4 (see Performance)
GPU Accelerated: Runs on a Hugging Face Spaces GPU (suggested hardware: t4-small)

Performance

DiariZen reports a lower diarization error rate (DER) than Pyannote v3.1 on each of these benchmarks:

AMI-SDM: 13.9% DER (vs 22.4% Pyannote v3.1)
VoxConverse: 9.1% DER (vs 11.3% Pyannote v3.1)
AISHELL-4: 10.1% DER (vs 12.2% Pyannote v3.1)
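
To put these figures in relative terms, here is a minimal sketch that computes the relative DER reduction per benchmark. The DER values are copied from the list above; the helper name `relative_reduction` is hypothetical and not part of DiariZen.

```python
# DER values (percent) copied from the Performance list:
# benchmark -> (DiariZen, Pyannote v3.1)
DER = {
    "AMI-SDM": (13.9, 22.4),
    "VoxConverse": (9.1, 11.3),
    "AISHELL-4": (10.1, 12.2),
}


def relative_reduction(new: float, baseline: float) -> float:
    """Relative error reduction of `new` over `baseline`, in percent (hypothetical helper)."""
    return (baseline - new) / baseline * 100.0


# Rounded to one decimal place for readability.
reductions = {
    name: round(relative_reduction(diarizen, pyannote), 1)
    for name, (diarizen, pyannote) in DER.items()
}

for name, value in reductions.items():
    print(f"{name}: {value}% relative DER reduction")
```

On these numbers the relative reduction is largest on AMI-SDM (about 38%) and smaller on VoxConverse and AISHELL-4 (roughly 17 to 20%).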

Usage

Upload an audio file or record one
Select a diarization model
Click "Run Diarization"
View the results and download the RTTM file
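
Once the RTTM file is downloaded, it can be post-processed with a few lines of Python. The sketch below assumes the standard space-separated RTTM layout (`SPEAKER <file> <channel> <start> <duration> <NA> <NA> <speaker> <NA> <NA>`); the sample lines, speaker labels, and the names `Segment`, `parse_rttm`, and `speaking_time` are made up for illustration and are not output from this Space.

```python
from collections import defaultdict
from typing import NamedTuple


class Segment(NamedTuple):
    file_id: str
    start: float     # segment onset in seconds
    duration: float  # segment length in seconds
    speaker: str


def parse_rttm(text: str) -> list[Segment]:
    """Parse SPEAKER lines of an RTTM document; other lines are skipped."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        segments.append(
            Segment(fields[1], float(fields[3]), float(fields[4]), fields[7])
        )
    return segments


def speaking_time(segments: list[Segment]) -> dict[str, float]:
    """Total speaking time per speaker label, in seconds."""
    totals: dict[str, float] = defaultdict(float)
    for seg in segments:
        totals[seg.speaker] += seg.duration
    return dict(totals)


# Illustrative RTTM content (invented, not produced by the Space).
sample = """\
SPEAKER meeting 1 0.00 2.50 <NA> <NA> spk0 <NA> <NA>
SPEAKER meeting 1 2.50 1.25 <NA> <NA> spk1 <NA> <NA>
SPEAKER meeting 1 4.00 3.00 <NA> <NA> spk0 <NA> <NA>
"""

segments = parse_rttm(sample)
totals = speaking_time(segments)
print(totals)
```

To use it on a real result, read the downloaded file with `open(path).read()` and pass the text to `parse_rttm`.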

Technical Details

This Space uses a custom Dockerfile to install DiariZen with all its dependencies:

PyTorch 2.1.1 with CUDA 12.1
DiariZen toolkit with git submodules
Bundled pyannote-audio (custom version)
FFmpeg for audio processing
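
When reproducing this setup outside the Space, a quick sanity check of the listed dependencies can help. The following is a sketch only: the function name `check_environment` is hypothetical, and it checks just the FFmpeg binary and the PyTorch/CUDA runtime, not the DiariZen or pyannote-audio installation.

```python
import importlib.util
import shutil


def check_environment() -> dict:
    """Report whether FFmpeg and PyTorch/CUDA are visible (hypothetical helper)."""
    report = {"ffmpeg": shutil.which("ffmpeg") is not None}
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this interpreter.
        report["torch"] = None
        report["cuda"] = False
    else:
        import torch

        report["torch"] = torch.__version__  # the Space lists 2.1.1
        report["cuda"] = torch.cuda.is_available()
    return report


report = check_environment()
print(report)
```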

Citation

@inproceedings{diariZen2024,
  title={DiariZen: A toolkit for speaker diarization},
  author={Han, Ivo and Landini, Federico and Burget, Lukáš and Černocký, Jan},
  booktitle={INTERSPEECH},
  year={2024}
}

Source

DiariZen: https://github.com/BUTSpeechFIT/DiariZen
License: MIT (Code) | Research/Non-commercial (Models)