Automatic Speech Recognition
pyannote.audio
pyannote
pyannote-audio-pipeline
audio
voice
speech
speaker
speaker-diarization
speaker-change-detection
voice-activity-detection
overlapped-speech-detection
Instructions to use G-Root/speaker-diarization-optimized with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- pyannote.audio
How to use G-Root/speaker-diarization-optimized with pyannote.audio:
from pyannote.audio import Pipeline
pipeline = Pipeline.from_pretrained("G-Root/speaker-diarization-optimized")

# inference on the whole file
pipeline("file.wav")

# inference on an excerpt
from pyannote.core import Segment
excerpt = Segment(start=2.0, end=5.0)

from pyannote.audio import Audio
waveform, sample_rate = Audio().crop("file.wav", excerpt)
pipeline({"waveform": waveform, "sample_rate": sample_rate})

- Notebooks
- Google Colab
- Kaggle
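
The pipeline call in the pyannote.audio usage snippet returns speaker turns, and a common next step is writing them out in RTTM format. The sketch below shows only that formatting step, using the standard library. The helper name `turns_to_rttm`, the `uri` value, and the example turns are hypothetical and made up for illustration; they are not output from this model.

```python
from typing import Iterable, List, Tuple

# (start in seconds, end in seconds, speaker label)
Turn = Tuple[float, float, str]


def turns_to_rttm(turns: Iterable[Turn], uri: str = "file") -> List[str]:
    """Format speaker turns as RTTM lines, one SPEAKER record per turn."""
    lines = []
    for start, end, speaker in turns:
        duration = end - start
        if duration <= 0:
            continue  # skip empty or inverted turns
        lines.append(
            f"SPEAKER {uri} 1 {start:.3f} {duration:.3f} <NA> <NA> {speaker} <NA> <NA>"
        )
    return lines


# Hypothetical turns for illustration only (not real model output).
example_turns = [(0.0, 2.5, "SPEAKER_00"), (2.5, 4.0, "SPEAKER_01")]
rttm_lines = turns_to_rttm(example_turns, uri="file")
for line in rttm_lines:
    print(line)
```

With a real result, and assuming the pipeline returns a `pyannote.core.Annotation` as pyannote.audio 3.x diarization pipelines do, the turns could be collected with `[(seg.start, seg.end, label) for seg, _, label in diarization.itertracks(yield_label=True)]` before being passed to the helper.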
version: 3.1.0

pipeline:
  name: pyannote.audio.pipelines.SpeakerDiarization
  params:
    clustering: AgglomerativeClustering
    embedding: hbredin/wespeaker-voxceleb-resnet34-LM
    embedding_batch_size: 32
    embedding_exclude_overlap: true
    segmentation: pyannote/segmentation-3.0
    segmentation_batch_size: 32
    segmentation_step: 0.5

params:
  clustering:
    method: centroid
    min_cluster_size: 12
    threshold: 0.7045654963945799
  segmentation:
    min_duration_off: 0.0
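
To make the `segmentation_step: 0.5` setting concrete: in pyannote.audio the segmentation model runs on a window that slides over the audio, and the step is given as a ratio of the window duration. The sketch below counts window start times under that reading. The 10-second window is an assumption about the segmentation model, the function name `window_starts` is hypothetical, and the trailing partial window is ignored for simplicity, which may differ from what the library actually does.

```python
from typing import List


def window_starts(
    total_duration: float,
    window: float = 10.0,      # assumed window length in seconds
    step_ratio: float = 0.5,   # segmentation_step from the config
) -> List[float]:
    """Return start times of full sliding windows over the audio."""
    step = step_ratio * window
    starts = []
    i = 0
    # Multiply by the index instead of accumulating to limit float drift.
    while i * step + window <= total_duration:
        starts.append(i * step)
        i += 1
    return starts


half_overlap = window_starts(30.0, step_ratio=0.5)
dense_overlap = window_starts(30.0, step_ratio=0.1)
print(len(half_overlap), len(dense_overlap))
```

Under these assumptions, a ratio of 0.5 means consecutive windows overlap by half, so a 30-second file needs far fewer segmentation passes than it would with a ratio of 0.1. That likely trades some temporal smoothing for speed.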