Pushing the Frontier of Audiovisual Perception with Large-Scale Multimodal Correspondence Learning Paper • 2512.19687 • Published 11 days ago • 1
Meta CLIP 1 Collection Scaling CLIP data with transparent training distribution from an end-to-end pipeline. • 7 items • Updated Nov 24, 2025 • 21
USAD: Universal Speech and Audio Representation via Distillation Paper • 2506.18843 • Published Jun 23, 2025 • 12 • 1
DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT Paper • 2110.01900 • Published Oct 5, 2021
SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model Paper • 2210.00705 • Published Oct 3, 2022
USAD models Collection USAD: Universal Speech and Audio Representation via Distillation • 4 items • Updated Jun 24, 2025 • 1