D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI Paper • 2510.05684 • Published Oct 7 • 139
Seeing Voices: Generating A-Roll Video from Audio with Mirage Paper • 2506.08279 • Published Jun 9 • 27
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 646
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated Jul 10 • 345