Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm Paper • 2511.04570 • Published 19 days ago • 198
Ovi: Twin Backbone Cross-Modal Fusion for Audio-Video Generation Paper • 2510.01284 • Published Sep 30 • 32
Taming Text-to-Sounding Video Generation via Advanced Modality Condition and Interaction Paper • 2510.03117 • Published Oct 3 • 11