Playing with Transformer at 30+ FPS via Next-Frame Diffusion Paper • 2506.01380 • Published Jun 2
LiveVLM: Efficient Online Video Understanding via Streaming-Oriented KV Cache and Retrieval Paper • 2505.15269 • Published May 21
Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation Paper • 2505.22647 • Published May 28
TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models Paper • 2506.03099 • Published Jun 3