ConsistEdit: Highly Consistent and Precise Training-free Visual Editing • Paper • 2510.17803 • Published 11 days ago • 12
LazyDrag: Enabling Stable Drag-Based Editing on Multi-Modal Diffusion Transformers via Explicit Correspondence • Paper • 2509.12203 • Published Sep 15 • 19
Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis • Paper • 2211.14506 • Published Nov 26, 2022 • 1
UniVerse-1: Unified Audio-Video Generation via Stitching of Experts • Paper • 2509.06155 • Published Sep 7 • 13
RecA • Collection • Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Learning! • 8 items • Updated Sep 22 • 12
Learnable SMPLify: A Neural Solution for Optimization-Free Human Pose Inverse Kinematics • Paper • 2508.13562 • Published Aug 19 • 4
Motion2Motion: Cross-topology Motion Transfer with Sparse Correspondence • Paper • 2508.13139 • Published Aug 18 • 4
Training-Free Text-Guided Color Editing with Multi-Modal Diffusion Transformer • Paper • 2508.09131 • Published Aug 12 • 16
SpeakerVid-5M: A Large-Scale High-Quality Dataset for Audio-Visual Dyadic Interactive Human Generation • Paper • 2507.09862 • Published Jul 14 • 49
MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model • Paper • 2404.19759 • Published Apr 30, 2024 • 27