WithAnyone: Towards Controllable and ID Consistent Image Generation Paper • 2510.14975 • Published 18 days ago • 80
VISTA: A Test-Time Self-Improving Video Generation Agent Paper • 2510.15831 • Published 17 days ago • 20
Latent Diffusion Model without Variational Autoencoder Paper • 2510.15301 • Published 18 days ago • 48
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published 17 days ago • 86
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks Paper • 2510.15019 • Published 18 days ago • 62
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published 21 days ago • 160
TAG: Tangential Amplifying Guidance for Hallucination-Resistant Diffusion Sampling Paper • 2510.04533 • Published 29 days ago • 47
Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs Paper • 2510.09201 • Published 25 days ago • 47
Reinforcing Diffusion Models by Direct Group Preference Optimization Paper • 2510.08425 • Published 25 days ago • 11
Self-Forcing++: Towards Minute-Scale High-Quality Video Generation Paper • 2510.02283 • Published Oct 2 • 91
ACON: Optimizing Context Compression for Long-horizon LLM Agents Paper • 2510.00615 • Published Oct 1 • 31
OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models Paper • 2509.17627 • Published Sep 22 • 65
A Vision-Language-Action-Critic Model for Robotic Real-World Reinforcement Learning Paper • 2509.15937 • Published Sep 19 • 20
Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification Paper • 2509.15591 • Published Sep 19 • 45
Kling-Avatar: Grounding Multimodal Instructions for Cascaded Long-Duration Avatar Animation Synthesis Paper • 2509.09595 • Published Sep 11 • 48
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning Paper • 2509.08519 • Published Sep 10 • 126
TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis Paper • 2508.13618 • Published Aug 19 • 17