Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published 14 days ago • 159
InstructX: Towards Unified Visual Editing with MLLM Guidance Paper • 2510.08485 • Published 18 days ago • 16
OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models Paper • 2509.17627 • Published Sep 22 • 65
MUSAR: Exploring Multi-Subject Customization from Single-Subject Dataset via Attention Routing Paper • 2505.02823 • Published May 5 • 5
Have we unified image generation and understanding yet? An empirical study of GPT-4o's image generation ability Paper • 2504.08003 • Published Apr 9 • 49
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation Paper • 2503.16660 • Published Mar 20 • 72