Self-Evaluation Unlocks Any-Step Text-to-Image Generation Paper • 2512.22374 • Published 7 days ago • 14
MC#: Mixture Compressor for Mixture-of-Experts Large Models Paper • 2510.10962 • Published Oct 13, 2025
Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models Paper • 2512.20557 • Published 10 days ago • 48
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13, 2025 • 176
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published Oct 17, 2025 • 89
MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO Paper • 2505.13031 • Published May 19, 2025 • 4
EmbRACE-3K: Embodied Reasoning and Action in Complex Environments Paper • 2507.10548 • Published Jul 14, 2025 • 36
LongLive: Real-time Interactive Long Video Generation Paper • 2509.22622 • Published Sep 26, 2025 • 184
Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation Paper • 2507.08441 • Published Jul 11, 2025 • 61
UniTok: A Unified Tokenizer for Visual Generation and Understanding Paper • 2502.20321 • Published Feb 27, 2025 • 30
TEXGen: a Generative Diffusion Model for Mesh Textures Paper • 2411.14740 • Published Nov 22, 2024 • 17
Image Inpainting via Iteratively Decoupled Probabilistic Modeling Paper • 2212.02963 • Published Dec 6, 2022
Is synthetic data from generative models ready for image recognition? Paper • 2210.07574 • Published Oct 14, 2022
Towards Efficient and Scale-Robust Ultra-High-Definition Image Demoireing Paper • 2207.09935 • Published Jul 20, 2022