The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding Paper • 2512.19693 • Published 3 days ago • 60
Veila: Panoramic LiDAR Generation from a Monocular RGB Image Paper • 2508.03690 • Published Aug 5
LongVie 2: Multimodal Controllable Ultra-Long Video World Model Paper • 2512.13604 • Published 10 days ago • 70
OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation Paper • 2512.08294 • Published 17 days ago • 17
CoS: Chain-of-Shot Prompting for Long Video Understanding Paper • 2502.06428 • Published Feb 10 • 10
FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model Paper • 2507.01953 • Published Jul 2 • 18
LongVie: Multimodal-Guided Controllable Ultra-Long Video Generation Paper • 2508.03694 • Published Aug 5 • 51
SpineBench: A Clinically Salient, Level-Aware Benchmark Powered by the SpineMed-450k Corpus Paper • 2510.03160 • Published Oct 3 • 4
DiverseAR: Boosting Diversity in Bitwise Autoregressive Image Generation Paper • 2512.02931 • Published 23 days ago
EditThinker: Unlocking Iterative Reasoning for Any Image Editor Paper • 2512.05965 • Published 20 days ago • 38