Video-As-Prompt: Unified Semantic Control for Video Generation • Paper • 2510.20888 • Published 5 days ago • 41
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning • Paper • 2510.19338 • Published 7 days ago • 98
Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1 • Paper • 2510.19600 • Published 7 days ago • 65
Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence • Paper • 2510.20579 • Published 6 days ago • 50
LightMem: Lightweight and Efficient Memory-Augmented Generation • Paper • 2510.18866 • Published 7 days ago • 105
QueST: Incentivizing LLMs to Generate Difficult Problems • Paper • 2510.17715 • Published 8 days ago • 31
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM • Paper • 2510.15870 • Published 11 days ago • 80
WithAnyone: Towards Controllable and ID Consistent Image Generation • Paper • 2510.14975 • Published 12 days ago • 79
When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA • Paper • 2510.04849 • Published 23 days ago • 108
The Art of Scaling Reinforcement Learning Compute for LLMs • Paper • 2510.13786 • Published 13 days ago • 30
UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE • Paper • 2510.13344 • Published 14 days ago • 61
OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs • Paper • 2510.10689 • Published 16 days ago • 46
Diffusion Transformers with Representation Autoencoders • Paper • 2510.11690 • Published 15 days ago • 160
UniVideo: Unified Understanding, Generation, and Editing for Videos • Paper • 2510.08377 • Published 20 days ago • 67
VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning • Paper • 2510.08555 • Published 19 days ago • 62
Cache-to-Cache: Direct Semantic Communication Between Large Language Models • Paper • 2510.03215 • Published 25 days ago • 93