Latent Diffusion Model without Variational Autoencoder Paper • 2510.15301 • Published 13 days ago • 47
D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI Paper • 2510.05684 • Published 23 days ago • 133
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation Paper • 2510.08673 • Published 21 days ago • 120
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published 24 days ago • 455
Seedream 4.0: Toward Next-generation Multimodal Image Generation Paper • 2509.20427 • Published Sep 24 • 76
OnePiece: Bringing Context Engineering and Reasoning to Industrial Cascade Ranking System Paper • 2509.18091 • Published Sep 22 • 33
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning Paper • 2509.02544 • Published Sep 2 • 123
Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing Paper • 2509.08721 • Published Sep 10 • 673
Wan-Animate: Unified Character Animation and Replacement with Holistic Replication Paper • 2509.14055 • Published Sep 17 • 14
Hunyuan3D Studio: End-to-End AI Pipeline for Game-Ready 3D Asset Generation Paper • 2509.12815 • Published Sep 16 • 38
ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization Paper • 2509.13313 • Published Sep 16 • 77
WebResearcher: Unleashing unbounded reasoning capability in Long-Horizon Agents Paper • 2509.13309 • Published Sep 16 • 67
Towards General Agentic Intelligence via Environment Scaling Paper • 2509.13311 • Published Sep 16 • 70
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning Paper • 2509.08755 • Published Sep 10 • 56