Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight Paper • 2511.16175 • Published Nov 20, 2025
VLABench: A Large-Scale Benchmark for Language-Conditioned Robotics Manipulation with Long-Horizon Reasoning Tasks Paper • 2412.18194 • Published Dec 24, 2024
LoHoVLA: A Unified Vision-Language-Action Model for Long-Horizon Embodied Tasks Paper • 2506.00411 • Published May 31, 2025
WorldVLA: Towards Autoregressive Action World Model Paper • 2506.21539 • Published Jun 26, 2025