Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models • Paper • 2412.14058 • Published Dec 18, 2024 • 1
MM-RLHF: The Next Step Forward in Multimodal LLM Alignment • Paper • 2502.10391 • Published Feb 14, 2025 • 34
BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models • Paper • 2506.07961 • Published Jun 9, 2025 • 11