AffordanceVLA: A Vision-Language-Action Model Empowering Action Generation through Affordance-Aware Understanding
Paper • 2606.06155 • Published • 8
nlu
LoomVideo: Unifying Multimodal Inputs into Video Generation and Editing