metadata
base_model:
- Qwen/Qwen3-VL-8B-Instruct
datasets:
- xashru/sphinx
license: apache-2.0
pipeline_tag: image-text-to-text
library_name: transformers
This model is released alongside the paper SPHINX: A Synthetic Environment for Visual Perception and Reasoning. It is trained on the SPHINX training split using Verl with GRPO.
For code and more details, see the GitHub repository.