---
license: apache-2.0
datasets:
- lmms-lab/LLaVA-Video-178K
---

[VideoRoPE: What Makes for Good Video Rotary Position Embedding?](https://arxiv.org/pdf/2502.05173)

**Trained model:** Qwen2VL Vision Tower + Qwen2 Language Model

**RoPE type:** Vanilla RoPE

To use this model, set `which_type='vanilla_rope'` and `scale_factor=1.0`. For more details, please refer to the [code implementation](https://github.com/Wiselnn570/VideoRoPE).
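
For intuition about what the two settings select, here is a minimal pure-Python sketch of vanilla (1D) RoPE. It is not the repository's implementation: the function `vanilla_rope`, the `settings` dict, the `base` default of 10000, and the pairing of adjacent dimensions are illustrative choices, and treating `scale_factor` as a multiplier on the position index is an assumption (with `scale_factor=1.0` the positions are left unchanged either way).

```python
import math

def vanilla_rope(vec, position, scale_factor=1.0, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles (1D RoPE).

    Hypothetical helper for illustration only; see the VideoRoPE repository
    for the actual code path selected by `which_type='vanilla_rope'`.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("vector length must be even")
    # Assumption: scale_factor rescales the position index; 1.0 is a no-op.
    pos = position * scale_factor
    out = []
    for i in range(dim // 2):
        # Pair i rotates with frequency base^(-2i/dim).
        theta = pos * base ** (-2.0 * i / dim)
        x, y = vec[2 * i], vec[2 * i + 1]
        out.append(x * math.cos(theta) - y * math.sin(theta))
        out.append(x * math.sin(theta) + y * math.cos(theta))
    return out

# The two settings named in this card, collected in a plain dict.
settings = {"which_type": "vanilla_rope", "scale_factor": 1.0}
q = vanilla_rope([1.0, 0.0, 1.0, 0.0], position=3,
                 scale_factor=settings["scale_factor"])
```

Because each pair is only rotated, the vector norm is preserved, and the dot product between two rotated vectors depends only on the difference of their positions.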