Sa2VA-i: Improving Sa2VA Results with Consistent Training and Inference
Acknowledgement
We thank the Sa2VA authors for their contribution.
@article{sa2va,
  title={Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos},
  author={Yuan, Haobo and Li, Xiangtai and Zhang, Tao and Huang, Zilong and Xu, Shilin and Ji, Shunping and Tong, Yunhai and Qi, Lu and Feng, Jiashi and Yang, Ming-Hsuan},
  journal={arXiv preprint},
  year={2025}
}