ColQwen3-vlembed-base: Visual Retriever built by merging Qwen3-VL-2B-Instruct with Qwen3-VL-Embedding-2B using the ColBERT strategy

🙏🙏🙏 Why always 2B?

With my limited computing resources, I can currently only run some interesting experiments with 2B/4B models. 😝😝😝

Usage

This version should not be used directly: it is only the base version, intended for deterministic LoRA initialization.

This model is built by merging Qwen/Qwen3-VL-2B-Instruct with Qwen3-VL-Embedding-2B.
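
For readers unfamiliar with the ColBERT strategy named in the title, here is a minimal sketch of late-interaction (MaxSim) scoring, where each query token is matched to its most similar document token and the maxima are summed. It is an illustration in plain Python on toy vectors, not this model's code: the names `maxsim_score` and `rank_pages` and the example embeddings are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: List[Vector], doc_tokens: List[Vector]) -> float:
    """Late-interaction score (hypothetical helper).

    For every query token embedding, take the maximum dot product over all
    document token embeddings, then sum those maxima over the query tokens.
    """
    if not query_tokens or not doc_tokens:
        return 0.0
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


def rank_pages(query_tokens: List[Vector], pages: List[List[Vector]]) -> List[int]:
    """Return page indices sorted from highest to lowest MaxSim score."""
    scores = [maxsim_score(query_tokens, page) for page in pages]
    return sorted(range(len(pages)), key=lambda i: scores[i], reverse=True)


# Toy 2-dimensional embeddings (made up for illustration only).
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
page_b = [[0.5, 0.5], [0.6, 0.0]]

print(maxsim_score(query, page_a))
print(rank_pages(query, [page_b, page_a]))
```

In a real retriever the token vectors would come from the model's multi-vector output for a query and a page image; here they are hand-written so the scoring step can be read in isolation.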

Contact

Mungeryang: [email protected]/[email protected]

Acknowledgments

❤️❤️❤️

Thanks to the ColPali team and the Qwen team for their excellent open-source work! I accomplished this work by standing on the shoulders of giants~

Citation

If you use any datasets or models from this organization in your research, please cite the original ColPali paper as follows:

@misc{faysse2024colpaliefficientdocumentretrieval,
  title={ColPali: Efficient Document Retrieval with Vision Language Models}, 
  author={Manuel Faysse and Hugues Sibille and Tony Wu and Bilel Omrani and Gautier Viaud and Céline Hudelot and Pierre Colombo},
  year={2024},
  eprint={2407.01449},
  archivePrefix={arXiv},
  primaryClass={cs.IR},
  url={https://arxiv.org/abs/2407.01449}, 
}

Safetensors

Model size: 2B params
Tensor type: BF16
