VPT Models
Qwen2-VL models equipped with Visual Perception Tokens, along with models used in the training process.
7 items
This repository contains models based on the paper "Introducing Visual Perception Token into Multimodal Large Language Model". The models use Visual Perception Tokens to enhance the visual perception capabilities of multimodal large language models (MLLMs).