Update README.md
README.md
CHANGED
@@ -114,7 +114,7 @@ from transformers import AutoModel, AutoTokenizer
 model_path = 'OpenGVLab/VideoChat-Flash-Qwen2_5-2B_res448'
 
 tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
-model = AutoModel.from_pretrained(model_path, trust_remote_code=True).
+model = AutoModel.from_pretrained(model_path, trust_remote_code=True).to(torch.bfloat16).cuda()
 image_processor = model.get_vision_tower().image_processor
 
 mm_llm_compress = False # use the global compress or not
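
The updated line refers to `torch.bfloat16`, while the hunk context only shows an import from `transformers`. The sketch below wraps the same loading steps in a function with the `torch` import made explicit. It is a minimal illustration, not the README's own code: the helper name `load_videochat`, the `MODEL_PATH` constant, and the early CUDA check are assumptions added here, and whether the README already imports `torch` elsewhere is not visible in this diff.

```python
# Sketch only: mirrors the '+' line of the diff with the torch import made explicit.
# MODEL_PATH and load_videochat are hypothetical names used for illustration.
MODEL_PATH = "OpenGVLab/VideoChat-Flash-Qwen2_5-2B_res448"


def load_videochat(model_path: str = MODEL_PATH):
    """Return (tokenizer, model, image_processor) with the model in bfloat16 on the GPU."""
    # Heavy dependencies are imported lazily so importing this module stays cheap.
    import torch
    from transformers import AutoModel, AutoTokenizer

    if not torch.cuda.is_available():
        # Assumption: fail early with a clear message instead of letting .cuda() raise.
        raise RuntimeError("A CUDA device is required to call .cuda() on the model.")

    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_path, trust_remote_code=True)
    # Same conversion as the changed line: cast to bfloat16, then move to the GPU.
    model = model.to(torch.bfloat16).cuda()
    # get_vision_tower() comes from the model's custom code, as used in the README.
    image_processor = model.get_vision_tower().image_processor
    return tokenizer, model, image_processor
```

Calling `tokenizer, model, image_processor = load_videochat()` downloads the checkpoint on first use and requires accepting the repository's remote code via `trust_remote_code=True`.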