InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14 • 306
MMHU: A Massive-Scale Multimodal Benchmark for Human Behavior Understanding Paper • 2507.12463 • Published Jul 16 • 26
Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents Paper • 2507.04009 • Published Jul 5 • 51
SpeedSearcher-aicrowd/Llama-3.2-11B-Vision-Instruct-direct-v1 Image-to-Text • 11B • Updated Jun 12