SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7, 2025 • 202
Vision Language Models Quantization Collection Vision Language Models (VLMs) quantized by Neural Magic • 20 items • Updated Mar 4, 2025 • 6
MambaVision Collection MambaVision: A Hybrid Mamba-Transformer Vision Backbone. Includes both 1K and 21K pretrained models. • 13 items • Updated 10 days ago • 34
MoshiVis v0.1 Collection MoshiVis is a Vision Speech Model built as a perceptually augmented version of Moshi v0.1 for conversing about image inputs. • 9 items • Updated 10 days ago • 23
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM • Mar 12, 2025 • 480