Idefics3
Generate text based on an image and prompt
Media understanding
Identify objects in images using text queries
Generate text and segment images using PaliGemma
Annotate and describe images with text prompts
Segment and caption objects in images and videos
Analyze images to caption, detect objects, and extract text
Generate detailed image analyses and depth predictions
Generate detailed descriptions from images and questions
Generate descriptions and answers about images
Interact with a multimodal chatbot that analyzes images and text
Generate captions and analyze images with various tasks
Generate text from an image and question
Chat with Pixtral 12B using Mistral Inference
Interact with a chatbot that understands text and images
State-of-the-art Zero-shot Object Detection
Generate text by uploading images and asking questions
Generate text from images and queries
Generate text responses based on images and chat history
Paligemma2 Detection with Supervision
Generate text responses from images and text input
Visualize image depth, segmentation, and generation
A unified multimodal understanding and generation model