I Dream My Painting: Connecting MLLMs and Diffusion Models via Prompt Generation for Text-Guided Multi-Mask Inpainting Paper • 2411.19050 • Published Nov 28, 2024
Label Anything: Multi-Class Few-Shot Semantic Segmentation with Visual Prompts Paper • 2407.02075 • Published Jul 2, 2024
ArtSeek: Deep artwork understanding via multimodal in-context reasoning and late interaction retrieval Paper • 2507.21917 • Published Jul 29, 2025