Robustness in Both Domains: CLIP Needs a Robust Text Encoder • Paper • 2506.03355 • Published Jun 3 • 6
FuseLIP: Multimodal Embeddings via Early Fusion of Discrete Tokens • Paper • 2506.03096 • Published Jun 3 • 4
DASH: Detection and Assessment of Systematic Hallucinations of VLMs • Paper • 2503.23573 • Published Mar 30 • 12