VisR-Bench: An Empirical Study on Visual Retrieval-Augmented Generation for Multilingual Long Document Understanding Paper • 2508.07493 • Published Aug 10
MusiXQA: Advancing Visual Music Understanding in Multimodal Large Language Models Paper • 2506.23009 • Published Jun 28