RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning (arXiv:2510.02240, published Oct 2)
MANI-Pure: Magnitude-Adaptive Noise Injection for Adversarial Purification (arXiv:2509.25082, published Sep 29)
MELLA: Bridging Linguistic Capability and Cultural Groundedness for Low-Resource Language MLLMs (arXiv:2508.05502, published Aug 7)
TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs (arXiv:2507.21584, published Jul 29)
When Tokens Talk Too Much: A Survey of Multimodal Long-Context Token Compression across Images, Videos, and Audios (arXiv:2507.20198, published Jul 27)
HoliTom: Holistic Token Merging for Fast Video Large Language Models (arXiv:2505.21334, published May 27)
Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models (arXiv:2503.16257, published Mar 20)