RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning Paper • 2510.02240 • Published 26 days ago • 17
MANI-Pure: Magnitude-Adaptive Noise Injection for Adversarial Purification Paper • 2509.25082 • Published 29 days ago • 1
Towards Adversarial Robustness via Debiased High-Confidence Logit Alignment Paper • 2408.06079 • Published Aug 12, 2024
Poison as Cure: Visual Noise for Mitigating Object Hallucinations in LVMs Paper • 2501.19164 • Published Jan 31
MELLA: Bridging Linguistic Capability and Cultural Groundedness for Low-Resource Language MLLMs Paper • 2508.05502 • Published Aug 7 • 6
TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs Paper • 2507.21584 • Published Jul 29 • 10
When Tokens Talk Too Much: A Survey of Multimodal Long-Context Token Compression across Images, Videos, and Audios Paper • 2507.20198 • Published Jul 27 • 26
HoliTom: Holistic Token Merging for Fast Video Large Language Models Paper • 2505.21334 • Published May 27 • 21
Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models Paper • 2503.16257 • Published Mar 20 • 25
llava-hf/llava-onevision-qwen2-7b-ov-hf Image-Text-to-Text • 8B • Updated Jun 18 • 41.1k • 34