arXiv:2511.03774

Contamination Detection for VLMs using Multi-Modal Semantic Perturbation

Published on Nov 5 · Submitted by Mu Cai on Nov 7

AI-generated summary

A novel detection method based on multi-modal semantic perturbation is proposed to identify contaminated Vision-Language Models, demonstrating robustness across various contamination strategies.

Abstract

Recent advances in Vision-Language Models (VLMs) have achieved state-of-the-art performance on numerous benchmark tasks. However, the use of internet-scale, often proprietary, pretraining corpora raises a critical concern for both practitioners and users: inflated performance due to test-set leakage. While prior works have proposed mitigation strategies such as decontamination of pretraining data and benchmark redesign for LLMs, the complementary direction of developing detection methods for contaminated VLMs remains underexplored. To address this gap, we deliberately contaminate open-source VLMs on popular benchmarks and show that existing detection approaches either fail outright or exhibit inconsistent behavior. We then propose a novel simple yet effective detection method based on multi-modal semantic perturbation, demonstrating that contaminated models fail to generalize under controlled perturbations. Finally, we validate our approach across multiple realistic contamination strategies, confirming its robustness and effectiveness. The code and perturbed dataset will be released publicly.

Community

Paper submitter

We introduce Multi-modal Semantic Perturbation, a pipeline for creating perturbed benchmarks that can be used to detect data contamination in VLMs. The pipeline generates image-question pairs in which the original image composition is kept intact but modified slightly so that the answer changes. The perturbed benchmark has similar or lower difficulty than the original, so clean models that truly generalize should perform comparably or better. However, we find that contaminated models consistently underperform, with dramatic performance drops of up to 45%. By simply comparing performance on the two versions of a benchmark (see the sketch below), we show that we can reliably detect contaminated VLMs across varying training strategies, numbers of epochs, and model architectures and sizes, where existing approaches fail.
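For readers who want the detection step in code, the sketch below compares a model's accuracy on the original and perturbed versions of a benchmark and flags a large drop as a sign of contamination. It is a minimal illustration of the comparison described above, not the authors' released implementation; the function names, data layout, and the 10% drop threshold are assumptions chosen for clarity.

```python
# Minimal sketch of contamination detection by original-vs-perturbed comparison.
# All names and the drop threshold below are illustrative assumptions.

from typing import Any, Callable, Sequence


def accuracy(model: Callable[[Any], str], benchmark: Sequence[tuple[Any, str]]) -> float:
    """Fraction of image-question pairs the model answers correctly."""
    correct = sum(model(example) == answer for example, answer in benchmark)
    return correct / len(benchmark)


def detect_contamination(model, original_benchmark, perturbed_benchmark, drop_threshold=0.10):
    """Flag a model as likely contaminated if its accuracy drops sharply
    on the semantically perturbed benchmark relative to the original one."""
    acc_original = accuracy(model, original_benchmark)
    acc_perturbed = accuracy(model, perturbed_benchmark)
    drop = acc_original - acc_perturbed
    return {
        "acc_original": acc_original,
        "acc_perturbed": acc_perturbed,
        "drop": drop,
        "likely_contaminated": drop > drop_threshold,
    }
```

A clean, generalizing model should show little or no drop (the perturbed benchmark is no harder), whereas a model that memorized the original test set loses the benefit of memorization and its accuracy falls sharply.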
