---
license: llama3.1
datasets:
- yahma/alpaca-cleaned
base_model:
- meta-llama/Llama-3.1-8B-Instruct
---

# DataFilter

[![arXiv](https://img.shields.io/badge/arXiv-2510.19207-b31b1b.svg)](https://arxiv.org/abs/2510.19207)
[![HuggingFace](https://img.shields.io/badge/🤗-Model-yellow)](https://huggingface.co/JoyYizhu/DataFilter)

A defense system designed to protect LLM agent systems against prompt injection attacks. DataFilter provides robust protection while maintaining system utility and performance.

Code: https://github.com/yizhu-joy/DataFilter

## Quick Start

### Installation

```bash
conda create -n py312vllm python=3.12
conda activate py312vllm
pip install vllm pandas 'accelerate>=0.26.0'
git clone https://github.com/yizhu-joy/DataFilter.git
cd DataFilter
```

### Run DataFilter

Run the inference demo:

```bash
python filter_inference.py
```

## Citation

If you use DataFilter in your research, please cite our paper:

```bibtex
@misc{wang2025datafilter,
  title={Defending Against Prompt Injection with DataFilter},
  author={Yizhu Wang and Sizhe Chen and Raghad Alkhudair and Basel Alomair and David Wagner},
  year={2025},
  eprint={2510.19207},
  archivePrefix={arXiv},
  primaryClass={cs.CR},
  url={https://arxiv.org/abs/2510.19207},
}
```