ARENA: Adaptive-Rewarded Evidence Navigation Agent

This is the official model release from our paper:

Effective and Transparent RAG: Adaptive-Reward Reinforcement Learning for Decision Traceability

This model is part of the ARENA framework, which uses reinforcement learning with adaptive rewards to improve the reasoning ability and interpretability of retrieval-augmented generation (RAG).

For instructions on how to use the model and more implementation details, please refer to our GitHub repository:

👉 https://github.com/ren258/ARENA
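
Until you have followed the repository's instructions, the following is a minimal sketch of how a checkpoint like this is commonly loaded with the Hugging Face `transformers` library. It assumes the weights load through `AutoModelForCausalLM`, that `torch`, `transformers`, and `accelerate` are installed, and that the repository ID is `ren258/ARENA-Llama-8B`. The `build_prompt` helper and its layout are hypothetical and are not ARENA's actual prompt template; use the template from the GitHub repository for real results.

```python
MODEL_ID = "ren258/ARENA-Llama-8B"  # repository ID as shown on the model page


def build_prompt(question, passages):
    """Join numbered passages and a question into one prompt string.

    Hypothetical layout for illustration only; ARENA's actual prompt
    template is defined in the GitHub repository.
    """
    refs = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return f"References:\n{refs}\n\nQuestion: {question}\nAnswer:"


def generate_answer(question, passages, max_new_tokens=256):
    """Load the model in BF16 and generate a completion for the prompt."""
    # Heavy imports are kept local so build_prompt can be used without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the BF16 weights of the release
        device_map="auto",           # assumption: requires the accelerate package
    )
    inputs = tokenizer(build_prompt(question, passages), return_tensors="pt")
    inputs = inputs.to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_answer("...", ["passage one", "passage two"])` downloads the full 8B checkpoint on first use, so it is best run on a machine with a suitable GPU.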

Citation

If you find this work useful, please consider citing our paper:

@article{ren2025effective,
  title={Effective and Transparent RAG: Adaptive-Reward Reinforcement Learning for Decision Traceability},
  author={Ren, Jingyi and Xu, Yekun and Wang, Xiaolong and Li, Weitao and Ma, Weizhi and Liu, Yang},
  journal={arXiv preprint arXiv:2505.13258},
  year={2025}
}

Feel free to reach out via GitHub issues if you encounter any problems or have questions!

Model size: 8B params
Tensor type: BF16
Format: Safetensors

Model ID: ren258/ARENA-Llama-8B (2 quantized models are listed in its model tree)
