ARENA: Adaptive-Rewarded Evidence Navigation Agent

This is the official model release from our paper:

Effective and Transparent RAG: Adaptive-Reward Reinforcement Learning for Decision Traceability

This model is part of the ARENA framework, which uses reinforcement learning with adaptive rewards to improve the reasoning ability and interpretability of retrieval-augmented generation (RAG).

For instructions on how to use the model and more implementation details, please refer to our GitHub repository:

👉 https://github.com/ren258/ARENA
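
The repository above is the authoritative source for usage. As a rough orientation only, the sketch below shows how a checkpoint like this one might be loaded with Hugging Face Transformers. It assumes the model loads through the standard `AutoModelForCausalLM` and `AutoTokenizer` classes and that `accelerate` is installed for `device_map="auto"`. The `build_prompt` helper and its prompt layout are hypothetical and are not the format used in the paper; consult the repository for the actual prompt template.

```python
MODEL_ID = "ren258/ARENA-Llama-8B"


def build_prompt(question, passages):
    """Hypothetical prompt layout: numbered passages followed by the question.

    This is NOT the official ARENA template; see the GitHub repository for that.
    """
    numbered = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return f"References:\n{numbered}\n\nQuestion: {question}\nAnswer:"


def generate_answer(question, passages, max_new_tokens=256):
    """Load the model and generate an answer for one question (illustrative only)."""
    # Imported lazily so build_prompt can be used without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the BF16 weights of the release
        device_map="auto",           # assumption: requires the accelerate package
    )

    inputs = tokenizer(build_prompt(question, passages), return_tensors="pt")
    inputs = inputs.to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_answer("Who wrote Hamlet?", ["Hamlet is a tragedy by William Shakespeare."])` would download the weights on first use, so a GPU with enough memory for an 8B BF16 model is assumed.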

Citation

If you find this work useful, please consider citing our paper:

@article{ren2025effective,
  title={Effective and Transparent RAG: Adaptive-Reward Reinforcement Learning for Decision Traceability},
  author={Ren, Jingyi and Xu, Yekun and Wang, Xiaolong and Li, Weitao and Ma, Weizhi and Liu, Yang},
  journal={arXiv preprint arXiv:2505.13258},
  year={2025}
}

Feel free to reach out via GitHub issues if you encounter any problems or have questions!

Model format: Safetensors. Model size: 8B params. Tensor type: BF16.
