Model Description

This Memory Decoder model is trained on the biomedical domain and can be plugged into any model in the Qwen2 and Qwen2.5 families to improve its performance on biomedical text.

Paper: Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models

GitHub: https://github.com/LUMIA-Group/MemoryDecoder
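
Because every Qwen2/Qwen2.5 checkpoint shares the same tokenizer, the decoder can be paired with any of them at inference time by mixing next-token distributions. Below is a minimal sketch of that idea, assuming a kNN-LM-style interpolation: the weight `alpha`, the `AutoModelForCausalLM` loading path, and the vocabulary trimming are our assumptions, not the authors' interface (see the GitHub repository above for the real one).

```python
# Minimal sketch of plug-and-play inference with a Memory Decoder.
# Assumptions: kNN-LM-style probability mixing, hypothetical alpha value.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B", torch_dtype=torch.bfloat16)
memdec = AutoModelForCausalLM.from_pretrained("Clover-Hill/MemoryDecoder-Qwen-biomed", torch_dtype=torch.bfloat16)

alpha = 0.5  # interpolation weight (hypothetical value, tune per domain)

inputs = tokenizer("Pneumonia is an infection of the", return_tensors="pt")
with torch.no_grad():
    logits_base = base(**inputs).logits[:, -1, :]
    logits_mem = memdec(**inputs).logits[:, -1, :]

# Qwen checkpoints pad the embedding table to different sizes, so trim
# both distributions to the shared vocabulary before mixing.
V = min(logits_base.size(-1), logits_mem.size(-1))
p_base = torch.softmax(logits_base[:, :V], dim=-1)
p_mem = torch.softmax(logits_mem[:, :V], dim=-1)

p = alpha * p_mem + (1 - alpha) * p_base  # plug-and-play mixture
print(tokenizer.decode(p.argmax(dim=-1)))
```

The same small decoder is reused unchanged across every base model size, which is what the tables below measure.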

Training & Evaluation Data

Biomedical Domain Dataset: mimic_iii_diagnosis_anonymous

Test Split: MemoryDecoder-domain-data

Performance Results

Qwen2 Family

| Model | Base Model | Base + MemDec |
|------------|-----------:|--------------:|
| Qwen2-0.5B | 18.41 | 3.75 |
| Qwen2-1.5B | 12.42 | 3.68 |
| Qwen2-7B | 8.36 | 3.59 |
| Qwen2-72B | 6.15 | 3.45 |

Qwen2.5 Family

| Model | Base Model | Base + MemDec |
|--------------|-----------:|--------------:|
| Qwen2.5-0.5B | 17.01 | 3.74 |
| Qwen2.5-1.5B | 11.33 | 3.67 |
| Qwen2.5-3B | 9.70 | 3.63 |
| Qwen2.5-7B | 8.19 | 3.57 |
| Qwen2.5-14B | 7.01 | 3.51 |
| Qwen2.5-32B | 6.65 | 3.48 |
| Qwen2.5-72B | 5.90 | 3.44 |

Perplexity scores on the biomedical domain test set. Lower is better.
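
For context, here is a hedged sketch of how such perplexity numbers can be computed from the interpolated distribution, reusing `base`, `memdec`, `tokenizer`, and `alpha` from the sketch above. The single-pass, no-stride evaluation is a simplification of ours, not the authors' evaluation script.

```python
import torch

def interpolated_perplexity(base, memdec, tokenizer, text, alpha=0.5):
    """Perplexity under p = alpha * p_mem + (1 - alpha) * p_base (sketch)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        lb = base(ids).logits.float()
        lm = memdec(ids).logits.float()
    V = min(lb.size(-1), lm.size(-1))  # trim padded vocab, as above
    log_p = torch.log(
        alpha * torch.softmax(lm[..., :V], dim=-1)
        + (1 - alpha) * torch.softmax(lb[..., :V], dim=-1)
    )
    tgt = ids[:, 1:].unsqueeze(-1)            # next-token targets
    tok_lp = log_p[:, :-1, :].gather(-1, tgt).squeeze(-1)
    return torch.exp(-tok_lp.mean()).item()   # exp of mean NLL
```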

Citation

@article{cao2025memory,
  title={Memory decoder: A pretrained, plug-and-play memory for large language models},
  author={Cao, Jiaqi and Wang, Jiarui and Wei, Rubin and Guo, Qipeng and Chen, Kai and Zhou, Bowen and Lin, Zhouhan},
  journal={arXiv preprint arXiv:2508.09874},
  year={2025}
}

Contact

For questions and support: [email protected]

Base Model

This memory decoder is finetuned from Qwen/Qwen2.5-0.5B.