yangzhch6/Qwen2.5-Math-7B-L


yangzhch6 committed on Sep 30

Commit 981bf16 · verified · 1 parent: 9d7ca90

Update README.md

Files changed (1)

README.md +1 -19

@@ -4,22 +4,4 @@ library_name: transformers
 pipeline_tag: text-generation
 ---
-The base Qwen2.5-Math-7B model used by LUFFY, described in [Learning to Reason under Off-Policy Guidance](https://huggingface.co/papers/2504.14945).
-We change to rope_theta from 10000 to 40000 and extend the context window to 16k.
-Also, we modify the chat_template for the system prompt and add <think>.
-Github: https://github.com/ElliottYan/LUFFY
-# Citation
-If you find our model, data, or evaluation code useful, please kindly cite our paper:
-```bib
-@misc{luffy,
-      title={Learning to Reason under Off-Policy Guidance},
-      author={Jianhao Yan and Yafu Li and Zican Hu and Zhi Wang and Ganqu Cui and Xiaoye Qu and Yu Cheng and Yue Zhang},
-      year={2025},
-      eprint={2504.14945},
-      archivePrefix={arXiv},
-      primaryClass={cs.LG},
-      url={https://arxiv.org/abs/2504.14945},
-}
-```
+Following LUFFY, we change rope_theta from 10000 to 40000 and extend the context window to 16k.
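
For readers who want to see what the README change describes, here is a minimal sketch of the config edit. It assumes the change is made through the `rope_theta` and `max_position_embeddings` fields of the model's `config.json`, and that "16k" means 16384 tokens. The helper name `extend_context` and the starting values in `original` are illustrative assumptions, not taken from this commit.

```python
import json


def extend_context(config: dict, rope_theta: float = 40000.0,
                   max_positions: int = 16384) -> dict:
    """Return a copy of a model config with a larger RoPE base and context window.

    Hypothetical helper: the field names follow the Qwen2-style config.json,
    and 16384 is an assumed reading of "16k".
    """
    patched = dict(config)  # shallow copy so the input dict is left untouched
    patched["rope_theta"] = rope_theta
    patched["max_position_embeddings"] = max_positions
    return patched


# Stand-in for the relevant fields of the base model's config.json (values assumed).
original = {"rope_theta": 10000.0, "max_position_embeddings": 4096}
patched = extend_context(original)
print(json.dumps(patched, indent=2))
```

In practice the same two fields would be edited in the checkpoint's `config.json` before loading the model. Whether the model performs well at the longer length without further training is not something this sketch addresses.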