Improve model card with metadata, paper link, and detailed description
#1
opened by nielsr (HF Staff)
This PR improves the model card for the Co-rewarding-II: Llama-3.2-3B-Instruct model by:
- Linking it to the paper Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models.
- Adding `library_name: transformers` to enable the automated code snippet on the model page (an example of such a snippet is sketched after this list), based on the `config.json` file specifying `"model_type": "llama"` and `"transformers_version"`.
- Adding `pipeline_tag: text-generation` for better discoverability, as the model is a large language model focused on reasoning.
- Expanding the model description with details from the paper's abstract to provide better context about the Co-rewarding framework.
- The existing `license: mit` is retained as there's no evidence for a change; the resulting front matter is sketched below.
- A usage example is explicitly not added, as the provided GitHub README snippet's "Quick Start" section appears to be for a different model (`pae-llava-7b`) and not directly applicable to this Llama-based model.
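For reference, a minimal sketch of the model card's YAML front matter after these changes, assuming no other metadata fields are present:

```yaml
---
license: mit
library_name: transformers
pipeline_tag: text-generation
---
```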
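While no usage example is added to the card itself, for reviewers' reference this is roughly the kind of auto-generated snippet that `library_name: transformers` enables on the model page. The repository ID below is a placeholder, not the model's confirmed Hub path:

```python
from transformers import pipeline

# Placeholder repo ID: substitute the actual Hub path of this checkpoint.
generator = pipeline(
    "text-generation",
    model="<org>/Co-rewarding-II-Llama-3.2-3B-Instruct",
)

# Llama-3.2 Instruct checkpoints accept chat-style messages, which the
# pipeline formats with the model's chat template.
messages = [{"role": "user", "content": "What is 17 * 24? Reason step by step."}]
print(generator(messages, max_new_tokens=256)[0]["generated_text"])
```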
Please review and merge if everything looks good.