Improve model card: Add tags, paper link, and expanded description
#1
by nielsr (HF Staff) - opened
This PR significantly improves the model card for the Self-Certainty: Llama-3.2-3B-Instruct trained on DAPO-14k model by:
- Adding `library_name: transformers` to the metadata, enabling the automated "how to use" widget. Evidence for this is found in the `config.json` and `tokenizer_config.json` files, which indicate a Llama architecture and a `transformers_version` field.
- Adding `pipeline_tag: text-generation` to the metadata, improving discoverability for this large language model (see the metadata sketch after this list).
- Updating the model card content to include a direct link to the paper: Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models.
- Expanding the model description with details about the Co-rewarding framework from the paper's abstract and the GitHub repository.
- Including the citation information.
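For reference, here is a minimal sketch of what the updated model-card front matter would look like. Only `library_name` and `pipeline_tag` are taken from this PR; the `base_model` and `license` fields below are illustrative assumptions, not values read from the actual card:

```yaml
---
# Fields added by this PR
library_name: transformers     # enables the automated "how to use" widget
pipeline_tag: text-generation  # improves discoverability as an LLM

# Illustrative assumptions; the real card may specify these differently
base_model: meta-llama/Llama-3.2-3B-Instruct
license: llama3.2
---
```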
No sample usage is added, as a direct inference code snippet was not found in the provided GitHub README, adhering to the "Do not make up code yourself" disclaimer.
Please review and merge if these improvements are satisfactory.
Geraldxm changed pull request status to merged