Improve model card: Add tags, paper link, and expanded description

#1 opened by nielsr (HF Staff)

This PR significantly improves the model card for the "Self-Certainty: Llama-3.2-3B-Instruct trained on DAPO-14k" model by:

  • Adding library_name: transformers to the metadata, enabling the automated "how to use" widget (see the metadata sketch after this list). Evidence for this is found in config.json and tokenizer_config.json, which indicate the Llama architecture and a transformers_version.
  • Adding pipeline_tag: text-generation to the metadata, improving discoverability for this large language model.
  • Updating the model card content to include a direct link to the paper "Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models".
  • Expanding the model description with details about the Co-rewarding framework from the paper's abstract and the GitHub repository.
  • Including the citation information.
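
For reference, here is a minimal sketch of the proposed front-matter additions. Only library_name and pipeline_tag come from this PR; the base_model and license fields below are illustrative assumptions and may differ in the actual card:

```yaml
---
library_name: transformers
pipeline_tag: text-generation
# The two fields above are what this PR adds.
# The fields below are assumptions for illustration only.
base_model: meta-llama/Llama-3.2-3B-Instruct
license: llama3.2
---
```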

No sample usage is added, since no direct inference code snippet was found in the provided GitHub README, in keeping with the "Do not make up code yourself" disclaimer.
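
That said, the "how to use" widget enabled by library_name: transformers typically suggests the standard transformers text-generation pattern sketched below. This is not from the paper or README: the repo id is a hypothetical placeholder, and the prompt and generation settings are illustrative assumptions.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id; substitute the actual Hub path of this model.
model_id = "your-org/Llama-3.2-3B-Instruct-Co-rewarding-DAPO-14k"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Llama-3.2-Instruct models ship a chat template; apply it before generating.
messages = [{"role": "user", "content": "What is 17 * 24? Think step by step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# max_new_tokens is an illustrative choice, not a recommended setting.
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```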

Please review and merge if these improvements are satisfactory.

Geraldxm changed pull request status to merged
