Improve model card with metadata, paper link, and detailed description

#1 opened by nielsr (HF Staff)

This PR improves the model card for the Co-rewarding-II: Llama-3.2-3B-Instruct model by:

  • Linking it to the paper Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models.
  • Adding `library_name: transformers` to enable the automated code snippet on the model page, since the `config.json` specifies `"model_type": "llama"` and a `"transformers_version"` (see the metadata sketch after this list).
  • Adding `pipeline_tag: text-generation` for better discoverability, as the model is a large language model focused on reasoning.
  • Expanding the model description with details from the paper's abstract to give clearer context on the Co-rewarding framework.
  • Retaining the existing `license: mit`, as there is no evidence it should change.
  • Not adding a usage example: the "Quick Start" snippet in the provided GitHub README is for a different model (`pae-llava-7b`) and is not directly applicable to this Llama-based model.
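For concreteness, the resulting YAML front matter should look roughly like this (a minimal sketch showing only the fields touched by this PR; all other fields are unchanged):

```yaml
license: mit                   # existing license, retained
library_name: transformers     # enables the automated code snippet on the model page
pipeline_tag: text-generation  # surfaces the model under text-generation
```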

Please review and merge if everything looks good.

