Improve model card with metadata, paper link, and detailed description

#1 opened by nielsr (HF Staff)

This PR improves the model card for the Co-rewarding-II: Llama-3.2-3B-Instruct model by:

  • Linking it to the paper Co-rewarding: Stable Self-supervised RL for Eliciting Reasoning in Large Language Models.
  • Adding `library_name: transformers` to enable the automated code snippet on the model page, since the `config.json` specifies `"model_type": "llama"` and a `"transformers_version"` (see the metadata sketch after this list).
  • Adding `pipeline_tag: text-generation` for better discoverability, as the model is a large language model focused on reasoning.
  • Expanding the model description with details from the paper's abstract to give clearer context on the Co-rewarding framework.
  • Retaining the existing `license: mit`, as there is no evidence it should change.
  • Not adding a usage example: the "Quick Start" snippet in the provided GitHub README is for a different model (`pae-llava-7b`) and is not directly applicable to this Llama-based model.
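For concreteness, the resulting YAML front matter should look roughly like this (a minimal sketch showing only the fields touched by this PR; all other fields are unchanged):

```yaml
license: mit                   # existing license, retained
library_name: transformers     # enables the automated code snippet on the model page
pipeline_tag: text-generation  # surfaces the model under text-generation
```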

Please review and merge if everything looks good.

