---
license: apache-2.0
base_model: meta-llama/Llama-3.2-3B
tags:
- generated_from_trainer
- sft
- ultrafeedback
datasets:
- activeDap/ultrafeedback_chosen
language:
- en
library_name: transformers
---

# Llama-3.2-3B Fine-tuned on ultrafeedback_chosen

This model is a fine-tuned version of [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B) on the [activeDap/ultrafeedback_chosen](https://huggingface.co/datasets/activeDap/ultrafeedback_chosen) dataset.

## Training Results

![Training Loss](loss_plot.png)

### Training Statistics

| Metric | Value |
|--------|-------|
| Total Steps | 808 |
| Final Training Loss | 1.4808 |
| Min Training Loss | 1.4529 |
| Training Runtime | 456.43 seconds |
| Samples/Second | 113.23 |

## Training Configuration

| Parameter | Value |
|-----------|-------|
| Base Model | meta-llama/Llama-3.2-3B |
| Dataset | activeDap/ultrafeedback_chosen |
| Number of Epochs | 1.0 |
| Per Device Batch Size | 16 |
| Gradient Accumulation Steps | 1 |
| Total Batch Size | 64 (4 GPUs) |
| Learning Rate | 2e-05 |
| LR Scheduler | cosine |
| Warmup Ratio | 0.1 |
| Max Sequence Length | 512 |
| Optimizer | adamw_torch_fused |
| Mixed Precision | BF16 |

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "activeDap/Llama-3.2-3B_ultrafeedback_chosen"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Format input with prompt template
prompt = "What is machine learning?\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate response
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Training Framework

- **Library:** Transformers + TRL
- **Training Type:** Supervised Fine-Tuning (SFT)
- **Format:** Prompt-completion with Assistant-only loss

## Citation

If you use this model, please cite the original base model and dataset:

```bibtex
@misc{ultrafeedback2023,
  title={UltraFeedback: Boosting Language Models with High-quality Feedback},
  author={Ganqu Cui and Lifan Yuan and Ning Ding and others},
  year={2023},
  eprint={2310.01377},
  archivePrefix={arXiv}
}
```
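
## Example Training Setup

For reference, below is a minimal sketch of how a comparable SFT run could be configured with TRL's `SFTTrainer`, using the hyperparameters from the Training Configuration table above. This is not the exact script used to train this model; the dataset column layout, the TRL version, and the multi-GPU launch details are assumptions.

```python
# Minimal sketch of a comparable SFT run (not the exact training script).
# Hyperparameters mirror the Training Configuration table; column names and
# TRL-version specifics are assumptions.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumes the dataset provides prompt/completion (or "text") columns that
# SFTTrainer can consume directly.
train_dataset = load_dataset("activeDap/ultrafeedback_chosen", split="train")

config = SFTConfig(
    output_dir="Llama-3.2-3B_ultrafeedback_chosen",
    num_train_epochs=1.0,
    per_device_train_batch_size=16,   # 4 GPUs x 16 = total batch size of 64
    gradient_accumulation_steps=1,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    max_seq_length=512,               # renamed to `max_length` in newer TRL releases
    optim="adamw_torch_fused",
    bf16=True,
)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.2-3B",  # SFTTrainer can load the model from its Hub id
    args=config,
    train_dataset=train_dataset,
)
trainer.train()
```

Launching a script like this with `accelerate launch` or `torchrun` across 4 GPUs would give the total batch size of 64 listed above.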