Evaluation of DPO Configurations
A collection by jmajkutewicz, updated Sep 30.
Paper: An Empirical Study of DPO Configuration Choices for LLM Alignment
Models:
- jmajkutewicz/Llama-3.1-Tulu-3-8B-DPO_hh-rlhf • Text Generation • Updated Sep 26
- jmajkutewicz/Llama-3.1-Tulu-3-8B-DPO_oasst1 • Text Generation • Updated Sep 26 • 1
- jmajkutewicz/Llama-3.1-Tulu-3-8B-DPO_PKU-SafeRLHF • Text Generation • Updated Sep 26
- jmajkutewicz/Llama-3.1-Tulu-3-8B-DPO_ultrafeedback • Text Generation • Updated Sep 26 • 1
- jmajkutewicz/Llama-3.1-Tulu-3-8B-DPO_dataset-mix • Text Generation • Updated Sep 26
- jmajkutewicz/zephyr-7b-dpo_hh-rlhf • Text Generation • Updated Sep 26 • 1
- jmajkutewicz/zephyr-7b-dpo_oasst1 • Text Generation • Updated Sep 26 • 3
- jmajkutewicz/zephyr-7b-dpo_PKU-SafeRLHF • Text Generation • Updated Sep 26
- jmajkutewicz/zephyr-7b-dpo_ultrafeedback • Text Generation • Updated Sep 26
- jmajkutewicz/zephyr-7b-dpo_dataset-mix • Text Generation • Updated Sep 26 • 1

Datasets:
- jmajkutewicz/hh-rlhf-binarized • Viewer • Updated Sep 26 • 169k • 18
- jmajkutewicz/oasst1-binarized • Viewer • Updated Sep 26 • 14.9k • 13
- jmajkutewicz/PKU-SafeRLHF-binarized • Viewer • Updated Sep 26 • 82.1k • 13
- jmajkutewicz/SHP-binarized • Viewer • Updated Sep 26 • 121k • 14
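The ten fine-tuned models in this collection follow a regular naming scheme: two base models (Llama-3.1-Tulu-3-8B and zephyr-7b), each DPO-trained on five preference datasets. A minimal sketch of reconstructing those repo IDs locally, assuming only the naming pattern visible in the listing above (the helper name `collection_model_ids` is hypothetical, not part of any Hugging Face API):

```python
# Sketch: rebuild the model repo IDs from the pattern
# <owner>/<base>-<DPO tag>_<dataset>, as seen in the listing.
# Note the DPO tag is upper-case for the Tulu base and lower-case for zephyr.

OWNER = "jmajkutewicz"
BASES = {
    "Llama-3.1-Tulu-3-8B": "DPO",  # e.g. Llama-3.1-Tulu-3-8B-DPO_hh-rlhf
    "zephyr-7b": "dpo",            # e.g. zephyr-7b-dpo_oasst1
}
DATASETS = ["hh-rlhf", "oasst1", "PKU-SafeRLHF", "ultrafeedback", "dataset-mix"]


def collection_model_ids():
    """Return the 10 model repo IDs (2 bases x 5 preference datasets)."""
    return [
        f"{OWNER}/{base}-{tag}_{ds}"
        for base, tag in BASES.items()
        for ds in DATASETS
    ]


for repo_id in collection_model_ids():
    print(repo_id)
```

Any of the printed IDs can then be passed to the usual Hub loaders (e.g. `transformers.AutoModelForCausalLM.from_pretrained(repo_id)`), which requires network access and is therefore not shown here.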