honggen/hard_dpo
Task: Text Generation
Dataset: Anthropic/hh-rlhf
Language: English
License: apache-2.0
This is the reference model, obtained by supervised fine-tuning on the chosen responses in Anthropic/hh-rlhf.
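The card does not say how the fine-tuning data was prepared. As a minimal sketch only, assuming the usual hh-rlhf layout in which each record's `chosen` field is a transcript of `\n\nHuman:` and `\n\nAssistant:` turns, the final assistant turn could be separated from its prompt as shown here. The helper name `split_chosen` and the example transcript are hypothetical and are not taken from the author's code.

```python
# Hypothetical helper (not from the model author): split an hh-rlhf style
# "chosen" transcript into a prompt and the final assistant response,
# which is the text a supervised fine-tuning step would learn to produce.
ASSISTANT_TAG = "\n\nAssistant:"


def split_chosen(chosen: str) -> tuple[str, str]:
    """Return (prompt, response), cutting at the last assistant tag."""
    idx = chosen.rfind(ASSISTANT_TAG)
    if idx == -1:
        raise ValueError("no assistant turn found in transcript")
    cut = idx + len(ASSISTANT_TAG)
    prompt = chosen[:cut]             # everything up to and including the tag
    response = chosen[cut:].strip()   # final assistant turn, whitespace trimmed
    return prompt, response


# Made-up transcript in the assumed format, for illustration only.
example = "\n\nHuman: What is 2 + 2?\n\nAssistant: 4."
prompt, response = split_chosen(example)
```

Splitting at the last tag rather than the first keeps earlier turns of a multi-turn dialogue inside the prompt, so only the final response is treated as the training target.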
Dataset used to train honggen/hard_dpo: Anthropic/hh-rlhf (updated May 26, 2023)