A DPO QLoRA finetune of Mistral Nemo 12b on four Gutenberg datasets plus one more dataset, approximately 9k lines.
Finetuned for 1 epoch on an A100 through Vast.AI.
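For anyone curious what the DPO part actually optimizes, here is a minimal sketch of the per-pair DPO loss in plain Python. This is not the training code used for this model (Axolotl handles all of that); the function names and the `beta=0.1` default are assumptions chosen for illustration, and the inputs are assumed to be summed log-probabilities of each full response.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value, not taken from this model's config
) -> float:
    """DPO loss for one (chosen, rejected) preference pair."""
    # How much more the policy likes each response than the frozen reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin: positive when the policy favors the chosen response.
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(margin)) is the same as softplus(-margin).
    return softplus(-margin)
```

When the policy is identical to the reference, the margin is zero and the loss is `log(2)`; it drops below that as the policy shifts probability toward the chosen response and away from the rejected one.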
Thank you to Axolotl for making finetuning easier. Thank you to Docker for... existing, I guess.
You know, I am REALLY regretting panic-naming this line of models so ambiguously now. Well, too late now!
Base model: intervitens/mini-magnum-12b-v1.1