igorktech/TALANT_saiga_7b_DPO
Text Generation • 7B
Experimental summarization LLMs trained on Habr articles and aligned with DPO, using preference pairs built from weak T5 generations and gold summaries.
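As a rough illustration of how such preference data could be assembled, the sketch below pairs each gold summary with a weak T5 generation for the same article. It assumes the gold summary is the preferred ("chosen") side and the T5 output is the "rejected" side, which the description implies but does not state. The function name `build_dpo_pairs`, the prompt template, and the `prompt`/`chosen`/`rejected` field names are hypothetical choices for this example, not the author's actual pipeline.

```python
from typing import Dict, List


def build_dpo_pairs(
    articles: List[str],
    gold_summaries: List[str],
    weak_summaries: List[str],
    # Hypothetical prompt template; the real training prompt is not given in the source.
    prompt_template: str = "Summarize the following article:\n{article}\n\nSummary:",
) -> List[Dict[str, str]]:
    """Build DPO preference records: gold summary as chosen, weak generation as rejected."""
    if not (len(articles) == len(gold_summaries) == len(weak_summaries)):
        raise ValueError("articles, gold_summaries and weak_summaries must have equal length")

    pairs: List[Dict[str, str]] = []
    for article, gold, weak in zip(articles, gold_summaries, weak_summaries):
        gold, weak = gold.strip(), weak.strip()
        # Skip degenerate pairs: an empty side or identical texts carry no preference signal.
        if not gold or not weak or gold == weak:
            continue
        pairs.append(
            {
                "prompt": prompt_template.format(article=article),
                "chosen": gold,    # assumed preferred: human-written (gold) summary
                "rejected": weak,  # assumed dispreferred: weak T5 generation
            }
        )
    return pairs
```

Records shaped like this (a prompt plus a preferred and a dispreferred completion) are the usual input to DPO training code; filtering out identical or empty pairs avoids examples that contribute no preference signal.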