PromptKD: Distilling Student-Friendly Knowledge for Generative Language Models via Prompt Tuning Paper • 2402.12842 • Published Feb 20, 2024
Offline-GRPO Collection • Collection of LLMs continually post-trained via offline GRPO to enhance mathematical reasoning capabilities • 3 items • Updated Aug 7
TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation Paper • 2410.05591 • Published Oct 8, 2024