Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning • Paper • 2510.25992 • Published about 1 month ago • 44
ATLAS: Adaptive Transfer Scaling Laws for Multilingual Pretraining, Finetuning, and Decoding the Curse of Multilinguality • Paper • 2510.22037 • Published Oct 24 • 18
CaLM: Contrasting Large and Small Language Models to Verify Grounded Generation • Paper • 2406.05365 • Published Jun 8, 2024 • 1