RePro: Training Language Models to Faithfully Recycle the Web for Pretraining Paper • 2510.10681 • Published Oct 12
FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models Paper • 2505.20225 • Published May 26