BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining Paper • 2508.10975 • Published Aug 14, 2025 • 59
Understanding Hallucinations in Diffusion Models through Mode Interpolation Paper • 2406.09358 • Published Jun 13, 2024 • 5
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling Paper • 2401.16380 • Published Jan 29, 2024 • 50
TOFU: A Task of Fictitious Unlearning for LLMs Paper • 2401.06121 • Published Jan 11, 2024 • 19