Article • The 1 Billion Token Challenge: Finding the Perfect Pre-training Mix • 27 days ago • 46
The Ultra-Scale Playbook 🌌 • 3.52k • The ultimate guide to training LLMs on large GPU clusters
FineWeb: decanting the web for the finest text data at scale 🍷 • 1.19k • Generate high-quality text data for LLMs using FineWeb
The Smol Training Playbook 📚 • 2.47k • The secrets to building world-class LLMs
DynaVis: Dynamically Synthesized UI Widgets for Visualization Editing Paper • 2401.10880 • Published Jan 19, 2024 • 1
SliceGPT: Compress Large Language Models by Deleting Rows and Columns Paper • 2401.15024 • Published Jan 26, 2024 • 74