FineWeb: decanting the web for the finest text data at scale 🍷 Generate high-quality text data for LLMs using FineWeb • Running • 1.12k
The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters • Running • 3.37k
ERNIE 4.5 • Collection • A collection of ERNIE 4.5 models. "-Paddle" models use PaddlePaddle weights, while "-PT" models use Transformer-style PyTorch weights. • 26 items • Updated Sep 24 • 174
Article • How to generate text: using different decoding methods for language generation with Transformers • Mar 1, 2020 • 253
Article • Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA • May 24, 2023 • 168