Article Mixture of Experts (MoEs) in Transformers +5 ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 164
Article Hugging Face to sell open-source robots thanks to Pollen Robotics acquisition 🤖 +1 thomwolf, clem, matthieu-lapeyre • Apr 14, 2025 • 48
Article Open-R1: a fully open reproduction of DeepSeek-R1 +1 eliebak, lvwerra, lewtun • Jan 28, 2025 • 889
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization Paper • 2411.02355 • Published Nov 4, 2024 • 52
Article Diffusers welcomes Stable Diffusion 3 +4 dn6, YiYiXu, sayakpaul, OzzyGT, kashif, multimodalart • Jun 12, 2024 • 99
🎠 Avatars Collection The latest AI-powered technologies usher in a new era of realistic avatars! 🚀 • 75 items • Updated Apr 20, 2025 • 94
Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation +7 yuxiang630, cassanof, ganler, YifengDing, StringChaos, harmdevries, lvwerra, arjunguha, lingming • Apr 29, 2024 • 79
Article LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) wolfram • Apr 24, 2024 • 63
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20, 2024 • 183