lazarustda (Thomas Lazarus)

upvoted an article 3 months ago

Article

Mixture of Experts (MoEs) in Transformers

ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 164

upvoted a paper 8 months ago

Soft Tokens, Hard Truths

Paper • 2509.19170 • Published Sep 23, 2025 • 16

upvoted a paper about 1 year ago

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14, 2025 • 341

upvoted an article about 1 year ago

Article

Hugging Face to sell open-source robots thanks to Pollen Robotics acquisition 🤖

thomwolf, clem, matthieu-lapeyre • Apr 14, 2025 • 48

upvoted a paper about 1 year ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31, 2025 • 126

upvoted an article about 1 year ago

Article

Open R1: Update #3

open-r1 • Mar 11, 2025 • 297

upvoted 2 articles over 1 year ago

Article

Open-R1: Update #1

open-r1 • Feb 2, 2025 • 305

Article

Open-R1: a fully open reproduction of DeepSeek-R1

eliebak, lvwerra, lewtun • Jan 28, 2025 • 889

upvoted a paper over 1 year ago

"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization

Paper • 2411.02355 • Published Nov 4, 2024 • 52

upvoted 2 articles almost 2 years ago

Article

Inference for PROs

osanseviero, pcuenq, victor • Sep 22, 2023 • 55

Article

Diffusers welcomes Stable Diffusion 3

dn6, YiYiXu, sayakpaul, OzzyGT, kashif, multimodalart • Jun 12, 2024 • 99

upvoted a collection about 2 years ago

🎭 Avatars

Collection

The latest AI-powered technologies usher in a new era of realistic avatars! 🚀 • 75 items • Updated Apr 20, 2025 • 94

upvoted an article about 2 years ago

Article

StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation

yuxiang630, cassanof, ganler, YifengDing, StringChaos, harmdevries, lvwerra, arjunguha, lingming • Apr 29, 2024 • 79

upvoted a collection about 2 years ago

OpenELM Instruct Models

Collection

4 items • Updated Aug 25, 2025 • 126

upvoted an article about 2 years ago

Article

LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!)

wolfram • Apr 24, 2024 • 63

upvoted a paper about 2 years ago

LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

Paper • 2403.13372 • Published Mar 20, 2024 • 183
