Michel Chaduteau (michadu)
LLM_papers
- OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
  Paper • 2402.01739 • Published • 28
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
  Paper • 2402.03300 • Published • 138
- Rethinking Interpretability in the Era of Large Language Models
  Paper • 2402.01761 • Published • 23
- Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
  Paper • 2412.13663 • Published • 158