Article • M2.1: Multilingual and Multi-Task Coding with Strong Generalization • 5 days ago • 27
Cerebras REAP Collection • Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 21 items • Updated about 11 hours ago • 80