Reparameterized LLM Training via Orthogonal Equivalence Transformation Paper • 2506.08001 • Published Jun 9
Chain-of-Experts: Unlocking the Communication Power of Mixture-of-Experts Models Paper • 2506.18945 • Published Jun 23