Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2 • 185
YuE: Scaling Open Foundation Models for Long-Form Music Generation Paper • 2503.08638 • Published Mar 11 • 71
HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization Paper • 2503.04598 • Published Mar 6 • 21
Scale-Distribution Decoupling: Enabling Stable and Effective Training of Large Language Models Paper • 2502.15499 • Published Feb 21 • 15
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published Feb 20 • 104
Towards a Unified View of Preference Learning for Large Language Models: A Survey Paper • 2409.02795 • Published Sep 4, 2024 • 72
FuzzCoder: Byte-level Fuzzing Test via Large Language Model Paper • 2409.01944 • Published Sep 3, 2024 • 45
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering Paper • 2408.09174 • Published Aug 17, 2024 • 52