SRPO: A Cross-Domain Implementation of Large-Scale Reinforcement Learning on LLM • Paper • 2504.14286 • Published Apr 19