UDM-GRPO: Stable and Efficient Group Relative Policy Optimization for Uniform Discrete Diffusion Models Paper • 2604.18518 • Published Apr 20 • 7