GRAM-R^2: Self-Training Generative Foundation Reward Models for Reward Reasoning Paper • 2509.02492 • Published Sep 2 • 1