LaSeR: Reinforcement Learning with Last-Token Self-Rewarding • Paper 2510.14943 • Published 18 days ago