LaSeR: Reinforcement Learning with Last-Token Self-Rewarding • Paper • 2510.14943 • Published 23 days ago • 37
AutoCodeBench: Large Language Models are Automatic Code Benchmark Generators • Paper • 2508.09101 • Published Aug 12 • 8