Checkpoints for "Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning" (arXiv:2509.22601)
Yulei Qin
yolay
AI & ML interests
Medical Imaging, Computer Vision, Language Models
Recent Activity
authored a paper 1 day ago
LTD-Bench: Evaluating Large Language Models by Letting Them Draw
upvoted a paper 1 day ago
Agent Lightning: Train ANY AI Agents with Reinforcement Learning