Datasets used in the paper
Jiayu (Mila) Wang
MilaWang
AI & ML interests
Large Language Model, Multimodal Large Language Model, Agentic System, Reasoning, Efficiency
Recent Activity
updated a model 37 minutes ago
MilaWang/lirpg-fullparam-qwen2-5-math-7b-answeronly01-handrolled-zeroinit-token-grpo-lrin5e-5-nostd
updated a model 38 minutes ago
MilaWang/lirpg-fullparam-qwen2-5-math-7b-answeronly01-handrolled-zeroinit-gn-lrin5e-5-nostd-nokl
published a model about 20 hours ago
MilaWang/lirpg-fullparam-qwen2-5-math-7b-answeronly01-handrolled-zeroinit-gn-lrin5e-5-nostd-nokl