t14n/Qwen2-0.5B-GRPO-test
Transformers
TensorBoard
Safetensors
AI-MO/NuminaMath-TIR
Generated from Trainer
trl
grpo
arxiv:2402.03300
Commit History (branch: main)
e38103b  End of training  (verified; t14n committed on May 31)
3cf9ac5  Model save  (verified; t14n committed on May 31)
8e47f64  Training in progress, step 113  (verified; t14n committed on May 31)
bd5191d  Training in progress, step 110  (verified; t14n committed on May 31)
6f8119b  Training in progress, step 100  (verified; t14n committed on May 31)
964fb7c  Training in progress, step 90  (verified; t14n committed on May 31)
f913866  Training in progress, step 80  (verified; t14n committed on May 31)
fbb2faa  Training in progress, step 70  (verified; t14n committed on May 31)
c8ad59e  Training in progress, step 60  (verified; t14n committed on May 31)
7a74d13  Training in progress, step 50  (verified; t14n committed on May 31)
f8a1d73  Training in progress, step 40  (verified; t14n committed on May 31)
e8a0653  Training in progress, step 30  (verified; t14n committed on May 31)
7bc84b3  Training in progress, step 20  (verified; t14n committed on May 31)
7ffc3b9  Training in progress, step 10  (verified; t14n committed on May 31)
fc85d63  initial commit  (verified; t14n committed on May 31)