Hugging Face
bsbarkur/Qwen2-2.5B-GRPO-test
Transformers
TensorBoard
Safetensors
Generated from Trainer
trl
grpo
arxiv:2402.03300
main
Qwen2-2.5B-GRPO-test
Commit History
Model save (3c7423d, verified), bsbarkur committed on Apr 4
Training in progress, step 25 (6b8b12e, verified), bsbarkur committed on Apr 4
Training in progress, step 20 (f5a67fb, verified), bsbarkur committed on Apr 4
Training in progress, step 15 (17ea688, verified), bsbarkur committed on Apr 4
Training in progress, step 10 (b7f7a9d, verified), bsbarkur committed on Apr 3
Training in progress, step 5 (efe58b4, verified), bsbarkur committed on Apr 3
initial commit (22ffc7a, verified), bsbarkur committed on Apr 2