arxiv:2402.17139
Sherry Yang
sherryy
AI & ML interests
None yet
Organizations
None yet
Models: 10
sherryy/Qwen2-0.5B-GRPO-test
sherryy/best5-next10-nopizza-nonomad_sft_90 • Text Generation • 8B • 1
sherryy/pizza_rwr_2k-1k • Text Generation • 8B
sherryy/pizza_rwr_k10_iter1 • Text Generation • 8B
sherryy/pizza_rwr_iter1 • Text Generation • 8B
sherryy/pizza_rwr_k10 • Text Generation • 8B
sherryy/pizza_rwr • Text Generation • 8B
sherryy/pizza_sft_90 • Text Generation • 8B
sherryy/pizza_sft • Text Generation • 8B • 1
sherryy/math-baseline • Text Generation • 8B
Datasets: 14
sherryy/best5-next10-nopizza-nonomad_sft_90 • 78.6k • 2
sherryy/pizza_rwr_k10_iter1 • 24.4k • 3
sherryy/pizza_rwr_iter1 • 42.4k • 5
sherryy/pizza_rwr • 83k • 3
sherryy/tree_dataset • 11.1k • 8
sherryy/pizza_sft • 37.8k • 3
sherryy/pizza_dpo • 5.61k • 6
sherryy/math12k • 12.5k • 7
sherryy/random-acts-of-pizza • 59.5k • 16
sherryy/test_data • 2