models 33
selfcorrexp2/llama31_ace_1ep • Text Generation • 8B • Updated • 1
selfcorrexp2/beta01_balanced_dpo_step100 • Text Generation • 8B • Updated • 7
selfcorrexp2/llama3sft_balanced_dpo_step550 • Text Generation • 8B • Updated • 4
selfcorrexp2/type12_70b_step300 • Text Generation • 8B • Updated • 4
selfcorrexp2/type12_math_augmath_beta05_nosftloss_step400 • Text Generation • 8B • Updated • 6
selfcorrexp2/type12_math_augmath_dpo_sftlossbeta05_step400 • Text Generation • 8B • Updated • 6
selfcorrexp2/nosft_llama3sft_dpo_type3_7k_ver2_step100 • Text Generation • 8B • Updated • 7
selfcorrexp2/llama3_sft_more_corr_rr0k_3ep • Text Generation • 8B • Updated • 9
selfcorrexp2/llama3_sft_less_corr_rr0k_ep3_train_on_reasoning • Text Generation • 8B • Updated • 4
selfcorrexp2/llama3_sft_balanced_corr_rr0k_ep3_train_on_reasoning • Text Generation • 8B • Updated • 6
selfcorrexp2/llama31_ace_kumar_testtmp07 • Viewer • Updated • 15k • 13
selfcorrexp2/llama31_ace_kumar_testtmp10 • Viewer • Updated • 15k • 7
selfcorrexp2/balanced_model_as_rm_2prompt • Viewer • Updated • 5k • 12 • 1
selfcorrexp2/balanced_model_as_rm • Viewer • Updated • 5k • 9
selfcorrexp2/selfcorrexp2_llama3_openmath_1m_ep1_tmp10_goldrm_labeled • Viewer • Updated • 15k • 8
selfcorrexp2/HanningZhang_Llama3-sft-more-corr-rr60k-3ep_moredatatmp10_vllmexp3 • Viewer • Updated • 15k • 14
selfcorrexp2/HanningZhang_Llama3-sft-more-corr-rr60k-3ep_moredatatmp10 • Viewer • Updated • 15k • 9
selfcorrexp2/HanningZhang_Llama3-sft-more-corr-rr60k-3ep_moredatatmp10_gold_reward • Viewer • Updated • 15k • 9
selfcorrexp2/balanced_self_rewarding_rm_labeled_llama3_sft_gen_1round_prompt • Viewer • Updated • 15k • 9
selfcorrexp2/llama3_sft_more_corr_rr0k_3ep_more_datatmp10_vllmexp3 • Viewer • Updated • 15k • 9