AI & ML interests
None yet
Organizations
morizon/qwen3-4b-agent-trajectory-lora_0223_run_1
Text Generation
• 4B • Updated
morizon/toy-kitchen-grpo-qwen3-1_7b-v1_20260222_run_1
Updated
morizon/toy-kitchen-grpo-Qwen3-1.7B
Text Generation
• 2B • Updated • 2
morizon/wordle-grpo-Qwen3-1.7B-0217_run_1
Text Generation
• 2B • Updated • 2
morizon/wordle-grpo-Qwen3-1.7B-test
Text Generation
• 2B • Updated • 2
morizon/qwen3-4b-structured-output-lora_0208_run_1
Text Generation
• Updated • 1
morizon/qwen3-4b-structured-output-lora_0208_run_2
Text Generation
• Updated • 1
morizon/qwen3-4b-structured-output-lora_0207_run_1
Text Generation
• Updated
morizon/qwen3-4b-structured-output-lora_0206_run_3
Text Generation
• Updated • 1
morizon/qwen3-4b-structured-output-lora_0206_run_2
Text Generation
• Updated • 2
morizon/qwen3-4b-structured-output-lora_0206_run_1
Text Generation
• Updated • 2
morizon/dpo-qwen-cot-merged_0203_run_1
Text Generation
• 4B • Updated • 1
morizon/qwen3-4b-structured-output-lora_0203_run_1
Text Generation
• Updated • 1
morizon/llm-jp-3.1-1.8b_tcga_sft_n100_step30_lora
morizon/llm-jp-3.1-1.8b_tcga_sft_n100_step30
Text Generation
• 2B • Updated • 1
morizon/llm-jp-3.1-1.8b_tcga_sft_step200_lora
Updated
morizon/llm-jp-3.1-1.8b_tcga_sft_step200
Text Generation
• Updated • 2
morizon/Qwen3-4B-gspo-DAPO-Math_1027_run_3
Text Generation
• 4B • Updated • 2
morizon/Qwen3-4B-gspo-DAPO-Math_1027_run_3_lora
4B • Updated
morizon/llm-jp-3-13b-instruct2-grpo-R1-0225_std_step3000_lora
Updated
morizon/llm-jp-3-13b-instruct2-grpo-R1-0225_std_step3000
Text Generation
• 14B • Updated • 3
morizon/llm-jp-3-13b-instruct2-grpo-0222_lora_step2000
Updated
morizon/llm-jp-3-13b-instruct2-grpo-0222_step2000
Text Generation
• 14B • Updated • 2
morizon/llm-jp-3-13b-instruct2-grpo-R1-0223_lora_step1600
Updated
morizon/llm-jp-3-13b-instruct2-grpo-R1-0223_step1600
Text Generation
• 14B • Updated • 2
morizon/llm-jp-3-13b-instruct2-grpo-R1-0223_lora_step800
Updated
morizon/llm-jp-3-13b-instruct2-grpo-R1-0223_step800
Text Generation
• 14B • Updated • 1
morizon/llm-jp-3-13b-instruct2-grpo-0222_lora_step1000
morizon/llm-jp-3-13b-instruct2-grpo-0222_step1000
Text Generation
• 14B • Updated • 2
morizon/llm-jp-3-13b-instruct2-grpo-MATH-lighteval_step1000_lora