huseinzol05 committed on
Commit ae9c029 · verified · 1 Parent(s): 31a9614

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -8,7 +8,7 @@ base_model:
  - mesolitica/Malaysian-Qwen2.5-7B-Reasoning-SFT
  ---
 
- # Malaysian Qwen 2.5 7B Instruct Reasoning GRPO
+ # Malaysian Qwen 2.5 7B Instruct Dialect Reasoning GRPO
 
  Online Reinforcement learning using GRPO full parameter on warmup reasoning SFT https://huggingface.co/mesolitica/Malaysian-Qwen2.5-7B-Reasoning-SFT on highly curated Malay Dialect Reasoning dataset.
 