Update Model Card
README.md CHANGED
@@ -184,7 +184,7 @@ print(pipeline([{"role": "system", "content": f"detailed thinking {thinking}"},{
A large variety of training data was used for the knowledge distillation phase that precedes the post-training pipeline; three of these datasets were FineWeb, Buzz-V1.2, and Dolma.

-The data for the multi-stage post-training phases
+The data for the multi-stage post-training phases is a compilation of SFT and RL data that supports improvements in the math, code, general reasoning, and instruction-following capabilities of the original Llama instruct model.

Prompts were either sourced from public and open corpora or synthetically generated. Responses were synthetically generated by a variety of models, with some prompts containing responses for both reasoning-on and reasoning-off modes, to train the model to distinguish between the two modes. This model was improved with Qwen.
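For context, the `detailed thinking {thinking}` system prompt visible in the hunk header is how the reasoning-on/off distinction described above is exercised at inference time. Below is a minimal sketch of that toggle using the `transformers` text-generation pipeline; the model id is a placeholder and the dtype and generation settings are illustrative assumptions, not values taken from this card.

```python
import torch
from transformers import pipeline

# Placeholder id -- substitute the checkpoint this model card describes.
model_id = "your-org/your-model"

pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,  # assumption; use what your hardware supports
    device_map="auto",
    max_new_tokens=512,          # illustrative generation budget
)

# Reasoning mode is toggled purely through the system prompt:
# "detailed thinking on" for long chain-of-thought responses,
# "detailed thinking off" for direct answers.
thinking = "on"

messages = [
    {"role": "system", "content": f"detailed thinking {thinking}"},
    {"role": "user", "content": "Write a haiku about GPUs."},
]

print(pipe(messages))
```

Because the toggle is just a system-prompt string, the same weights serve both modes, which is why the post-training data pairs some prompts with both reasoning-on and reasoning-off responses.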