# NeuralMarcoro14-7B

This is a DPO fine-tuned version of [mlabonne/Marcoro14-7B-slerp](https://huggingface.co/mlabonne/Marcoro14-7B-slerp). It improves the model's performance on the Nous benchmark suite (results on the Open LLM Leaderboard are still pending).

The base model is itself a merge of the following models, made with [mergekit](https://github.com/cg123/mergekit):

* [AIDC-ai-business/Marcoroni-7B-v3](https://huggingface.co/AIDC-ai-business/Marcoroni-7B-v3)
* [EmbeddedLLM/Mistral-7B-Merge-14-v0.1](https://huggingface.co/EmbeddedLLM/Mistral-7B-Merge-14-v0.1)

## 🏆 Evaluation

|Model|AGIEval|GPT4All|TruthfulQA|Bigbench|Average|
|---|---:|---:|---:|---:|---:|
|[Marcoro14-7B-slerp](https://huggingface.co/mlabonne/Marcoro14-7B-slerp)| 44.66| 76.24| 64.15| 45.64| 57.67|
|[NeuralMarcoro14-7B](https://huggingface.co/mlabonne/NeuralMarcoro14-7B)| 44.59| 76.17| 65.94| 46.9| 58.4|
|Change | -0.07| -0.07| +1.79| +1.26| +0.73|

## 🧩 Training hyperparameters

**LoRA**:
* r=16
* lora_alpha=16
* lora_dropout=0.05
* bias="none"
* task_type="CAUSAL_LM"
* target_modules=['k_proj', 'gate_proj', 'v_proj', 'up_proj', 'q_proj', 'o_proj', 'down_proj']
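
As a rough sketch (not the card's original training script), these values map directly onto a `peft` `LoraConfig`:

```python
from peft import LoraConfig

# LoRA configuration mirroring the hyperparameters listed above
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=['k_proj', 'gate_proj', 'v_proj', 'up_proj',
                    'q_proj', 'o_proj', 'down_proj'],
)
```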
**Training arguments**:
* per_device_train_batch_size=4
* gradient_accumulation_steps=4
* gradient_checkpointing=True
* learning_rate=5e-5
* lr_scheduler_type="cosine"
* max_steps=200
* optim="paged_adamw_32bit"
* warmup_steps=100
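
These arguments correspond to a `transformers` `TrainingArguments` object along these lines (a sketch; `output_dir` is a placeholder, not from the original card):

```python
from transformers import TrainingArguments

# Training arguments mirroring the values listed above
training_args = TrainingArguments(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    gradient_checkpointing=True,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    max_steps=200,
    optim="paged_adamw_32bit",  # requires bitsandbytes at runtime
    warmup_steps=100,
    output_dir="./results",     # placeholder path
)
```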
**DPOTrainer**:
* beta=0.1
* max_prompt_length=1024
* max_length=1536
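
A hedged sketch of how these values plug into `trl`'s `DPOTrainer`; `model`, `ref_model`, `dataset`, `tokenizer`, `training_args`, and `peft_config` are placeholders not defined in the card:

```python
from trl import DPOTrainer

# Hypothetical wiring of the DPO hyperparameters listed above
dpo_trainer = DPOTrainer(
    model,                      # policy model (placeholder)
    ref_model,                  # frozen reference model (placeholder)
    args=training_args,
    train_dataset=dataset,      # preference pairs (placeholder)
    tokenizer=tokenizer,
    peft_config=peft_config,
    beta=0.1,
    max_prompt_length=1024,
    max_length=1536,
)
dpo_trainer.train()
```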
## 💻 Usage
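
The original usage snippet is not shown in this excerpt; as an illustrative sketch only (assuming the standard `transformers` pipeline API and the tokenizer's built-in chat template), inference might look like:

```python
from transformers import AutoTokenizer, pipeline

model_id = "mlabonne/NeuralMarcoro14-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Build a ChatML-style prompt from the tokenizer's chat template
messages = [{"role": "user", "content": "What is a mixture of experts?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False,
                                       add_generation_prompt=True)

generator = pipeline("text-generation", model=model_id,
                     tokenizer=tokenizer, device_map="auto")
output = generator(prompt, max_new_tokens=256, do_sample=True,
                   temperature=0.7, top_p=0.9)
print(output[0]["generated_text"])
```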