Update README.md
README.md
CHANGED
@@ -134,15 +134,14 @@ For training data details, please see the [GRAG-SFT-Dataset](https://huggingface
 ### Hyperparameters
 
 
-|
-
-| warmup steps | 50
-| peak LR | 5.0E-07
-| weight decay | 0.1
-| LR schedule | linear
-| gradient reduce dtype | FP32
-| optimizer state dtype | FP32
-
+| Parameter | GRAG-PHI-SFT |
+|---------------------------|--------------------|
+| **warmup steps** | 50 |
+| **peak LR** | 5.0E-07 |
+| **weight decay** | 0.1 |
+| **LR schedule** | linear |
+| **gradient reduce dtype** | FP32 |
+| **optimizer state dtype** | FP32 |
 
 ## Environmental Impact
 