End of training

Files changed (3)

README.md +40 -6
model-00001-of-00002.safetensors +1 -1
model-00002-of-00002.safetensors +1 -1

README.md CHANGED

@@ -22,7 +22,7 @@ model-index:
     metrics:
     - name: Accuracy
       type: accuracy
-      value: 0.5140562248995983
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -32,8 +32,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [t5-3b](https://huggingface.co/t5-3b) on the glue dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6984
-- Accuracy: 0.5141
 ## Model description
@@ -53,19 +53,53 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
-- train_batch_size: 8
-- eval_batch_size: 16
 - seed: 1
 - distributed_type: multi-GPU
 - gradient_accumulation_steps: 2
 - total_train_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 20
-- training_steps: 10
 ### Training results
 ### Framework versions

     metrics:
     - name: Accuracy
       type: accuracy
+      value: 0.8875502008032129
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 This model is a fine-tuned version of [t5-3b](https://huggingface.co/t5-3b) on the glue dataset.
 It achieves the following results on the evaluation set:
+- Loss: 3.4268
+- Accuracy: 0.8876
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
+- train_batch_size: 4
+- eval_batch_size: 8
 - seed: 1
 - distributed_type: multi-GPU
+- num_devices: 2
 - gradient_accumulation_steps: 2
 - total_train_batch_size: 16
+- total_eval_batch_size: 16
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 20
+- training_steps: 750
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|
+| 0.7306        | 0.18  | 25   | 0.6913          | 0.5921   |
+| 0.6717        | 0.36  | 50   | 0.4976          | 0.8339   |
+| 0.3978        | 0.53  | 75   | 0.5226          | 0.8628   |
+| 0.322         | 0.71  | 100  | 0.3902          | 0.8484   |
+| 0.2958        | 0.89  | 125  | 0.3803          | 0.8881   |
+| 0.2604        | 1.07  | 150  | 0.8628          | 0.8736   |
+| 0.2011        | 1.25  | 175  | 0.7780          | 0.8953   |
+| 0.263         | 1.42  | 200  | 2.1533          | 0.8881   |
+| 0.2032        | 1.6   | 225  | 4.7955          | 0.8917   |
+| 0.2536        | 1.78  | 250  | 1.7810          | 0.8989   |
+| 0.1984        | 1.96  | 275  | 0.5119          | 0.8845   |
+| 0.1495        | 2.14  | 300  | 0.5128          | 0.8845   |
+| 0.1275        | 2.31  | 325  | 0.8602          | 0.8628   |
+| 0.0955        | 2.49  | 350  | 1.3642          | 0.8773   |
+| 0.3912        | 2.67  | 375  | 1.0186          | 0.8664   |
+| 0.1108        | 2.85  | 400  | 2.1450          | 0.8592   |
+| 0.0726        | 3.02  | 425  | 2.6801          | 0.8809   |
+| 0.0937        | 3.2   | 450  | 5.2053          | 0.8736   |
+| 1.0143        | 3.38  | 475  | 3.3979          | 0.8845   |
+| 0.5754        | 3.56  | 500  | 4.2786          | 0.8989   |
+| 0.2928        | 3.74  | 525  | 5.6543          | 0.8917   |
+| 0.5633        | 3.91  | 550  | 6.7064          | 0.8845   |
+| 1.0431        | 4.09  | 575  | 4.9205          | 0.8953   |
+| 0.2839        | 4.27  | 600  | 4.2344          | 0.8809   |
+| 0.5464        | 4.45  | 625  | 4.9598          | 0.8809   |
+| 0.0031        | 4.63  | 650  | 5.3705          | 0.8881   |
+| 0.5149        | 4.8   | 675  | 4.8105          | 0.8845   |
+| 0.2702        | 4.98  | 700  | 6.9958          | 0.8953   |
+| 0.7503        | 5.16  | 725  | 5.4360          | 0.8881   |
+| 0.2639        | 5.34  | 750  | 5.4420          | 0.8917   |
 ### Framework versions
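
The batch-size fields in the updated card can be cross-checked with simple arithmetic. The sketch below assumes the usual convention that the total train batch size is the per-device batch size times the number of devices times the gradient accumulation steps, and that evaluation does not use gradient accumulation. The function name `effective_batch_size` is illustrative and is not part of the Trainer API.

```python
def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Examples consumed per optimizer step (train) or per eval step (grad_accum_steps=1)."""
    return per_device * num_devices * grad_accum_steps


# Values taken from the updated model card in this diff.
total_train = effective_batch_size(per_device=4, num_devices=2, grad_accum_steps=2)
total_eval = effective_batch_size(per_device=8, num_devices=2)

print(total_train)  # matches total_train_batch_size in the card
print(total_eval)   # matches total_eval_batch_size in the card

# The previous card lists train_batch_size 8 with the same total of 16 and the
# same accumulation steps, but no num_devices. Under the assumed formula the
# arithmetic would imply one device; this is an inference, not a stated fact.
implied_old_devices = 16 // (8 * 2)
print(implied_old_devices)
```

Under these assumptions, halving the per-device batch size while going to two devices leaves the total train batch size unchanged at 16.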

model-00001-of-00002.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cc997526ec5af386ad44532e723d8c6872ec73f950f9ce2bccbf3caad559a42a
 size 4998578592

 version https://git-lfs.github.com/spec/v1
+oid sha256:828e62612b4cffa2467ca73ca6cc096e04e92145b3652c804148eef0a1e4c8f2
 size 4998578592

model-00002-of-00002.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b20323baec700a9f33a51031b0b313c9dd3afb791decfbb0c1c279165726d797
 size 706803048

 version https://git-lfs.github.com/spec/v1
+oid sha256:0414a11f91dce242ced67dc4b0f38bf29f82a8d60bf8c48d3eb98d4f58ee24df
 size 706803048
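
Each safetensors entry above is a Git LFS pointer (a `version` line, an `oid sha256:...` line, and a `size` in bytes), so only the hash changes when the weights are retrained at the same size. A minimal sketch of checking a downloaded shard against its pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path in the usage comment is an assumption.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and hash agree with the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["hash_algo"])
    with path.open("rb") as handle:
        # Read in chunks so multi-gigabyte shards do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# New pointer for the second shard, copied from the diff above.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:0414a11f91dce242ced67dc4b0f38bf29f82a8d60bf8c48d3eb98d4f58ee24df
size 706803048
"""
pointer = parse_lfs_pointer(pointer_text)
# Hypothetical local path; adjust to wherever the shard was downloaded:
# print(matches_pointer("model-00002-of-00002.safetensors", pointer))
```

The size check runs first because it is cheap and rejects truncated downloads before any hashing is done.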