Commit 1d0b06b (verified), committed by UttamGupta
1 parent: e9ba7cd

End of training

Files changed (3)
  1. README.md +83 -0
  2. generation_config.json +7 -0
  3. model.safetensors +1 -1
README.md ADDED
@@ -0,0 +1,83 @@
+ ---
+ license: apache-2.0
+ base_model: google/long-t5-tglobal-base
+ tags:
+ - generated_from_trainer
+ metrics:
+ - rouge
+ model-index:
+ - name: long_t5_test
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # long_t5_test
+
+ This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on the None dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.7929
+ - Rouge1: 0.5445
+ - Rouge2: 0.3112
+ - Rougel: 0.3469
+ - Rougelsum: 0.346
+ - Gen Len: 410.5957
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 1
+ - eval_batch_size: 1
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 20
+
+ ### Training results
+
+ | Training Loss | Epoch | Step  | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len  |
+ |:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:---------:|:--------:|
+ | 1.3932        | 1.0   | 1263  | 0.9704          | 0.4329 | 0.2051 | 0.273  | 0.2724    | 483.3617 |
+ | 1.2153        | 2.0   | 2526  | 0.9323          | 0.4598 | 0.2276 | 0.2861 | 0.2856    | 452.3191 |
+ | 1.1271        | 3.0   | 3789  | 0.8961          | 0.4943 | 0.2629 | 0.3176 | 0.3171    | 394.5674 |
+ | 1.0753        | 4.0   | 5052  | 0.8926          | 0.4901 | 0.2611 | 0.3147 | 0.3146    | 424.6312 |
+ | 1.0467        | 5.0   | 6315  | 0.8780          | 0.504  | 0.2715 | 0.3249 | 0.3249    | 409.1489 |
+ | 1.0262        | 6.0   | 7578  | 0.8753          | 0.5117 | 0.2839 | 0.335  | 0.3354    | 417.8298 |
+ | 1.0023        | 7.0   | 8841  | 0.8620          | 0.507  | 0.2793 | 0.3288 | 0.3291    | 416.0993 |
+ | 0.9851        | 8.0   | 10104 | 0.8556          | 0.5178 | 0.2891 | 0.3386 | 0.3382    | 389.539  |
+ | 0.9943        | 9.0   | 11367 | 0.8570          | 0.5248 | 0.292  | 0.3405 | 0.3408    | 409.2482 |
+ | 0.9463        | 10.0  | 12630 | 0.7550          | 0.5243 | 0.2906 | 0.3329 | 0.3327    | 396.8511 |
+ | 0.9385        | 11.0  | 13893 | 0.7894          | 0.5377 | 0.3003 | 0.3442 | 0.3439    | 407.3333 |
+ | 0.9157        | 12.0  | 15156 | 0.7918          | 0.5449 | 0.3036 | 0.3424 | 0.342     | 415.4255 |
+ | 0.9378        | 13.0  | 16419 | 0.7920          | 0.5332 | 0.2935 | 0.3368 | 0.3365    | 421.4326 |
+ | 0.9194        | 14.0  | 17682 | 0.7898          | 0.5509 | 0.3087 | 0.3476 | 0.3474    | 406.3688 |
+ | 0.911         | 15.0  | 18945 | 0.7956          | 0.5361 | 0.2991 | 0.3403 | 0.3398    | 415.9362 |
+ | 0.8769        | 16.0  | 20208 | 0.7918          | 0.5433 | 0.3058 | 0.3459 | 0.3453    | 414.4184 |
+ | 0.8808        | 17.0  | 21471 | 0.7901          | 0.5445 | 0.3085 | 0.3492 | 0.3484    | 400.5177 |
+ | 0.8908        | 18.0  | 22734 | 0.7926          | 0.5404 | 0.3043 | 0.3427 | 0.3419    | 404.7801 |
+ | 0.8868        | 19.0  | 23997 | 0.7919          | 0.5449 | 0.3104 | 0.3494 | 0.3489    | 407.461  |
+ | 0.8868 | 20.0 | 25260 | 0.7929 | 0.5445 | 0.3112 | 0.3469 | 0.346 | 410.5957 |
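The headline metrics in the card match the epoch 20 row, so the card appears to report the final epoch rather than the best one. As a rough illustration of how the choice of metric changes which epoch looks best, the sketch below copies the epoch, Rouge1, and Rouge2 values from the table and picks the maximum of each. The `history` list and variable names are hypothetical, and whether the training run kept or restored any intermediate checkpoint is not stated in the source.

```python
# (epoch, rouge1, rouge2) copied from the training results table above.
history = [
    (1, 0.4329, 0.2051), (2, 0.4598, 0.2276), (3, 0.4943, 0.2629),
    (4, 0.4901, 0.2611), (5, 0.504, 0.2715), (6, 0.5117, 0.2839),
    (7, 0.507, 0.2793), (8, 0.5178, 0.2891), (9, 0.5248, 0.292),
    (10, 0.5243, 0.2906), (11, 0.5377, 0.3003), (12, 0.5449, 0.3036),
    (13, 0.5332, 0.2935), (14, 0.5509, 0.3087), (15, 0.5361, 0.2991),
    (16, 0.5433, 0.3058), (17, 0.5445, 0.3085), (18, 0.5404, 0.3043),
    (19, 0.5449, 0.3104), (20, 0.5445, 0.3112),
]

# Select the epoch with the highest value of each metric.
best_rouge1 = max(history, key=lambda row: row[1])
best_rouge2 = max(history, key=lambda row: row[2])

if __name__ == "__main__":
    print("best Rouge1:", best_rouge1)
    print("best Rouge2:", best_rouge2)
```

The two metrics disagree on the best epoch, and the differences among the last ten epochs are small, so a single evaluation run may not be enough to prefer one checkpoint over another.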
+
+
+ ### Framework versions
+
+ - Transformers 4.41.2
+ - Pytorch 2.3.0
+ - Datasets 3.6.0
+ - Tokenizers 0.19.1
generation_config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "decoder_start_token_id": 0,
+   "eos_token_id": 1,
+   "max_length": 500,
+   "pad_token_id": 0,
+   "transformers_version": "4.41.2"
+ }
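A small sketch of how these generation defaults could be read and sanity-checked with the standard library only. It checks that decoding starts from the pad token, which is the usual T5-family convention, and compares the `max_length` cap with the final "Gen Len" reported in the README. Treating Gen Len as an average length in tokens is an assumption, since the card does not define how it was computed.

```python
import json

# Contents of generation_config.json as added in this commit.
GENERATION_CONFIG = """
{
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "max_length": 500,
  "pad_token_id": 0,
  "transformers_version": "4.41.2"
}
"""

config = json.loads(GENERATION_CONFIG)

# T5-style decoders begin generation from the pad token.
starts_from_pad = config["decoder_start_token_id"] == config["pad_token_id"]

# Final "Gen Len" from the README; assumed to be an average length in tokens.
final_gen_len = 410.5957
headroom = config["max_length"] - final_gen_len

if __name__ == "__main__":
    print("decoder starts from pad token:", starts_from_pad)
    print("headroom below max_length:", round(headroom, 4))
```

The epoch 1 Gen Len of 483.3617 sits close to the cap of 500, so some early outputs may have been truncated by `max_length`. The source does not confirm this.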
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:25c6f4f977018ea4ec1fb5a20a1fa93a92bfae31eb782f6f0c16bf4cde1ba1a0
+ oid sha256:8ff6287e9c6bf68c118ab69a3bb5bcbb8cc38350dbba8b7d77a9fad830caa95d
  size 1187780840
  size 1187780840
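Only the `oid` line changes in this hunk: the weights file keeps the same size (1187780840 bytes) but has new contents. A downloaded `model.safetensors` can be checked against the new pointer by comparing its size and SHA-256 digest. The sketch below is one way to do that with the standard library. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path is whatever the reader downloaded.

```python
import hashlib

# New Git LFS pointer for model.safetensors from this commit.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:8ff6287e9c6bf68c118ab69a3bb5bcbb8cc38350dbba8b7d77a9fad830caa95d
size 1187780840
"""


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest named by `pointer`."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        # Stream in chunks so a file of over 1 GB is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(NEW_POINTER)

if __name__ == "__main__":
    print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("model.safetensors", pointer)` on a local copy would then indicate whether the file corresponds to this commit or to the parent.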