stefan-it committed
Commit 317d410 · 1 Parent(s): f76a0e5

Upload ./training.log with huggingface_hub

Files changed (1): training.log (+247, -0)

training.log ADDED
2023-11-16 03:28:06,601 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,603 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): XLMRobertaModel(
      (embeddings): XLMRobertaEmbeddings(
        (word_embeddings): Embedding(250003, 1024)
        (position_embeddings): Embedding(514, 1024, padding_idx=1)
        (token_type_embeddings): Embedding(1, 1024)
        (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): XLMRobertaEncoder(
        (layer): ModuleList(
          (0-23): 24 x XLMRobertaLayer(
            (attention): XLMRobertaAttention(
              (self): XLMRobertaSelfAttention(
                (query): Linear(in_features=1024, out_features=1024, bias=True)
                (key): Linear(in_features=1024, out_features=1024, bias=True)
                (value): Linear(in_features=1024, out_features=1024, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): XLMRobertaSelfOutput(
                (dense): Linear(in_features=1024, out_features=1024, bias=True)
                (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): XLMRobertaIntermediate(
              (dense): Linear(in_features=1024, out_features=4096, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): XLMRobertaOutput(
              (dense): Linear(in_features=4096, out_features=1024, bias=True)
              (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): XLMRobertaPooler(
        (dense): Linear(in_features=1024, out_features=1024, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1024, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-11-16 03:28:06,603 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,603 MultiCorpus: 30000 train + 10000 dev + 10000 test sentences
 - ColumnCorpus Corpus: 20000 train + 0 dev + 0 test sentences - /root/.flair/datasets/ner_multi_xtreme/en
 - ColumnCorpus Corpus: 10000 train + 10000 dev + 10000 test sentences - /root/.flair/datasets/ner_multi_xtreme/ka
2023-11-16 03:28:06,603 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,603 Train: 30000 sentences
2023-11-16 03:28:06,603 (train_with_dev=False, train_with_test=False)
2023-11-16 03:28:06,603 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Training Params:
2023-11-16 03:28:06,604 - learning_rate: "5e-06"
2023-11-16 03:28:06,604 - mini_batch_size: "4"
2023-11-16 03:28:06,604 - max_epochs: "10"
2023-11-16 03:28:06,604 - shuffle: "True"
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Plugins:
2023-11-16 03:28:06,604 - TensorboardLogger
2023-11-16 03:28:06,604 - LinearScheduler | warmup_fraction: '0.1'
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Final evaluation on model from best epoch (best-model.pt)
2023-11-16 03:28:06,604 - metric: "('micro avg', 'f1-score')"
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Computation:
2023-11-16 03:28:06,604 - compute on device: cuda:0
2023-11-16 03:28:06,604 - embedding storage: none
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Model training base path: "autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-3"
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Logging anything other than scalars to TensorBoard is currently not supported.
2023-11-16 03:29:38,193 epoch 1 - iter 750/7500 - loss 2.70469216 - time (sec): 91.59 - samples/sec: 264.85 - lr: 0.000000 - momentum: 0.000000
2023-11-16 03:31:09,213 epoch 1 - iter 1500/7500 - loss 2.24893654 - time (sec): 182.61 - samples/sec: 261.81 - lr: 0.000001 - momentum: 0.000000
2023-11-16 03:32:42,308 epoch 1 - iter 2250/7500 - loss 1.97006153 - time (sec): 275.70 - samples/sec: 260.33 - lr: 0.000001 - momentum: 0.000000
2023-11-16 03:34:16,815 epoch 1 - iter 3000/7500 - loss 1.72031860 - time (sec): 370.21 - samples/sec: 260.02 - lr: 0.000002 - momentum: 0.000000
2023-11-16 03:35:50,112 epoch 1 - iter 3750/7500 - loss 1.52308109 - time (sec): 463.51 - samples/sec: 259.42 - lr: 0.000002 - momentum: 0.000000
2023-11-16 03:37:23,760 epoch 1 - iter 4500/7500 - loss 1.36457847 - time (sec): 557.15 - samples/sec: 259.48 - lr: 0.000003 - momentum: 0.000000
2023-11-16 03:38:57,168 epoch 1 - iter 5250/7500 - loss 1.24407079 - time (sec): 650.56 - samples/sec: 259.07 - lr: 0.000003 - momentum: 0.000000
2023-11-16 03:40:28,972 epoch 1 - iter 6000/7500 - loss 1.15260515 - time (sec): 742.37 - samples/sec: 259.75 - lr: 0.000004 - momentum: 0.000000
2023-11-16 03:42:03,894 epoch 1 - iter 6750/7500 - loss 1.07519645 - time (sec): 837.29 - samples/sec: 258.95 - lr: 0.000004 - momentum: 0.000000
2023-11-16 03:43:39,060 epoch 1 - iter 7500/7500 - loss 1.01557427 - time (sec): 932.45 - samples/sec: 258.24 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:43:39,062 ----------------------------------------------------------------------------------------------------
2023-11-16 03:43:39,063 EPOCH 1 done: loss 1.0156 - lr: 0.000005
2023-11-16 03:44:06,229 DEV : loss 0.27559971809387207 - f1-score (micro avg) 0.8152
2023-11-16 03:44:08,725 saving best model
2023-11-16 03:44:10,470 ----------------------------------------------------------------------------------------------------
2023-11-16 03:45:42,474 epoch 2 - iter 750/7500 - loss 0.39106376 - time (sec): 92.00 - samples/sec: 261.03 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:47:15,847 epoch 2 - iter 1500/7500 - loss 0.40555598 - time (sec): 185.37 - samples/sec: 261.97 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:48:49,533 epoch 2 - iter 2250/7500 - loss 0.40652252 - time (sec): 279.06 - samples/sec: 260.36 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:50:24,376 epoch 2 - iter 3000/7500 - loss 0.40712357 - time (sec): 373.90 - samples/sec: 258.58 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:52:01,501 epoch 2 - iter 3750/7500 - loss 0.40345429 - time (sec): 471.03 - samples/sec: 256.65 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:53:38,242 epoch 2 - iter 4500/7500 - loss 0.40372313 - time (sec): 567.77 - samples/sec: 255.87 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:55:11,702 epoch 2 - iter 5250/7500 - loss 0.40504927 - time (sec): 661.23 - samples/sec: 255.50 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:56:44,579 epoch 2 - iter 6000/7500 - loss 0.40569421 - time (sec): 754.11 - samples/sec: 256.15 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:58:17,886 epoch 2 - iter 6750/7500 - loss 0.40571892 - time (sec): 847.41 - samples/sec: 256.18 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:59:50,847 epoch 2 - iter 7500/7500 - loss 0.40365851 - time (sec): 940.37 - samples/sec: 256.06 - lr: 0.000004 - momentum: 0.000000
2023-11-16 03:59:50,849 ----------------------------------------------------------------------------------------------------
2023-11-16 03:59:50,849 EPOCH 2 done: loss 0.4037 - lr: 0.000004
2023-11-16 04:00:17,681 DEV : loss 0.271997332572937 - f1-score (micro avg) 0.8697
2023-11-16 04:00:20,070 saving best model
2023-11-16 04:00:23,060 ----------------------------------------------------------------------------------------------------
2023-11-16 04:01:57,142 epoch 3 - iter 750/7500 - loss 0.34646794 - time (sec): 94.08 - samples/sec: 250.74 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:03:32,257 epoch 3 - iter 1500/7500 - loss 0.33277165 - time (sec): 189.19 - samples/sec: 253.91 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:05:06,742 epoch 3 - iter 2250/7500 - loss 0.34013081 - time (sec): 283.68 - samples/sec: 253.23 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:06:41,133 epoch 3 - iter 3000/7500 - loss 0.33864371 - time (sec): 378.07 - samples/sec: 253.41 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:08:14,833 epoch 3 - iter 3750/7500 - loss 0.34190452 - time (sec): 471.77 - samples/sec: 254.37 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:09:45,391 epoch 3 - iter 4500/7500 - loss 0.34219639 - time (sec): 562.33 - samples/sec: 256.12 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:11:18,334 epoch 3 - iter 5250/7500 - loss 0.34365478 - time (sec): 655.27 - samples/sec: 256.94 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:12:52,829 epoch 3 - iter 6000/7500 - loss 0.34431528 - time (sec): 749.76 - samples/sec: 256.24 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:14:25,065 epoch 3 - iter 6750/7500 - loss 0.34309773 - time (sec): 842.00 - samples/sec: 257.59 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:15:57,201 epoch 3 - iter 7500/7500 - loss 0.34251715 - time (sec): 934.14 - samples/sec: 257.77 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:15:57,204 ----------------------------------------------------------------------------------------------------
2023-11-16 04:15:57,204 EPOCH 3 done: loss 0.3425 - lr: 0.000004
2023-11-16 04:16:24,728 DEV : loss 0.2714731991291046 - f1-score (micro avg) 0.8842
2023-11-16 04:16:27,191 saving best model
2023-11-16 04:16:29,639 ----------------------------------------------------------------------------------------------------
2023-11-16 04:18:06,042 epoch 4 - iter 750/7500 - loss 0.29074268 - time (sec): 96.40 - samples/sec: 252.40 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:19:39,895 epoch 4 - iter 1500/7500 - loss 0.29294947 - time (sec): 190.25 - samples/sec: 256.92 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:21:12,116 epoch 4 - iter 2250/7500 - loss 0.29693683 - time (sec): 282.47 - samples/sec: 257.67 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:22:43,053 epoch 4 - iter 3000/7500 - loss 0.29670062 - time (sec): 373.41 - samples/sec: 259.41 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:24:17,342 epoch 4 - iter 3750/7500 - loss 0.29561519 - time (sec): 467.70 - samples/sec: 257.80 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:25:50,373 epoch 4 - iter 4500/7500 - loss 0.29194840 - time (sec): 560.73 - samples/sec: 258.18 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:27:24,822 epoch 4 - iter 5250/7500 - loss 0.29857267 - time (sec): 655.18 - samples/sec: 257.96 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:28:58,980 epoch 4 - iter 6000/7500 - loss 0.30018714 - time (sec): 749.34 - samples/sec: 257.24 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:30:33,294 epoch 4 - iter 6750/7500 - loss 0.30336094 - time (sec): 843.65 - samples/sec: 257.08 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:32:07,976 epoch 4 - iter 7500/7500 - loss 0.30240959 - time (sec): 938.33 - samples/sec: 256.62 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:32:07,980 ----------------------------------------------------------------------------------------------------
2023-11-16 04:32:07,980 EPOCH 4 done: loss 0.3024 - lr: 0.000003
2023-11-16 04:32:35,569 DEV : loss 0.2897871732711792 - f1-score (micro avg) 0.8922
2023-11-16 04:32:38,075 saving best model
2023-11-16 04:32:40,983 ----------------------------------------------------------------------------------------------------
2023-11-16 04:34:14,736 epoch 5 - iter 750/7500 - loss 0.22168761 - time (sec): 93.75 - samples/sec: 260.88 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:35:48,275 epoch 5 - iter 1500/7500 - loss 0.23358638 - time (sec): 187.29 - samples/sec: 258.77 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:37:23,490 epoch 5 - iter 2250/7500 - loss 0.24130242 - time (sec): 282.50 - samples/sec: 256.72 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:38:57,959 epoch 5 - iter 3000/7500 - loss 0.24848714 - time (sec): 376.97 - samples/sec: 257.38 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:40:32,648 epoch 5 - iter 3750/7500 - loss 0.25384312 - time (sec): 471.66 - samples/sec: 255.68 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:42:08,140 epoch 5 - iter 4500/7500 - loss 0.25352346 - time (sec): 567.15 - samples/sec: 254.83 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:43:42,002 epoch 5 - iter 5250/7500 - loss 0.25599881 - time (sec): 661.01 - samples/sec: 255.09 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:45:13,434 epoch 5 - iter 6000/7500 - loss 0.25515887 - time (sec): 752.45 - samples/sec: 255.70 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:46:47,959 epoch 5 - iter 6750/7500 - loss 0.25539887 - time (sec): 846.97 - samples/sec: 255.87 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:48:21,835 epoch 5 - iter 7500/7500 - loss 0.25660205 - time (sec): 940.85 - samples/sec: 255.94 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:48:21,838 ----------------------------------------------------------------------------------------------------
2023-11-16 04:48:21,838 EPOCH 5 done: loss 0.2566 - lr: 0.000003
2023-11-16 04:48:49,130 DEV : loss 0.28101304173469543 - f1-score (micro avg) 0.8973
2023-11-16 04:48:51,696 saving best model
2023-11-16 04:48:53,741 ----------------------------------------------------------------------------------------------------
2023-11-16 04:50:26,277 epoch 6 - iter 750/7500 - loss 0.22465859 - time (sec): 92.53 - samples/sec: 255.27 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:52:00,738 epoch 6 - iter 1500/7500 - loss 0.21970656 - time (sec): 186.99 - samples/sec: 254.83 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:53:34,911 epoch 6 - iter 2250/7500 - loss 0.21946764 - time (sec): 281.17 - samples/sec: 255.61 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:55:09,828 epoch 6 - iter 3000/7500 - loss 0.21638489 - time (sec): 376.08 - samples/sec: 255.02 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:56:43,313 epoch 6 - iter 3750/7500 - loss 0.21414458 - time (sec): 469.57 - samples/sec: 255.78 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:58:15,828 epoch 6 - iter 4500/7500 - loss 0.21434532 - time (sec): 562.08 - samples/sec: 256.74 - lr: 0.000002 - momentum: 0.000000
2023-11-16 04:59:47,824 epoch 6 - iter 5250/7500 - loss 0.21772911 - time (sec): 654.08 - samples/sec: 257.12 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:01:19,648 epoch 6 - iter 6000/7500 - loss 0.21657089 - time (sec): 745.90 - samples/sec: 257.68 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:02:53,268 epoch 6 - iter 6750/7500 - loss 0.21549326 - time (sec): 839.52 - samples/sec: 257.74 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:04:26,550 epoch 6 - iter 7500/7500 - loss 0.21351207 - time (sec): 932.80 - samples/sec: 258.14 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:04:26,555 ----------------------------------------------------------------------------------------------------
2023-11-16 05:04:26,555 EPOCH 6 done: loss 0.2135 - lr: 0.000002
2023-11-16 05:04:53,798 DEV : loss 0.3079068958759308 - f1-score (micro avg) 0.9002
2023-11-16 05:04:56,055 saving best model
2023-11-16 05:04:58,666 ----------------------------------------------------------------------------------------------------
2023-11-16 05:06:33,192 epoch 7 - iter 750/7500 - loss 0.18268097 - time (sec): 94.52 - samples/sec: 251.77 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:08:05,085 epoch 7 - iter 1500/7500 - loss 0.18175139 - time (sec): 186.42 - samples/sec: 257.03 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:09:42,400 epoch 7 - iter 2250/7500 - loss 0.19001507 - time (sec): 283.73 - samples/sec: 252.39 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:11:16,393 epoch 7 - iter 3000/7500 - loss 0.18641112 - time (sec): 377.72 - samples/sec: 253.16 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:12:51,038 epoch 7 - iter 3750/7500 - loss 0.18515279 - time (sec): 472.37 - samples/sec: 253.66 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:14:26,806 epoch 7 - iter 4500/7500 - loss 0.18525402 - time (sec): 568.14 - samples/sec: 253.77 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:16:01,622 epoch 7 - iter 5250/7500 - loss 0.18863436 - time (sec): 662.95 - samples/sec: 253.50 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:17:35,258 epoch 7 - iter 6000/7500 - loss 0.18494686 - time (sec): 756.59 - samples/sec: 253.73 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:19:09,299 epoch 7 - iter 6750/7500 - loss 0.18556342 - time (sec): 850.63 - samples/sec: 254.64 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:20:43,065 epoch 7 - iter 7500/7500 - loss 0.18644460 - time (sec): 944.40 - samples/sec: 254.97 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:20:43,069 ----------------------------------------------------------------------------------------------------
2023-11-16 05:20:43,069 EPOCH 7 done: loss 0.1864 - lr: 0.000002
2023-11-16 05:21:10,981 DEV : loss 0.2802160382270813 - f1-score (micro avg) 0.9048
2023-11-16 05:21:13,241 saving best model
2023-11-16 05:21:15,612 ----------------------------------------------------------------------------------------------------
2023-11-16 05:22:49,896 epoch 8 - iter 750/7500 - loss 0.14122739 - time (sec): 94.28 - samples/sec: 259.14 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:24:23,137 epoch 8 - iter 1500/7500 - loss 0.14874139 - time (sec): 187.52 - samples/sec: 258.77 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:25:56,602 epoch 8 - iter 2250/7500 - loss 0.15341856 - time (sec): 280.99 - samples/sec: 257.70 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:27:28,992 epoch 8 - iter 3000/7500 - loss 0.15416389 - time (sec): 373.38 - samples/sec: 258.36 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:29:03,053 epoch 8 - iter 3750/7500 - loss 0.15634692 - time (sec): 467.44 - samples/sec: 257.36 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:30:37,470 epoch 8 - iter 4500/7500 - loss 0.15700278 - time (sec): 561.85 - samples/sec: 256.31 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:32:10,851 epoch 8 - iter 5250/7500 - loss 0.15692674 - time (sec): 655.24 - samples/sec: 256.36 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:33:44,315 epoch 8 - iter 6000/7500 - loss 0.15879525 - time (sec): 748.70 - samples/sec: 256.96 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:35:16,930 epoch 8 - iter 6750/7500 - loss 0.15726830 - time (sec): 841.31 - samples/sec: 257.53 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:36:52,298 epoch 8 - iter 7500/7500 - loss 0.15647824 - time (sec): 936.68 - samples/sec: 257.07 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:36:52,300 ----------------------------------------------------------------------------------------------------
2023-11-16 05:36:52,301 EPOCH 8 done: loss 0.1565 - lr: 0.000001
2023-11-16 05:37:19,973 DEV : loss 0.3105733096599579 - f1-score (micro avg) 0.9056
2023-11-16 05:37:21,975 saving best model
2023-11-16 05:37:24,268 ----------------------------------------------------------------------------------------------------
2023-11-16 05:38:57,543 epoch 9 - iter 750/7500 - loss 0.13578701 - time (sec): 93.27 - samples/sec: 260.82 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:40:30,477 epoch 9 - iter 1500/7500 - loss 0.13977943 - time (sec): 186.21 - samples/sec: 258.36 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:42:04,338 epoch 9 - iter 2250/7500 - loss 0.13579281 - time (sec): 280.07 - samples/sec: 257.18 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:43:38,134 epoch 9 - iter 3000/7500 - loss 0.13083188 - time (sec): 373.86 - samples/sec: 257.67 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:45:11,777 epoch 9 - iter 3750/7500 - loss 0.13761002 - time (sec): 467.51 - samples/sec: 257.61 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:46:46,321 epoch 9 - iter 4500/7500 - loss 0.13992387 - time (sec): 562.05 - samples/sec: 256.71 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:48:20,030 epoch 9 - iter 5250/7500 - loss 0.13868841 - time (sec): 655.76 - samples/sec: 256.12 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:49:52,363 epoch 9 - iter 6000/7500 - loss 0.13924211 - time (sec): 748.09 - samples/sec: 256.89 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:51:24,421 epoch 9 - iter 6750/7500 - loss 0.13714285 - time (sec): 840.15 - samples/sec: 257.44 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:52:57,141 epoch 9 - iter 7500/7500 - loss 0.13574777 - time (sec): 932.87 - samples/sec: 258.12 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:52:57,144 ----------------------------------------------------------------------------------------------------
2023-11-16 05:52:57,144 EPOCH 9 done: loss 0.1357 - lr: 0.000001
2023-11-16 05:53:24,122 DEV : loss 0.30354949831962585 - f1-score (micro avg) 0.9069
2023-11-16 05:53:26,256 saving best model
2023-11-16 05:53:28,623 ----------------------------------------------------------------------------------------------------
2023-11-16 05:55:01,065 epoch 10 - iter 750/7500 - loss 0.11662424 - time (sec): 92.44 - samples/sec: 258.78 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:56:34,198 epoch 10 - iter 1500/7500 - loss 0.10739844 - time (sec): 185.57 - samples/sec: 260.22 - lr: 0.000000 - momentum: 0.000000
2023-11-16 05:58:07,104 epoch 10 - iter 2250/7500 - loss 0.11728002 - time (sec): 278.48 - samples/sec: 261.23 - lr: 0.000000 - momentum: 0.000000
2023-11-16 05:59:38,545 epoch 10 - iter 3000/7500 - loss 0.11111246 - time (sec): 369.92 - samples/sec: 263.47 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:01:13,020 epoch 10 - iter 3750/7500 - loss 0.11185424 - time (sec): 464.39 - samples/sec: 261.72 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:02:47,617 epoch 10 - iter 4500/7500 - loss 0.11443883 - time (sec): 558.99 - samples/sec: 260.01 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:04:22,079 epoch 10 - iter 5250/7500 - loss 0.11684866 - time (sec): 653.45 - samples/sec: 259.18 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:05:55,899 epoch 10 - iter 6000/7500 - loss 0.11690532 - time (sec): 747.27 - samples/sec: 258.80 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:07:29,604 epoch 10 - iter 6750/7500 - loss 0.11669926 - time (sec): 840.98 - samples/sec: 258.11 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:09:02,429 epoch 10 - iter 7500/7500 - loss 0.11723510 - time (sec): 933.80 - samples/sec: 257.87 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:09:02,432 ----------------------------------------------------------------------------------------------------
2023-11-16 06:09:02,432 EPOCH 10 done: loss 0.1172 - lr: 0.000000
2023-11-16 06:09:29,736 DEV : loss 0.3160940110683441 - f1-score (micro avg) 0.9064
2023-11-16 06:09:34,595 ----------------------------------------------------------------------------------------------------
2023-11-16 06:09:34,598 Loading model from best epoch ...
2023-11-16 06:09:44,517 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-PER, B-PER, E-PER, I-PER
2023-11-16 06:10:13,551
Results:
- F-score (micro) 0.9076
- F-score (macro) 0.9067
- Accuracy 0.8601

By class:
              precision    recall  f1-score   support

         LOC     0.9066    0.9143    0.9105      5288
         PER     0.9231    0.9485    0.9356      3962
         ORG     0.8737    0.8742    0.8739      3807

   micro avg     0.9022    0.9130    0.9076     13057
   macro avg     0.9012    0.9123    0.9067     13057
weighted avg     0.9020    0.9130    0.9075     13057

2023-11-16 06:10:13,551 ----------------------------------------------------------------------------------------------------
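The per-iteration lines above follow a fixed field layout (timestamp, epoch, iteration, loss, elapsed seconds, throughput, learning rate, momentum). As a minimal sketch of how such lines could be pulled into structured records for plotting or analysis, assuming only the format visible in this log (the name `parse_iter_line` is illustrative, not part of Flair):

```python
import re

# Matches iteration lines of the form seen in this log, e.g.:
# 2023-11-16 03:43:39,060 epoch 1 - iter 7500/7500 - loss 1.01557427 - time (sec): 932.45 - samples/sec: 258.24 - lr: 0.000005 - momentum: 0.000000
ITER_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"epoch (?P<epoch>\d+) - iter (?P<iter>\d+)/(?P<total>\d+) - "
    r"loss (?P<loss>[\d.]+) - time \(sec\): (?P<time>[\d.]+) - "
    r"samples/sec: (?P<sps>[\d.]+) - lr: (?P<lr>[\d.]+) - momentum: (?P<mom>[\d.]+)$"
)

def parse_iter_line(line: str):
    """Return a dict of typed fields for one iteration line, or None if it doesn't match."""
    m = ITER_RE.match(line.strip())
    if not m:
        return None
    d = m.groupdict()
    return {
        "timestamp": d["ts"],
        "epoch": int(d["epoch"]),
        "iteration": int(d["iter"]),
        "total_iters": int(d["total"]),
        "loss": float(d["loss"]),
        "elapsed_sec": float(d["time"]),
        "samples_per_sec": float(d["sps"]),
        "lr": float(d["lr"]),
        "momentum": float(d["mom"]),
    }

line = ("2023-11-16 03:43:39,060 epoch 1 - iter 7500/7500 - loss 1.01557427 "
        "- time (sec): 932.45 - samples/sec: 258.24 - lr: 0.000005 - momentum: 0.000000")
rec = parse_iter_line(line)
```

Separator rows, `EPOCH … done`, and `DEV :` lines simply fail the match and return `None`, so the parser can be run over the whole file line by line.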