Upload ./training.log with huggingface_hub
2023-11-16 03:28:06,601 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,603 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): XLMRobertaModel(
      (embeddings): XLMRobertaEmbeddings(
        (word_embeddings): Embedding(250003, 1024)
        (position_embeddings): Embedding(514, 1024, padding_idx=1)
        (token_type_embeddings): Embedding(1, 1024)
        (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): XLMRobertaEncoder(
        (layer): ModuleList(
          (0-23): 24 x XLMRobertaLayer(
            (attention): XLMRobertaAttention(
              (self): XLMRobertaSelfAttention(
                (query): Linear(in_features=1024, out_features=1024, bias=True)
                (key): Linear(in_features=1024, out_features=1024, bias=True)
                (value): Linear(in_features=1024, out_features=1024, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): XLMRobertaSelfOutput(
                (dense): Linear(in_features=1024, out_features=1024, bias=True)
                (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): XLMRobertaIntermediate(
              (dense): Linear(in_features=1024, out_features=4096, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): XLMRobertaOutput(
              (dense): Linear(in_features=4096, out_features=1024, bias=True)
              (LayerNorm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): XLMRobertaPooler(
        (dense): Linear(in_features=1024, out_features=1024, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=1024, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-11-16 03:28:06,603 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,603 MultiCorpus: 30000 train + 10000 dev + 10000 test sentences
 - ColumnCorpus Corpus: 20000 train + 0 dev + 0 test sentences - /root/.flair/datasets/ner_multi_xtreme/en
 - ColumnCorpus Corpus: 10000 train + 10000 dev + 10000 test sentences - /root/.flair/datasets/ner_multi_xtreme/ka
2023-11-16 03:28:06,603 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,603 Train: 30000 sentences
2023-11-16 03:28:06,603 (train_with_dev=False, train_with_test=False)
2023-11-16 03:28:06,603 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Training Params:
2023-11-16 03:28:06,604 - learning_rate: "5e-06"
2023-11-16 03:28:06,604 - mini_batch_size: "4"
2023-11-16 03:28:06,604 - max_epochs: "10"
2023-11-16 03:28:06,604 - shuffle: "True"
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Plugins:
2023-11-16 03:28:06,604 - TensorboardLogger
2023-11-16 03:28:06,604 - LinearScheduler | warmup_fraction: '0.1'
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Final evaluation on model from best epoch (best-model.pt)
2023-11-16 03:28:06,604 - metric: "('micro avg', 'f1-score')"
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Computation:
2023-11-16 03:28:06,604 - compute on device: cuda:0
2023-11-16 03:28:06,604 - embedding storage: none
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Model training base path: "autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-3"
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 ----------------------------------------------------------------------------------------------------
2023-11-16 03:28:06,604 Logging anything other than scalars to TensorBoard is currently not supported.
2023-11-16 03:29:38,193 epoch 1 - iter 750/7500 - loss 2.70469216 - time (sec): 91.59 - samples/sec: 264.85 - lr: 0.000000 - momentum: 0.000000
2023-11-16 03:31:09,213 epoch 1 - iter 1500/7500 - loss 2.24893654 - time (sec): 182.61 - samples/sec: 261.81 - lr: 0.000001 - momentum: 0.000000
2023-11-16 03:32:42,308 epoch 1 - iter 2250/7500 - loss 1.97006153 - time (sec): 275.70 - samples/sec: 260.33 - lr: 0.000001 - momentum: 0.000000
2023-11-16 03:34:16,815 epoch 1 - iter 3000/7500 - loss 1.72031860 - time (sec): 370.21 - samples/sec: 260.02 - lr: 0.000002 - momentum: 0.000000
2023-11-16 03:35:50,112 epoch 1 - iter 3750/7500 - loss 1.52308109 - time (sec): 463.51 - samples/sec: 259.42 - lr: 0.000002 - momentum: 0.000000
2023-11-16 03:37:23,760 epoch 1 - iter 4500/7500 - loss 1.36457847 - time (sec): 557.15 - samples/sec: 259.48 - lr: 0.000003 - momentum: 0.000000
2023-11-16 03:38:57,168 epoch 1 - iter 5250/7500 - loss 1.24407079 - time (sec): 650.56 - samples/sec: 259.07 - lr: 0.000003 - momentum: 0.000000
2023-11-16 03:40:28,972 epoch 1 - iter 6000/7500 - loss 1.15260515 - time (sec): 742.37 - samples/sec: 259.75 - lr: 0.000004 - momentum: 0.000000
2023-11-16 03:42:03,894 epoch 1 - iter 6750/7500 - loss 1.07519645 - time (sec): 837.29 - samples/sec: 258.95 - lr: 0.000004 - momentum: 0.000000
2023-11-16 03:43:39,060 epoch 1 - iter 7500/7500 - loss 1.01557427 - time (sec): 932.45 - samples/sec: 258.24 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:43:39,062 ----------------------------------------------------------------------------------------------------
2023-11-16 03:43:39,063 EPOCH 1 done: loss 1.0156 - lr: 0.000005
2023-11-16 03:44:06,229 DEV : loss 0.27559971809387207 - f1-score (micro avg) 0.8152
2023-11-16 03:44:08,725 saving best model
2023-11-16 03:44:10,470 ----------------------------------------------------------------------------------------------------
2023-11-16 03:45:42,474 epoch 2 - iter 750/7500 - loss 0.39106376 - time (sec): 92.00 - samples/sec: 261.03 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:47:15,847 epoch 2 - iter 1500/7500 - loss 0.40555598 - time (sec): 185.37 - samples/sec: 261.97 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:48:49,533 epoch 2 - iter 2250/7500 - loss 0.40652252 - time (sec): 279.06 - samples/sec: 260.36 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:50:24,376 epoch 2 - iter 3000/7500 - loss 0.40712357 - time (sec): 373.90 - samples/sec: 258.58 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:52:01,501 epoch 2 - iter 3750/7500 - loss 0.40345429 - time (sec): 471.03 - samples/sec: 256.65 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:53:38,242 epoch 2 - iter 4500/7500 - loss 0.40372313 - time (sec): 567.77 - samples/sec: 255.87 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:55:11,702 epoch 2 - iter 5250/7500 - loss 0.40504927 - time (sec): 661.23 - samples/sec: 255.50 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:56:44,579 epoch 2 - iter 6000/7500 - loss 0.40569421 - time (sec): 754.11 - samples/sec: 256.15 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:58:17,886 epoch 2 - iter 6750/7500 - loss 0.40571892 - time (sec): 847.41 - samples/sec: 256.18 - lr: 0.000005 - momentum: 0.000000
2023-11-16 03:59:50,847 epoch 2 - iter 7500/7500 - loss 0.40365851 - time (sec): 940.37 - samples/sec: 256.06 - lr: 0.000004 - momentum: 0.000000
2023-11-16 03:59:50,849 ----------------------------------------------------------------------------------------------------
2023-11-16 03:59:50,849 EPOCH 2 done: loss 0.4037 - lr: 0.000004
2023-11-16 04:00:17,681 DEV : loss 0.271997332572937 - f1-score (micro avg) 0.8697
2023-11-16 04:00:20,070 saving best model
2023-11-16 04:00:23,060 ----------------------------------------------------------------------------------------------------
2023-11-16 04:01:57,142 epoch 3 - iter 750/7500 - loss 0.34646794 - time (sec): 94.08 - samples/sec: 250.74 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:03:32,257 epoch 3 - iter 1500/7500 - loss 0.33277165 - time (sec): 189.19 - samples/sec: 253.91 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:05:06,742 epoch 3 - iter 2250/7500 - loss 0.34013081 - time (sec): 283.68 - samples/sec: 253.23 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:06:41,133 epoch 3 - iter 3000/7500 - loss 0.33864371 - time (sec): 378.07 - samples/sec: 253.41 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:08:14,833 epoch 3 - iter 3750/7500 - loss 0.34190452 - time (sec): 471.77 - samples/sec: 254.37 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:09:45,391 epoch 3 - iter 4500/7500 - loss 0.34219639 - time (sec): 562.33 - samples/sec: 256.12 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:11:18,334 epoch 3 - iter 5250/7500 - loss 0.34365478 - time (sec): 655.27 - samples/sec: 256.94 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:12:52,829 epoch 3 - iter 6000/7500 - loss 0.34431528 - time (sec): 749.76 - samples/sec: 256.24 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:14:25,065 epoch 3 - iter 6750/7500 - loss 0.34309773 - time (sec): 842.00 - samples/sec: 257.59 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:15:57,201 epoch 3 - iter 7500/7500 - loss 0.34251715 - time (sec): 934.14 - samples/sec: 257.77 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:15:57,204 ----------------------------------------------------------------------------------------------------
2023-11-16 04:15:57,204 EPOCH 3 done: loss 0.3425 - lr: 0.000004
2023-11-16 04:16:24,728 DEV : loss 0.2714731991291046 - f1-score (micro avg) 0.8842
2023-11-16 04:16:27,191 saving best model
2023-11-16 04:16:29,639 ----------------------------------------------------------------------------------------------------
2023-11-16 04:18:06,042 epoch 4 - iter 750/7500 - loss 0.29074268 - time (sec): 96.40 - samples/sec: 252.40 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:19:39,895 epoch 4 - iter 1500/7500 - loss 0.29294947 - time (sec): 190.25 - samples/sec: 256.92 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:21:12,116 epoch 4 - iter 2250/7500 - loss 0.29693683 - time (sec): 282.47 - samples/sec: 257.67 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:22:43,053 epoch 4 - iter 3000/7500 - loss 0.29670062 - time (sec): 373.41 - samples/sec: 259.41 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:24:17,342 epoch 4 - iter 3750/7500 - loss 0.29561519 - time (sec): 467.70 - samples/sec: 257.80 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:25:50,373 epoch 4 - iter 4500/7500 - loss 0.29194840 - time (sec): 560.73 - samples/sec: 258.18 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:27:24,822 epoch 4 - iter 5250/7500 - loss 0.29857267 - time (sec): 655.18 - samples/sec: 257.96 - lr: 0.000004 - momentum: 0.000000
2023-11-16 04:28:58,980 epoch 4 - iter 6000/7500 - loss 0.30018714 - time (sec): 749.34 - samples/sec: 257.24 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:30:33,294 epoch 4 - iter 6750/7500 - loss 0.30336094 - time (sec): 843.65 - samples/sec: 257.08 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:32:07,976 epoch 4 - iter 7500/7500 - loss 0.30240959 - time (sec): 938.33 - samples/sec: 256.62 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:32:07,980 ----------------------------------------------------------------------------------------------------
2023-11-16 04:32:07,980 EPOCH 4 done: loss 0.3024 - lr: 0.000003
2023-11-16 04:32:35,569 DEV : loss 0.2897871732711792 - f1-score (micro avg) 0.8922
2023-11-16 04:32:38,075 saving best model
2023-11-16 04:32:40,983 ----------------------------------------------------------------------------------------------------
2023-11-16 04:34:14,736 epoch 5 - iter 750/7500 - loss 0.22168761 - time (sec): 93.75 - samples/sec: 260.88 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:35:48,275 epoch 5 - iter 1500/7500 - loss 0.23358638 - time (sec): 187.29 - samples/sec: 258.77 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:37:23,490 epoch 5 - iter 2250/7500 - loss 0.24130242 - time (sec): 282.50 - samples/sec: 256.72 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:38:57,959 epoch 5 - iter 3000/7500 - loss 0.24848714 - time (sec): 376.97 - samples/sec: 257.38 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:40:32,648 epoch 5 - iter 3750/7500 - loss 0.25384312 - time (sec): 471.66 - samples/sec: 255.68 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:42:08,140 epoch 5 - iter 4500/7500 - loss 0.25352346 - time (sec): 567.15 - samples/sec: 254.83 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:43:42,002 epoch 5 - iter 5250/7500 - loss 0.25599881 - time (sec): 661.01 - samples/sec: 255.09 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:45:13,434 epoch 5 - iter 6000/7500 - loss 0.25515887 - time (sec): 752.45 - samples/sec: 255.70 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:46:47,959 epoch 5 - iter 6750/7500 - loss 0.25539887 - time (sec): 846.97 - samples/sec: 255.87 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:48:21,835 epoch 5 - iter 7500/7500 - loss 0.25660205 - time (sec): 940.85 - samples/sec: 255.94 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:48:21,838 ----------------------------------------------------------------------------------------------------
2023-11-16 04:48:21,838 EPOCH 5 done: loss 0.2566 - lr: 0.000003
2023-11-16 04:48:49,130 DEV : loss 0.28101304173469543 - f1-score (micro avg) 0.8973
2023-11-16 04:48:51,696 saving best model
2023-11-16 04:48:53,741 ----------------------------------------------------------------------------------------------------
2023-11-16 04:50:26,277 epoch 6 - iter 750/7500 - loss 0.22465859 - time (sec): 92.53 - samples/sec: 255.27 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:52:00,738 epoch 6 - iter 1500/7500 - loss 0.21970656 - time (sec): 186.99 - samples/sec: 254.83 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:53:34,911 epoch 6 - iter 2250/7500 - loss 0.21946764 - time (sec): 281.17 - samples/sec: 255.61 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:55:09,828 epoch 6 - iter 3000/7500 - loss 0.21638489 - time (sec): 376.08 - samples/sec: 255.02 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:56:43,313 epoch 6 - iter 3750/7500 - loss 0.21414458 - time (sec): 469.57 - samples/sec: 255.78 - lr: 0.000003 - momentum: 0.000000
2023-11-16 04:58:15,828 epoch 6 - iter 4500/7500 - loss 0.21434532 - time (sec): 562.08 - samples/sec: 256.74 - lr: 0.000002 - momentum: 0.000000
2023-11-16 04:59:47,824 epoch 6 - iter 5250/7500 - loss 0.21772911 - time (sec): 654.08 - samples/sec: 257.12 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:01:19,648 epoch 6 - iter 6000/7500 - loss 0.21657089 - time (sec): 745.90 - samples/sec: 257.68 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:02:53,268 epoch 6 - iter 6750/7500 - loss 0.21549326 - time (sec): 839.52 - samples/sec: 257.74 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:04:26,550 epoch 6 - iter 7500/7500 - loss 0.21351207 - time (sec): 932.80 - samples/sec: 258.14 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:04:26,555 ----------------------------------------------------------------------------------------------------
2023-11-16 05:04:26,555 EPOCH 6 done: loss 0.2135 - lr: 0.000002
2023-11-16 05:04:53,798 DEV : loss 0.3079068958759308 - f1-score (micro avg) 0.9002
2023-11-16 05:04:56,055 saving best model
2023-11-16 05:04:58,666 ----------------------------------------------------------------------------------------------------
2023-11-16 05:06:33,192 epoch 7 - iter 750/7500 - loss 0.18268097 - time (sec): 94.52 - samples/sec: 251.77 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:08:05,085 epoch 7 - iter 1500/7500 - loss 0.18175139 - time (sec): 186.42 - samples/sec: 257.03 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:09:42,400 epoch 7 - iter 2250/7500 - loss 0.19001507 - time (sec): 283.73 - samples/sec: 252.39 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:11:16,393 epoch 7 - iter 3000/7500 - loss 0.18641112 - time (sec): 377.72 - samples/sec: 253.16 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:12:51,038 epoch 7 - iter 3750/7500 - loss 0.18515279 - time (sec): 472.37 - samples/sec: 253.66 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:14:26,806 epoch 7 - iter 4500/7500 - loss 0.18525402 - time (sec): 568.14 - samples/sec: 253.77 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:16:01,622 epoch 7 - iter 5250/7500 - loss 0.18863436 - time (sec): 662.95 - samples/sec: 253.50 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:17:35,258 epoch 7 - iter 6000/7500 - loss 0.18494686 - time (sec): 756.59 - samples/sec: 253.73 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:19:09,299 epoch 7 - iter 6750/7500 - loss 0.18556342 - time (sec): 850.63 - samples/sec: 254.64 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:20:43,065 epoch 7 - iter 7500/7500 - loss 0.18644460 - time (sec): 944.40 - samples/sec: 254.97 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:20:43,069 ----------------------------------------------------------------------------------------------------
2023-11-16 05:20:43,069 EPOCH 7 done: loss 0.1864 - lr: 0.000002
2023-11-16 05:21:10,981 DEV : loss 0.2802160382270813 - f1-score (micro avg) 0.9048
2023-11-16 05:21:13,241 saving best model
2023-11-16 05:21:15,612 ----------------------------------------------------------------------------------------------------
2023-11-16 05:22:49,896 epoch 8 - iter 750/7500 - loss 0.14122739 - time (sec): 94.28 - samples/sec: 259.14 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:24:23,137 epoch 8 - iter 1500/7500 - loss 0.14874139 - time (sec): 187.52 - samples/sec: 258.77 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:25:56,602 epoch 8 - iter 2250/7500 - loss 0.15341856 - time (sec): 280.99 - samples/sec: 257.70 - lr: 0.000002 - momentum: 0.000000
2023-11-16 05:27:28,992 epoch 8 - iter 3000/7500 - loss 0.15416389 - time (sec): 373.38 - samples/sec: 258.36 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:29:03,053 epoch 8 - iter 3750/7500 - loss 0.15634692 - time (sec): 467.44 - samples/sec: 257.36 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:30:37,470 epoch 8 - iter 4500/7500 - loss 0.15700278 - time (sec): 561.85 - samples/sec: 256.31 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:32:10,851 epoch 8 - iter 5250/7500 - loss 0.15692674 - time (sec): 655.24 - samples/sec: 256.36 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:33:44,315 epoch 8 - iter 6000/7500 - loss 0.15879525 - time (sec): 748.70 - samples/sec: 256.96 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:35:16,930 epoch 8 - iter 6750/7500 - loss 0.15726830 - time (sec): 841.31 - samples/sec: 257.53 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:36:52,298 epoch 8 - iter 7500/7500 - loss 0.15647824 - time (sec): 936.68 - samples/sec: 257.07 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:36:52,300 ----------------------------------------------------------------------------------------------------
2023-11-16 05:36:52,301 EPOCH 8 done: loss 0.1565 - lr: 0.000001
2023-11-16 05:37:19,973 DEV : loss 0.3105733096599579 - f1-score (micro avg) 0.9056
2023-11-16 05:37:21,975 saving best model
2023-11-16 05:37:24,268 ----------------------------------------------------------------------------------------------------
2023-11-16 05:38:57,543 epoch 9 - iter 750/7500 - loss 0.13578701 - time (sec): 93.27 - samples/sec: 260.82 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:40:30,477 epoch 9 - iter 1500/7500 - loss 0.13977943 - time (sec): 186.21 - samples/sec: 258.36 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:42:04,338 epoch 9 - iter 2250/7500 - loss 0.13579281 - time (sec): 280.07 - samples/sec: 257.18 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:43:38,134 epoch 9 - iter 3000/7500 - loss 0.13083188 - time (sec): 373.86 - samples/sec: 257.67 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:45:11,777 epoch 9 - iter 3750/7500 - loss 0.13761002 - time (sec): 467.51 - samples/sec: 257.61 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:46:46,321 epoch 9 - iter 4500/7500 - loss 0.13992387 - time (sec): 562.05 - samples/sec: 256.71 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:48:20,030 epoch 9 - iter 5250/7500 - loss 0.13868841 - time (sec): 655.76 - samples/sec: 256.12 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:49:52,363 epoch 9 - iter 6000/7500 - loss 0.13924211 - time (sec): 748.09 - samples/sec: 256.89 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:51:24,421 epoch 9 - iter 6750/7500 - loss 0.13714285 - time (sec): 840.15 - samples/sec: 257.44 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:52:57,141 epoch 9 - iter 7500/7500 - loss 0.13574777 - time (sec): 932.87 - samples/sec: 258.12 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:52:57,144 ----------------------------------------------------------------------------------------------------
2023-11-16 05:52:57,144 EPOCH 9 done: loss 0.1357 - lr: 0.000001
2023-11-16 05:53:24,122 DEV : loss 0.30354949831962585 - f1-score (micro avg) 0.9069
2023-11-16 05:53:26,256 saving best model
2023-11-16 05:53:28,623 ----------------------------------------------------------------------------------------------------
2023-11-16 05:55:01,065 epoch 10 - iter 750/7500 - loss 0.11662424 - time (sec): 92.44 - samples/sec: 258.78 - lr: 0.000001 - momentum: 0.000000
2023-11-16 05:56:34,198 epoch 10 - iter 1500/7500 - loss 0.10739844 - time (sec): 185.57 - samples/sec: 260.22 - lr: 0.000000 - momentum: 0.000000
2023-11-16 05:58:07,104 epoch 10 - iter 2250/7500 - loss 0.11728002 - time (sec): 278.48 - samples/sec: 261.23 - lr: 0.000000 - momentum: 0.000000
2023-11-16 05:59:38,545 epoch 10 - iter 3000/7500 - loss 0.11111246 - time (sec): 369.92 - samples/sec: 263.47 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:01:13,020 epoch 10 - iter 3750/7500 - loss 0.11185424 - time (sec): 464.39 - samples/sec: 261.72 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:02:47,617 epoch 10 - iter 4500/7500 - loss 0.11443883 - time (sec): 558.99 - samples/sec: 260.01 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:04:22,079 epoch 10 - iter 5250/7500 - loss 0.11684866 - time (sec): 653.45 - samples/sec: 259.18 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:05:55,899 epoch 10 - iter 6000/7500 - loss 0.11690532 - time (sec): 747.27 - samples/sec: 258.80 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:07:29,604 epoch 10 - iter 6750/7500 - loss 0.11669926 - time (sec): 840.98 - samples/sec: 258.11 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:09:02,429 epoch 10 - iter 7500/7500 - loss 0.11723510 - time (sec): 933.80 - samples/sec: 257.87 - lr: 0.000000 - momentum: 0.000000
2023-11-16 06:09:02,432 ----------------------------------------------------------------------------------------------------
2023-11-16 06:09:02,432 EPOCH 10 done: loss 0.1172 - lr: 0.000000
2023-11-16 06:09:29,736 DEV : loss 0.3160940110683441 - f1-score (micro avg) 0.9064
2023-11-16 06:09:34,595 ----------------------------------------------------------------------------------------------------
2023-11-16 06:09:34,598 Loading model from best epoch ...
2023-11-16 06:09:44,517 SequenceTagger predicts: Dictionary with 13 tags: O, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-PER, B-PER, E-PER, I-PER
2023-11-16 06:10:13,551
Results:
- F-score (micro) 0.9076
- F-score (macro) 0.9067
- Accuracy 0.8601

By class:
              precision    recall  f1-score   support

         LOC     0.9066    0.9143    0.9105      5288
         PER     0.9231    0.9485    0.9356      3962
         ORG     0.8737    0.8742    0.8739      3807

   micro avg     0.9022    0.9130    0.9076     13057
   macro avg     0.9012    0.9123    0.9067     13057
weighted avg     0.9020    0.9130    0.9075     13057

2023-11-16 06:10:13,551 ----------------------------------------------------------------------------------------------------